Re: Virtual generated columns

Richard Guo <guofenglinux@gmail.com>

From: Richard Guo <guofenglinux@gmail.com>
To: Zhang Mingli <zmlpostgres@gmail.com>
Cc: jian he <jian.universality@gmail.com>, Peter Eisentraut <peter@eisentraut.org>, Dean Rasheed <dean.a.rasheed@gmail.com>, Alexander Lakhin <exclusion@gmail.com>, pgsql-hackers <pgsql-hackers@postgresql.org>
Date: 2025-02-11T02:34:10Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Expand virtual generated columns for ALTER COLUMN TYPE

  2. Eliminate code duplication in replace_rte_variables callbacks

  3. Expand virtual generated columns in the planner

  4. Virtual generated columns

  5. Additional tests for stored generated columns

  6. Improve generated_stored test

  7. Fix handling of CREATE DOMAIN with GENERATED constraint syntax

  8. Add pg_constraint rows for not-null constraints

  9. Put generated_stored test objects in a schema

  10. Rename regress test generated to generated_stored

  11. Small code simplification

  12. Remove useless code

  13. Remove useless initializations

  14. doc: Clarify that pg_attrdef also stores generation expressions

  15. Clean out column-level pg_init_privs entries when dropping tables.

  16. Re-implement the ereport() macro using __VA_ARGS__.

On Mon, Feb 10, 2025 at 1:16 PM Zhang Mingli <zmlpostgres@gmail.com> wrote:
> I believe virtual columns should behave like stored columns, except they don't actually use storage.
> Virtual columns are computed when the table is read, and they should adhere to the same rules of join semantics.
> I agree with Richard, the result seems incorrect. The right outcome should be:
> gpadmin=# SELECT t2.a, t2.b FROM t t1 LEFT JOIN t t2 ON FALSE;
>  a | b
> ------+------
>  NULL | NULL
>  NULL | NULL
> (2 rows)

Yeah, I also feel that the virtual generated columns should adhere to
outer join semantics, rather than being unconditionally replaced by
the generation expressions.  But maybe I'm wrong.

If that's the case, this incorrect-result issue isn't limited to
constant expressions; it could also occur with non-strict ones.

CREATE TABLE t (a int, b int GENERATED ALWAYS AS (COALESCE(a, 100)));
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);

# SELECT t2.a, t2.b FROM t t1 LEFT JOIN t t2 ON FALSE;
 a |  b
---+-----
   | 100
   | 100
(2 rows)

It seems to me that virtual generated columns should be expanded in
the planner rather than in the rewriter.  Additionally, we may need to
wrap the replacement expressions in PHVs if the virtual generated
columns come from the nullable side of an outer join, similar to what
we do when pulling up subqueries.

Thanks
Richard