Re: Virtual generated columns

From: Richard Guo <guofenglinux@gmail.com>

To: Dean Rasheed <dean.a.rasheed@gmail.com>

Cc: Peter Eisentraut <peter@eisentraut.org>, jian he <jian.universality@gmail.com>, Zhang Mingli <zmlpostgres@gmail.com>, Alexander Lakhin <exclusion@gmail.com>, pgsql-hackers <pgsql-hackers@postgresql.org>

Date: 2025-02-18T10:09:17Z

Lists: pgsql-hackers

Commits

Expand virtual generated columns for ALTER COLUMN TYPE
- 5069fef1cfae 18.0 landed
Eliminate code duplication in replace_rte_variables callbacks
- 363a6e8c6fcf 18.0 landed
Expand virtual generated columns in the planner
- 1e4351af329f 18.0 landed
Virtual generated columns
- 83ea6c54025b 18.0 landed
Additional tests for stored generated columns
- 41084409f635 18.0 landed
Improve generated_stored test
- 44b61efb7928 18.0 landed
- 86749ea3b766 18.0 landed
Fix handling of CREATE DOMAIN with GENERATED constraint syntax
- 84a67725cd11 18.0 landed
Add pg_constraint rows for not-null constraints
- 14e87ffa5c54 18.0 cited
Put generated_stored test objects in a schema
- 894be11adfa6 18.0 landed
Rename regress test generated to generated_stored
- b9ed4969250d 18.0 landed
Small code simplification
- 7ff9afbbd1df 18.0 landed
Remove useless code
- e26d313bad92 18.0 landed
Remove useless initializations
- da2aeba8f533 18.0 landed
doc: Clarify that pg_attrdef also stores generation expressions
- da486d360103 18.0 landed
Clean out column-level pg_init_privs entries when dropping tables.
- 76618097a6c0 17.0 cited
Re-implement the ereport() macro using __VA_ARGS__.
- e3a87b4991cc 13.0 cited

On Sat, Feb 15, 2025 at 9:37 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Fri, 14 Feb 2025 at 10:59, Peter Eisentraut <peter@eisentraut.org> wrote:
> > Maybe a short-term fix would be to error out if we find ourselves about
> > to expand a Var with varnullingrels != NULL.  That would mean you
> > couldn't use a virtual generated column on the nullable output side of
> > an outer join, which is annoying but not fatal, and we could fix it
> > incrementally later.
>
> I think that would be rather a sad limitation to have. It would be
> nice to have this fully working for the next release.

Besides being a limitation, this approach doesn't address all the
issues with incorrect results.  In some cases, PHVs (PlaceHolderVars)
are needed to isolate subexpressions even when varnullingrels is NULL,
so erroring out only on varnullingrels != NULL would not catch them.
As an example, please consider:

create table t (a int primary key, b int generated always as (10 + 10));
insert into t values (1);
insert into t values (2);

# select a, b from t group by grouping sets (a, b) having b = 20;
 a | b
---+----
 2 |
 1 |
   | 20
(3 rows)

This result set is incorrect.  The first two rows, where b is NULL,
should not be included in the result set.
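
For comparison, here is a sketch of the same query with the generation
expression written out by hand in a subquery.  The alias s is only for
illustration, and this assumes the table t created above:

```sql
-- Hand-expanded form: b is computed in a subquery rather than
-- being read as a virtual generated column of t.
select a, b
from (select a, 10 + 10 as b from t) as s
group by grouping sets (a, b)
having b = 20;
```

Here only the row with b = 20 should survive the HAVING clause, since b
is NULL in the rows produced by the grouping set (a).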

> Attached is a rough patch that moves the expansion of virtual
> generated columns to the planner. It needs a lot more testing (and
> some regression tests), but it does seem to fix all the issues
> mentioned in this thread.

Yeah, I believe this is the right way to go: virtual generated columns
should be expanded in the planner, rather than in the rewriter.

It seems to me that, for a relation in the rangetable that has virtual
generated columns, we can consider it a subquery to some extent.  For
instance, suppose we have a query:

select ... from ... join t on ...;

and suppose t.b is a virtual generated column.  We can consider this
query as:

select ... from ... join (select a, expr() as b from t) as t on ...;

In this sense, I'm wondering if we can leverage the
pullup_replace_vars architecture to expand the virtual generated
columns.  I believe this would help avoid a lot of duplicate code with
pullup_replace_vars_callback.
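
As a rough sketch of that equivalence for the outer-join case discussed
upthread, assuming the table t from the example above and a second
table t1 that is purely hypothetical:

```sql
-- t1 is a hypothetical table used only for illustration.
create table t1 (a int);
insert into t1 values (1), (3);

-- Virtual generated column referenced on the nullable side of a left join ...
select t1.a, t.b
from t1 left join t on t1.a = t.a;

-- ... viewed as a join against a subquery that computes b.
select t1.a, t.b
from t1 left join (select a, 10 + 10 as b from t) as t on t1.a = t.a;
```

In the subquery form, the row with t1.a = 3 has no match in t, so b has
to come out as NULL rather than 20.  Subquery pullup preserves that by
wrapping the expression in a PlaceHolderVar, which is the behavior I'd
like the expansion of virtual generated columns to share.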

Thanks
Richard