Re: Virtual generated columns

Peter Eisentraut <peter@eisentraut.org>

From: Peter Eisentraut <peter@eisentraut.org>
To: jian he <jian.universality@gmail.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>
Date: 2024-08-20T10:38:25Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Expand virtual generated columns for ALTER COLUMN TYPE

  2. Eliminate code duplication in replace_rte_variables callbacks

  3. Expand virtual generated columns in the planner

  4. Virtual generated columns

  5. Additional tests for stored generated columns

  6. Improve generated_stored test

  7. Fix handling of CREATE DOMAIN with GENERATED constraint syntax

  8. Add pg_constraint rows for not-null constraints

  9. Put generated_stored test objects in a schema

  10. Rename regress test generated to generated_stored

  11. Small code simplification

  12. Remove useless code

  13. Remove useless initializations

  14. doc: Clarify that pg_attrdef also stores generation expressions

  15. Clean out column-level pg_init_privs entries when dropping tables.

  16. Re-implement the ereport() macro using __VA_ARGS__.

Attachments

Thanks for the great testing again.  Here is an updated patch that 
addresses the issues you have pointed out.


On 14.08.24 02:00, jian he wrote:
> On Thu, Aug 8, 2024 at 2:23 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> Thank you for your extensive testing.  Here is a new patch set that has
>> fixed all the issues you have reported (MERGE, sublinks, statistics,
>> ANALYZE).
> 
>                      if (coldef->generated && restdef->generated &&
> coldef->generated != restdef->generated)
>                          ereport(ERROR,
>                                  (errcode(ERRCODE_INVALID_COLUMN_DEFINITION),
>                                   errmsg("column \"%s\" inherits from
> generated column of different kind",
>                                          restdef->colname)));
> the error message is not informal. maybe add errhint that
> "column \"%s\" should be same as parent table's generated column kind:
> %s", "virtual"|"stored"

Ok, I added an errdetail().

>   .../regress/expected/create_table_like.out    |  23 +-
>   .../regress/expected/generated_stored.out     |  27 +-
>   ...rated_stored.out => generated_virtual.out} | 835 +++++++++---------
>   src/test/regress/parallel_schedule            |   3 +
>   src/test/regress/sql/create_table_like.sql    |   2 +-
>   src/test/regress/sql/generated_stored.sql     |  10 +-
>   ...rated_stored.sql => generated_virtual.sql} | 301 ++++---
>   src/test/subscription/t/011_generated.pl      |  38 +-
>   55 files changed, 1280 insertions(+), 711 deletions(-)
>   copy src/test/regress/expected/{generated_stored.out
> generated_virtual.out} (69%)
>   copy src/test/regress/sql/{generated_stored.sql => generated_virtual.sql} (72%)
> 
> I don't understand the "copy =>" part, I guess related to copy content
> from stored to virtual.
> anyway. some minor issue:

That's just what git format-patch produces.  It shows that 72% of the 
new file is a copy of the existing file.

> -- alter generation expression of parent and all its children altogether
> ALTER TABLE gtest_parent ALTER COLUMN f3 SET EXPRESSION AS (f2 * 2);
> \d gtest_parent
> \d gtest_child
> \d gtest_child2
> \d gtest_child3
> SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
> 
> The first line ALTER TABLE will fail for
> src/test/regress/sql/generated_virtual.sql.
> so no need
> """
> \d gtest_parent
> \d gtest_child
> \d gtest_child2
> \d gtest_child3
> SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
> """
> 
> Similarly the following tests for gtest29 may aslo need change
> -- ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION
> 
> since we cannot do ALTER TABLE SET EXPRESSION for virtual generated columns.

I left all these tests in place from the equivalent STORED tests, in 
case we want to add support for the VIRTUAL case as well.  I expect that 
we'll add support for some of these before too long.

> -- ALTER TABLE ... ALTER COLUMN
> CREATE TABLE gtest27 (
>      a int,
>      b int,
>      x int GENERATED ALWAYS AS ((a + b) * 2) VIRTUAL
> );
> INSERT INTO gtest27 (a, b) VALUES (3, 7), (4, 11);
> ALTER TABLE gtest27 ALTER COLUMN a TYPE text;  -- error
> ALTER TABLE gtest27 ALTER COLUMN x TYPE numeric;
> 
> will
> ALTER TABLE gtest27 ALTER COLUMN a TYPE int4;
> be a no-op?

Changing the type of a column that is used by a generated column is 
already prohibited.  Are you proposing to change anything here?

> do we need a document that virtual generated columns will use the
> expression's collation.
> see:
> drop table if exists t5;
> CREATE TABLE t5 (
>      a text collate "C",
>      b text collate "C" GENERATED ALWAYS AS (a collate case_insensitive) ,
>      d int DEFAULT 22
> );
> INSERT INTO t5(a,d) values ('d1',28), ('D2',27), ('D1',26);
> select * from t5 order by b asc, d asc;

I have fixed this.  It will now apply the collation of the column.

> create domain mydomain as int4;
> create type mydomainrange as range(subtype=mydomain);
> CREATE TABLE t3( b bigint, c mydomain GENERATED ALWAYS AS ('11') VIRTUAL);
> CREATE TABLE t3( b bigint, c mydomainrange GENERATED ALWAYS AS
> ('[4,50)') VIRTUAL);
> domain will error out, domain over range is ok, is this fine?

Fixed.  The check is now in CheckAttributeType() in heap.c, which has 
the ability to recurse into composite data types.

> +      When <literal>VIRTUAL</literal> is specified, the column will be
> +      computed when it is read, and it will not occupy any storage.  When
> +      <literal>STORED</literal> is specified, the column will be computed on
> +      write and will be stored on disk.  <literal>VIRTUAL</literal> is the
> +      default.
> drop table if exists t5;
> CREATE TABLE t5 (
>      a int,
>      b text storage extended collate "C"  GENERATED ALWAYS AS (a::text
> collate case_insensitive) ,
>      d int DEFAULT 22
> );
> select reltoastrelid <> 0 as has_toast_table from pg_class where oid =
> 't5'::regclass;
> 
> if really no storage, should table t5 have an associated toast table or not?
> also check ALTER TABLE variant:
> alter table t5 alter b set storage extended;

Fixed.  It does not trigger a toast table now.

> Do we need to do something in ATExecSetStatistics for cases like:
> ALTER TABLE t5 ALTER b SET STATISTICS 2000;
> (b is a generated virtual column).
> because of
> examine_attribute
>      if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
>          return NULL;
> i guess, this won't have a big impact.

This is also an error now.

> There are some issues with changing virtual generated column type.
> like:
> drop table if exists another;
> create table another (f4 int, f2 text, f3 text, f1 int GENERATED
> ALWAYS AS (f4));
> insert into another values(1, 'one', 'uno'), (2, 'two', 'due'),(3,
> 'three', 'tre');
> alter table another
>    alter f1 type text using f2 || ' and ' || f3 || ' more';
> table another;
> 
> or
> alter table another
>    alter f1 type text using f2 || ' and ' || f3 || ' more',
>    drop column f1;
> ERROR:  column "f1" of relation "another" does not exist
> 
> These two command outputs seem not right.
> the stored generated column which works as expected.

I noticed this is already buggy for stored generated columns.  It should 
prevent the use of the USING clause here.  I'll propose a fix for that 
in a separate thread.  There might be further adjustments needed for 
changing the types of virtual columns, but I'll come back to that after 
the existing bug is fixed.