Thread

  1. Re: Adding a stored generated column without long-lived locks

    Alberto Piai <alberto.piai@gmail.com> — 2026-05-14T22:46:32Z

    On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote:
    > On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote:
    >> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
    >>
    >>> I recently needed to add a stored generated column to a table of
    >>> nontrivial size, and realized that currently there is no way to do
    >>> that without rewriting the table under an AccessExclusiveLock.
    
    The attached v4 is a rebase against current master, plus:
    
    - I moved the call to RememberAllDependentForRebuilding before the
      update to pg_attribute, since it provides checks for some
      invalid/unsupported invocations it seems more useful to do it before
      changing anything.
    
    - I hadn't noticed before that AddRelationNewConstraints returns the new
      (cooked) default definitions, so we can use those instead of building
      them again.
    
    - cleaned up some includes I had added by mistake, and moved some tests
      around between the two commits
    
    A while back I also posted a fix for the issue of DROP EXPRESSION not
    working with subpartitions [1], this patch isn't ajusted yet to match, I
    would do that if the bugfix would be committed first.
    
    I am still hoping to get a reviewer for the in-person commitfest at the
    upcoming pgconf.dev :)
    
    It's my first contribution, but the change is pretty self-contained and
    hopefully not terribly complex to review. I'm trying to address a real
    world use case, it would be fantastic to make some progress with this
    patch.
    
    Anyone's motivated? :)
    
    
    Regards,
    
    Alberto
    
    
    [1] https://www.postgresql.org/message-id/DHMT78XOD8BK.341V3H87KZ7NO%40gmail.com
    
    
    -- 
    Alberto Piai
    Sensational AG
    Zürich, Switzerland
    
  2. Re: Adding a stored generated column without long-lived locks

    Laurenz Albe <laurenz.albe@cybertec.at> — 2026-05-26T15:23:31Z

    On Thu, 2026-05-14 at 15:46 -0700, Alberto Piai wrote:
    > > > On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
    > > > 
    > > > > I recently needed to add a stored generated column to a table of
    > > > > nontrivial size, and realized that currently there is no way to do
    > > > > that without rewriting the table under an AccessExclusiveLock.
    > 
    > The attached v4 is a rebase against current master
    
    I understand the need that the patch fulfills, and I agree that it would be a
    nice feature.
    
    I have a few thoughts about this that don't concern the implementation:
    
    1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
       and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
       for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
       So this patch adds non-standard syntax that may one day conflict with
       a new version of the standard.  I think we can still do it, and the
       proposed syntax looks right, but I thought I should mention it.
    
    2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to
       change the generation expression of a column.  This command always
       rewrites the table, according to the documentation.
       I think that if the present patch adds support to skip rewriting the table
       when a generation expression is added and the expression matches a check
       constraint, changing the generation expression should also be possible
       without a rewrite.  If not, I would consider that a violation of the
       principle of least astonishment.
       Would it be difficult to extend the patch to support that?
    
    3) We already have a couple of tricks to avoid blocking for a long time:
    
       - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there
         is a check constraint that makes sure that the column is NOT NULL
    
       - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition
         if there is a check constraint matching the partition constraint
    
       It would be great to document these little tricks in the documentation,
       probably on the ALTER TABLE page.  This is not necessarily the job of
       this patch, but it would also not be off-topic for the patch.
    
    Comments on the patch:
    ----------------------
    
    The patch applies and builds cleanly and passes the regression tests.
    
    Missing parts:
    
    - There is no documentation.  At least ALTER TABLE needs a description of the
      new syntax, and would ideally mention the trick with the check constraint.
    
    - There should be support for command line completion for the new syntax.
    
    Bugs:
    
    - The patch doesn't test if the column is an identity column:
    
      CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);
      INSERT INTO tab VALUES (DEFAULT);
      ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
    
      The ALTER TABLE should fail, but doesn't.
    
    - Strange behavior with sequences owned by the column:
    
      CREATE TABLE tab (id bigserial);
      INSERT INTO tab VALUES (DEFAULT);
      ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
      \ds tab_id_seq
                   List of sequences
       Schema |    Name    |   Type   |  Owner   
      --------+------------+----------+----------
       public | tab_id_seq | sequence | postgres
      (1 row)
    
      I think that any sequence owned by the column should be dropped.
      Alternatively, you could throw an error.
    
    - Incorrect handling of NULL values:
    
      CREATE TABLE tab (col1 integer, col2 integer);
      INSERT INTO tab VALUES (2, NULL);
      -- works, because NULL results from the check are accepted
      ALTER TABLE tab ADD CHECK (col2 = col1);
    
      SELECT pg_relation_filenode('tab');
       pg_relation_filenode 
      ----------------------
                      19920
      (1 row)
    
      ALTER TABLE tab ALTER col2 ADD GENERATED ALWAYS AS (col1) STORED;
      SELECT pg_relation_filenode('tab');
       pg_relation_filenode 
      ----------------------
                      19920
      (1 row)
    
      TABLE tab;
       col1 | col2 
      ------+------
          2 |    ∅
      (1 row)
    
      I am not sure what the correct approach would be.  The simple approach would be
      to only skip the rewrite if the column has a NOT NULL constraint or an equivalent
      check constraint, but perhaps you can think of a way to do better.
    
    Comments on the code:
    
    > --- a/src/backend/commands/tablecmds.c
    > +++ b/src/backend/commands/tablecmds.c
    > @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
    >             ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context);
    >             pass = AT_PASS_SET_EXPRESSION;
    >             break;
    > +       case AT_AddGeneratedAsExprStored:
    
    You should add a comment, same as for the other branches.
    
    > @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype)
    >             return "ALTER COLUMN ... SET NOT NULL";
    >         case AT_SetExpression:
    >             return "ALTER COLUMN ... SET EXPRESSION";
    > +       case AT_AddGeneratedAsExprStored:
    > +           return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED";
    
    Keep it short, like "ALTER COLUMN ... ADD GENERATED".
    
    > --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
    > +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
    > @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS)
    >             case AT_SetNotNull:
    >                 strtype = "SET NOT NULL";
    >                 break;
    > +           case AT_AddGeneratedAsExprStored:
    > +               strtype = "ADD GENERATED ALWAYS AS (...) STORED";
    > +               break;
    
    I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist.
    
    Yours,
    Laurenz Albe
    
    
    
    
  3. Re: Adding a stored generated column without long-lived locks

    Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:43:53Z

    Hi,
    
    On Fri May 15, 2026 at 12:46 AM CEST, Alberto Piai wrote:
    > On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote:
    >> On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote:
    >>> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
    >>>
    >>>> I recently needed to add a stored generated column to a table of
    >>>> nontrivial size, and realized that currently there is no way to do
    >>>> that without rewriting the table under an AccessExclusiveLock.
    
    here's a not-so-brief summary of the conversations around this topic at
    pgconf.dev, and a new proposal at the end.
    
    
    I had the chance to bring this up with other attendees, and many
    recognized the use case as a useful one, addressing a real operational
    issue.
    
    In particular, I had great feedback from Staš Kotarac Guček, who pointed
    out a major flaw in my current proposal: a constraint of the form
    
      CHECK (c = expr)
    
    would not work correctly when expr evaluates to null for some rows.
    Thank you Staš, in the next iteration I will change the constraint to
    use IS NOT DISTINCT FROM, instead.
    
    
    I briefly mentioned this topic to Tom Lane, who quickly replied with the
    question: should this not fail when it can't use the constraint, instead
    of overwriting the contents of the column?
    
    Thanks Tom, I will get to this later in this mail.
    
    
    I had registered this patch for the in-person commitfest at pgconf.dev,
    and Álvaro Herrera picked it up for review. Thank you Álvaro, and thank
    you Peter for organizing the event.
    
    We managed to find some time on the very last day of the conference, and
    went through the current design and code. The open items (which I will
    address in the next iteration of this patch) are:
    
    * missing user documentation
    
      I will work on this next. I think it's a good way to explain the
      feature even early during development. I just didn't want to do it
      _too_ early, without having had any feedback.
    
    * try to minimize command counter increments
    
      There might be one call to CommandCounterIncrement() which is not
      necessary, I'll try remove it.
    
    * comment on why it is necessary to clear missing values when rewriting
      the table
    
      ATExecAlterColumnType() and ATExecSetExpression() both do this
      explicitly when requesting a table rewrite. I'll extend the comment,
      and also look into whether this is something that should be done any
      time a table rewrite happens. In that case, it might be worth moving
      this into the rewriting code rather than having each caller do it.
    
    * interactions with other subcommands in the same alter table statement
    
      My reasoning regarding this was: if I do this in
      AT_PASS_SET_EXPRESSION, it should be safe. I will invest some more
      time into this and add tests, too.
    
    We also looked at the overall design of the new command, and we agreed
    that it is a fitting addition to our current SET EXPRESSION and DROP
    EXPRESSION. Regarding the question of whether it should be SET or ADD,
    we agreed that ADD (i.e. the current proposal) is clearer, especially
    for its similarity to ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY.
    
    Regarding the question of "should this fail or rewrite the table when a
    usable constraint isn't found": Álvaro's suggestion here was to use a
    more ad-hoc command, meant more specifically for this use case of
    converting into a stored generated column without rewriting it. If the
    command would be dedicated specifically to this, it would make sense to
    have it fail when a usable constraint isn't found.
    
    
    Last but not least, I also discussed this with Laurenz Albe, and he
    wrote a very useful review in this thread. I will address that
    separately and reply directly to that mail, but one point I can already
    merge in this discussion is about the syntax of the command:
    
    > 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
    >    and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
    >    for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
    >    So this patch adds non-standard syntax that may one day conflict with
    >    a new version of the standard.  I think we can still do it, and the
    >    proposed syntax looks right, but I thought I should mention it.
    
    I'd like to take his point, together with the question from Tom and the
    suggestion by Álvaro, and make a new proposal for the design of this
    command.
    
    
    Design iteration 2
    ------------------
    
    
    Syntax:
    
      ALTER TABLE t ALTER COLUMN c
        ADD GENERATED ALWAYS AS (expr) STORED USING CONSTRAINT check_name
    
    check_name must be a valid constraint of the form
    
      CHECK (c IS NOT DISTINCT FROM (expr))
    
    This fails if:
    
    - a check constraint named check_name is not found for table c
    - the constraint is not valid
    - the constraint does not match exactly the expr the user intends to use
      as a stored default expression
    
    On success, the table c is now a stored generated column with the given
    default expression, and the check_name constraint has been removed.
    
    
    
    This addresses Tom's remark, we can now fail instead of just rewriting
    the column.
    
    It improves slightly upon the issue of a potential conflict with a
    future edition of the SQL standard, by being more specific. I don't see
    a way to be completely sure we won't have conflicts. We could improve
    more by making the syntax more "alien" and very unlinkely to be picked
    up by the standard, but at a usability cost for Postgres. I'm open to
    suggestions.
    
    It improves upon another question raised by Álvaro: does the user have
    to clean up the constraint? In v1 I felt it was better to have the user
    remove it after the migration. Since here it's explicitly mentioned as
    the constraint to use to migrate the column, I think it's OK to remove
    it. We are conceptually moving it from being a constraint to being the
    new default expression.
    
    The implementation should also be simpler, since there will never be a
    table rewrite.
    
    
    Any thoughts about this?
    
    
    
    Best regards,
    
    Alberto
    
    -- 
    Alberto Piai
    Sensational AG
    Zürich, Switzerland
    
    
    
    
  4. Re: Adding a stored generated column without long-lived locks

    Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:44:23Z

    On Tue May 26, 2026 at 5:23 PM CEST, Laurenz Albe wrote:
    >
    > 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
    >    and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
    >    for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
    >    So this patch adds non-standard syntax that may one day conflict with
    >    a new version of the standard.  I think we can still do it, and the
    >    proposed syntax looks right, but I thought I should mention it.
    
    Thank you for bringing this up. I don't have access to the standard, but
    the chance of a possible conflict with future editions was at the back
    of my mind. I don't see a way to exclude it completely. In a sibling
    mail in this thread (you should be in CC), I have made a new iteration
    on this proposal, which also tries to make the command more specific to
    avoid future conflicts.
    
    > 2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to
    >    change the generation expression of a column.  This command always
    >    rewrites the table, according to the documentation.
    >    I think that if the present patch adds support to skip rewriting the table
    >    when a generation expression is added and the expression matches a check
    >    constraint, changing the generation expression should also be possible
    >    without a rewrite.  If not, I would consider that a violation of the
    >    principle of least astonishment.
    >    Would it be difficult to extend the patch to support that?
    
    Yes, I don't see a way to make that work. Since we're talking only about
    stored values, a rewrite will always be necessary. However, using this
    new command, a user could add a column with the new expression, then
    atomically drop the old one and rename. All without holding onto an
    AccessExclusiveLock for a long time :)
    
    > 3) We already have a couple of tricks to avoid blocking for a long time:
    >
    >    - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there
    >      is a check constraint that makes sure that the column is NOT NULL
    >
    >    - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition
    >      if there is a check constraint matching the partition constraint
    >
    >    It would be great to document these little tricks in the documentation,
    >    probably on the ALTER TABLE page.  This is not necessarily the job of
    >    this patch, but it would also not be off-topic for the patch.
    
    The SET NOT NULL one and the ATTACH PARTITION one are documented in the
    section specific to the command. However
    
      or, if an equivalent index already exists, it will be attached to the
      target table's index, as if ALTER INDEX ATTACH PARTITION had been
      executed
    
    is not very explicit about the advantages this has for online
    migrations.
    
    In the NOTES section of the ALTER TABLE page, there is a paragraph about
    NOT VALID / VALIDATE, which is another operation in the same spirit as
    this.
    
    Maybe we could group them all in a new section dedicated to online
    schema migrations?
    
    (However, even if it's definitely on-topic with this patch, I would work
    on this in a separate patch / email thread.)
    
    
    > Missing parts:
    >
    > - There is no documentation.  At least ALTER TABLE needs a description of the
    >   new syntax, and would ideally mention the trick with the check constraint.
    
    Yes, I will work on this next. I also believe it's a great way to show a
    feature, even early during development. I just wanted to avoid doing it
    _too_ early, before having had any feedback about the idea.
    
    > - There should be support for command line completion for the new syntax.
    
    Great idea, I'll add this too.
    
    > Bugs:
    >
    > - The patch doesn't test if the column is an identity column:
    >
    >   CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);
    >   INSERT INTO tab VALUES (DEFAULT);
    >   ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
    >
    >   The ALTER TABLE should fail, but doesn't.
    >
    > - Strange behavior with sequences owned by the column:
    >
    >   CREATE TABLE tab (id bigserial);
    >   INSERT INTO tab VALUES (DEFAULT);
    >   ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
    >   \ds tab_id_seq
    >                List of sequences
    >    Schema |    Name    |   Type   |  Owner   
    >   --------+------------+----------+----------
    >    public | tab_id_seq | sequence | postgres
    >   (1 row)
    >
    >   I think that any sequence owned by the column should be dropped.
    >   Alternatively, you could throw an error.
    
    Thanks for testing this!
    
    I have reused RememberAllDependentForRebuilding() which does some
    validation, but was originally meant for ALTER COLUMN TYPE. I will add
    checks and tests for these cases, but to be consistent with how the
    other dependencies are handled, I think it's better to throw an error
    here (this is what happens for example if trying to ALTER TYPE of a
    column used by a function).
    
    >
    > - Incorrect handling of NULL values:
    
    See sibling mail, in the next iteration the constraint will have to use
    IS NOT DISTINCT FROM. I think that should cover all cases.
    
    >
    > Comments on the code:
    >
    >> --- a/src/backend/commands/tablecmds.c
    >> +++ b/src/backend/commands/tablecmds.c
    >> @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
    >>             ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context);
    >>             pass = AT_PASS_SET_EXPRESSION;
    >>             break;
    >> +       case AT_AddGeneratedAsExprStored:
    >
    > You should add a comment, same as for the other branches.
    >
    >> @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype)
    >>             return "ALTER COLUMN ... SET NOT NULL";
    >>         case AT_SetExpression:
    >>             return "ALTER COLUMN ... SET EXPRESSION";
    >> +       case AT_AddGeneratedAsExprStored:
    >> +           return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED";
    >
    > Keep it short, like "ALTER COLUMN ... ADD GENERATED".
    >
    >> --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
    >> +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
    >> @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS)
    >>             case AT_SetNotNull:
    >>                 strtype = "SET NOT NULL";
    >>                 break;
    >> +           case AT_AddGeneratedAsExprStored:
    >> +               strtype = "ADD GENERATED ALWAYS AS (...) STORED";
    >> +               break;
    >
    > I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist.
    
    Agreed, will fix all these in the next version of the patch.
    
    
    Thank you again for the review!
    
    
    Best regards,
    
    Alberto
    
    -- 
    Alberto Piai
    Sensational AG
    Zürich, Switzerland