Thread

Re: Adding a stored generated column without long-lived locks

Alberto Piai <alberto.piai@gmail.com> — 2026-05-14T22:46:32Z

On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote:
> On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote:
>> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
>>
>>> I recently needed to add a stored generated column to a table of
>>> nontrivial size, and realized that currently there is no way to do
>>> that without rewriting the table under an AccessExclusiveLock.

The attached v4 is a rebase against current master, plus:

- I moved the call to RememberAllDependentForRebuilding before the
  update to pg_attribute, since it provides checks for some
  invalid/unsupported invocations it seems more useful to do it before
  changing anything.

- I hadn't noticed before that AddRelationNewConstraints returns the new
  (cooked) default definitions, so we can use those instead of building
  them again.

- cleaned up some includes I had added by mistake, and moved some tests
  around between the two commits

A while back I also posted a fix for the issue of DROP EXPRESSION not
working with subpartitions [1], this patch isn't ajusted yet to match, I
would do that if the bugfix would be committed first.

I am still hoping to get a reviewer for the in-person commitfest at the
upcoming pgconf.dev :)

It's my first contribution, but the change is pretty self-contained and
hopefully not terribly complex to review. I'm trying to address a real
world use case, it would be fantastic to make some progress with this
patch.

Anyone's motivated? :)


Regards,

Alberto


[1] https://www.postgresql.org/message-id/DHMT78XOD8BK.341V3H87KZ7NO%40gmail.com


-- 
Alberto Piai
Sensational AG
Zürich, Switzerland

Re: Adding a stored generated column without long-lived locks

Laurenz Albe <laurenz.albe@cybertec.at> — 2026-05-26T15:23:31Z

On Thu, 2026-05-14 at 15:46 -0700, Alberto Piai wrote:
> > > On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
> > > 
> > > > I recently needed to add a stored generated column to a table of
> > > > nontrivial size, and realized that currently there is no way to do
> > > > that without rewriting the table under an AccessExclusiveLock.
> 
> The attached v4 is a rebase against current master

I understand the need that the patch fulfills, and I agree that it would be a
nice feature.

I have a few thoughts about this that don't concern the implementation:

1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
   and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
   for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
   So this patch adds non-standard syntax that may one day conflict with
   a new version of the standard.  I think we can still do it, and the
   proposed syntax looks right, but I thought I should mention it.

2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to
   change the generation expression of a column.  This command always
   rewrites the table, according to the documentation.
   I think that if the present patch adds support to skip rewriting the table
   when a generation expression is added and the expression matches a check
   constraint, changing the generation expression should also be possible
   without a rewrite.  If not, I would consider that a violation of the
   principle of least astonishment.
   Would it be difficult to extend the patch to support that?

3) We already have a couple of tricks to avoid blocking for a long time:

   - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there
     is a check constraint that makes sure that the column is NOT NULL

   - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition
     if there is a check constraint matching the partition constraint

   It would be great to document these little tricks in the documentation,
   probably on the ALTER TABLE page.  This is not necessarily the job of
   this patch, but it would also not be off-topic for the patch.

Comments on the patch:
----------------------

The patch applies and builds cleanly and passes the regression tests.

Missing parts:

- There is no documentation.  At least ALTER TABLE needs a description of the
  new syntax, and would ideally mention the trick with the check constraint.

- There should be support for command line completion for the new syntax.

Bugs:

- The patch doesn't test if the column is an identity column:

  CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);
  INSERT INTO tab VALUES (DEFAULT);
  ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;

  The ALTER TABLE should fail, but doesn't.

- Strange behavior with sequences owned by the column:

  CREATE TABLE tab (id bigserial);
  INSERT INTO tab VALUES (DEFAULT);
  ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
  \ds tab_id_seq
               List of sequences
   Schema |    Name    |   Type   |  Owner   
  --------+------------+----------+----------
   public | tab_id_seq | sequence | postgres
  (1 row)

  I think that any sequence owned by the column should be dropped.
  Alternatively, you could throw an error.

- Incorrect handling of NULL values:

  CREATE TABLE tab (col1 integer, col2 integer);
  INSERT INTO tab VALUES (2, NULL);
  -- works, because NULL results from the check are accepted
  ALTER TABLE tab ADD CHECK (col2 = col1);

  SELECT pg_relation_filenode('tab');
   pg_relation_filenode 
  ----------------------
                  19920
  (1 row)

  ALTER TABLE tab ALTER col2 ADD GENERATED ALWAYS AS (col1) STORED;
  SELECT pg_relation_filenode('tab');
   pg_relation_filenode 
  ----------------------
                  19920
  (1 row)

  TABLE tab;
   col1 | col2 
  ------+------
      2 |    ∅
  (1 row)

  I am not sure what the correct approach would be.  The simple approach would be
  to only skip the rewrite if the column has a NOT NULL constraint or an equivalent
  check constraint, but perhaps you can think of a way to do better.

Comments on the code:

> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
>             ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context);
>             pass = AT_PASS_SET_EXPRESSION;
>             break;
> +       case AT_AddGeneratedAsExprStored:

You should add a comment, same as for the other branches.

> @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype)
>             return "ALTER COLUMN ... SET NOT NULL";
>         case AT_SetExpression:
>             return "ALTER COLUMN ... SET EXPRESSION";
> +       case AT_AddGeneratedAsExprStored:
> +           return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED";

Keep it short, like "ALTER COLUMN ... ADD GENERATED".

> --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
> +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
> @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS)
>             case AT_SetNotNull:
>                 strtype = "SET NOT NULL";
>                 break;
> +           case AT_AddGeneratedAsExprStored:
> +               strtype = "ADD GENERATED ALWAYS AS (...) STORED";
> +               break;

I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist.

Yours,
Laurenz Albe

Re: Adding a stored generated column without long-lived locks

Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:43:53Z

Hi,

On Fri May 15, 2026 at 12:46 AM CEST, Alberto Piai wrote:
> On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote:
>> On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote:
>>> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
>>>
>>>> I recently needed to add a stored generated column to a table of
>>>> nontrivial size, and realized that currently there is no way to do
>>>> that without rewriting the table under an AccessExclusiveLock.

here's a not-so-brief summary of the conversations around this topic at
pgconf.dev, and a new proposal at the end.

I had the chance to bring this up with other attendees, and many
recognized the use case as a useful one, addressing a real operational
issue.

In particular, I had great feedback from Staš Kotarac Guček, who pointed
out a major flaw in my current proposal: a constraint of the form

  CHECK (c = expr)

would not work correctly when expr evaluates to null for some rows.
Thank you Staš, in the next iteration I will change the constraint to
use IS NOT DISTINCT FROM, instead.

I briefly mentioned this topic to Tom Lane, who quickly replied with the
question: should this not fail when it can't use the constraint, instead
of overwriting the contents of the column?

Thanks Tom, I will get to this later in this mail.

I had registered this patch for the in-person commitfest at pgconf.dev,
and Álvaro Herrera picked it up for review. Thank you Álvaro, and thank
you Peter for organizing the event.

We managed to find some time on the very last day of the conference, and
went through the current design and code. The open items (which I will
address in the next iteration of this patch) are:

* missing user documentation

  I will work on this next. I think it's a good way to explain the
  feature even early during development. I just didn't want to do it
  _too_ early, without having had any feedback.

* try to minimize command counter increments

  There might be one call to CommandCounterIncrement() which is not
  necessary, I'll try remove it.

* comment on why it is necessary to clear missing values when rewriting
  the table

  ATExecAlterColumnType() and ATExecSetExpression() both do this
  explicitly when requesting a table rewrite. I'll extend the comment,
  and also look into whether this is something that should be done any
  time a table rewrite happens. In that case, it might be worth moving
  this into the rewriting code rather than having each caller do it.

* interactions with other subcommands in the same alter table statement

  My reasoning regarding this was: if I do this in
  AT_PASS_SET_EXPRESSION, it should be safe. I will invest some more
  time into this and add tests, too.

We also looked at the overall design of the new command, and we agreed
that it is a fitting addition to our current SET EXPRESSION and DROP
EXPRESSION. Regarding the question of whether it should be SET or ADD,
we agreed that ADD (i.e. the current proposal) is clearer, especially
for its similarity to ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY.

Regarding the question of "should this fail or rewrite the table when a
usable constraint isn't found": Álvaro's suggestion here was to use a
more ad-hoc command, meant more specifically for this use case of
converting into a stored generated column without rewriting it. If the
command would be dedicated specifically to this, it would make sense to
have it fail when a usable constraint isn't found.

Last but not least, I also discussed this with Laurenz Albe, and he
wrote a very useful review in this thread. I will address that
separately and reply directly to that mail, but one point I can already
merge in this discussion is about the syntax of the command:

> 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
>    and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
>    for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
>    So this patch adds non-standard syntax that may one day conflict with
>    a new version of the standard.  I think we can still do it, and the
>    proposed syntax looks right, but I thought I should mention it.

I'd like to take his point, together with the question from Tom and the
suggestion by Álvaro, and make a new proposal for the design of this
command.

Design iteration 2
------------------

Syntax:

  ALTER TABLE t ALTER COLUMN c
    ADD GENERATED ALWAYS AS (expr) STORED USING CONSTRAINT check_name

check_name must be a valid constraint of the form

  CHECK (c IS NOT DISTINCT FROM (expr))

This fails if:

- a check constraint named check_name is not found for table c
- the constraint is not valid
- the constraint does not match exactly the expr the user intends to use
  as a stored default expression

On success, the table c is now a stored generated column with the given
default expression, and the check_name constraint has been removed.

This addresses Tom's remark, we can now fail instead of just rewriting
the column.

It improves slightly upon the issue of a potential conflict with a
future edition of the SQL standard, by being more specific. I don't see
a way to be completely sure we won't have conflicts. We could improve
more by making the syntax more "alien" and very unlinkely to be picked
up by the standard, but at a usability cost for Postgres. I'm open to
suggestions.

It improves upon another question raised by Álvaro: does the user have
to clean up the constraint? In v1 I felt it was better to have the user
remove it after the migration. Since here it's explicitly mentioned as
the constraint to use to migrate the column, I think it's OK to remove
it. We are conceptually moving it from being a constraint to being the
new default expression.

The implementation should also be simpler, since there will never be a
table rewrite.

Any thoughts about this?

Best regards,

Alberto

-- 
Alberto Piai
Sensational AG
Zürich, Switzerland

Re: Adding a stored generated column without long-lived locks

Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:44:23Z

On Tue May 26, 2026 at 5:23 PM CEST, Laurenz Albe wrote:
>
> 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
>    and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
>    for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
>    So this patch adds non-standard syntax that may one day conflict with
>    a new version of the standard.  I think we can still do it, and the
>    proposed syntax looks right, but I thought I should mention it.

Thank you for bringing this up. I don't have access to the standard, but
the chance of a possible conflict with future editions was at the back
of my mind. I don't see a way to exclude it completely. In a sibling
mail in this thread (you should be in CC), I have made a new iteration
on this proposal, which also tries to make the command more specific to
avoid future conflicts.

> 2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to
>    change the generation expression of a column.  This command always
>    rewrites the table, according to the documentation.
>    I think that if the present patch adds support to skip rewriting the table
>    when a generation expression is added and the expression matches a check
>    constraint, changing the generation expression should also be possible
>    without a rewrite.  If not, I would consider that a violation of the
>    principle of least astonishment.
>    Would it be difficult to extend the patch to support that?

Yes, I don't see a way to make that work. Since we're talking only about
stored values, a rewrite will always be necessary. However, using this
new command, a user could add a column with the new expression, then
atomically drop the old one and rename. All without holding onto an
AccessExclusiveLock for a long time :)

> 3) We already have a couple of tricks to avoid blocking for a long time:
>
>    - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there
>      is a check constraint that makes sure that the column is NOT NULL
>
>    - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition
>      if there is a check constraint matching the partition constraint
>
>    It would be great to document these little tricks in the documentation,
>    probably on the ALTER TABLE page.  This is not necessarily the job of
>    this patch, but it would also not be off-topic for the patch.

The SET NOT NULL one and the ATTACH PARTITION one are documented in the
section specific to the command. However

  or, if an equivalent index already exists, it will be attached to the
  target table's index, as if ALTER INDEX ATTACH PARTITION had been
  executed

is not very explicit about the advantages this has for online
migrations.

In the NOTES section of the ALTER TABLE page, there is a paragraph about
NOT VALID / VALIDATE, which is another operation in the same spirit as
this.

Maybe we could group them all in a new section dedicated to online
schema migrations?

(However, even if it's definitely on-topic with this patch, I would work
on this in a separate patch / email thread.)

> Missing parts:
>
> - There is no documentation.  At least ALTER TABLE needs a description of the
>   new syntax, and would ideally mention the trick with the check constraint.

Yes, I will work on this next. I also believe it's a great way to show a
feature, even early during development. I just wanted to avoid doing it
_too_ early, before having had any feedback about the idea.

> - There should be support for command line completion for the new syntax.

Great idea, I'll add this too.

> Bugs:
>
> - The patch doesn't test if the column is an identity column:
>
>   CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);
>   INSERT INTO tab VALUES (DEFAULT);
>   ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
>
>   The ALTER TABLE should fail, but doesn't.
>
> - Strange behavior with sequences owned by the column:
>
>   CREATE TABLE tab (id bigserial);
>   INSERT INTO tab VALUES (DEFAULT);
>   ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
>   \ds tab_id_seq
>                List of sequences
>    Schema |    Name    |   Type   |  Owner   
>   --------+------------+----------+----------
>    public | tab_id_seq | sequence | postgres
>   (1 row)
>
>   I think that any sequence owned by the column should be dropped.
>   Alternatively, you could throw an error.

Thanks for testing this!

I have reused RememberAllDependentForRebuilding() which does some
validation, but was originally meant for ALTER COLUMN TYPE. I will add
checks and tests for these cases, but to be consistent with how the
other dependencies are handled, I think it's better to throw an error
here (this is what happens for example if trying to ALTER TYPE of a
column used by a function).

>
> - Incorrect handling of NULL values:

See sibling mail, in the next iteration the constraint will have to use
IS NOT DISTINCT FROM. I think that should cover all cases.

>
> Comments on the code:
>
>> --- a/src/backend/commands/tablecmds.c
>> +++ b/src/backend/commands/tablecmds.c
>> @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
>>             ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context);
>>             pass = AT_PASS_SET_EXPRESSION;
>>             break;
>> +       case AT_AddGeneratedAsExprStored:
>
> You should add a comment, same as for the other branches.
>
>> @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype)
>>             return "ALTER COLUMN ... SET NOT NULL";
>>         case AT_SetExpression:
>>             return "ALTER COLUMN ... SET EXPRESSION";
>> +       case AT_AddGeneratedAsExprStored:
>> +           return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED";
>
> Keep it short, like "ALTER COLUMN ... ADD GENERATED".
>
>> --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
>> +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
>> @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS)
>>             case AT_SetNotNull:
>>                 strtype = "SET NOT NULL";
>>                 break;
>> +           case AT_AddGeneratedAsExprStored:
>> +               strtype = "ADD GENERATED ALWAYS AS (...) STORED";
>> +               break;
>
> I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist.

Agreed, will fix all these in the next version of the patch.

Thank you again for the review!

Best regards,

Alberto

-- 
Alberto Piai
Sensational AG
Zürich, Switzerland