Thread
-
Re: Adding a stored generated column without long-lived locks
Alberto Piai <alberto.piai@gmail.com> — 2026-05-14T22:46:32Z
On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote: > On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote: >> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote: >> >>> I recently needed to add a stored generated column to a table of >>> nontrivial size, and realized that currently there is no way to do >>> that without rewriting the table under an AccessExclusiveLock. The attached v4 is a rebase against current master, plus: - I moved the call to RememberAllDependentForRebuilding before the update to pg_attribute, since it provides checks for some invalid/unsupported invocations it seems more useful to do it before changing anything. - I hadn't noticed before that AddRelationNewConstraints returns the new (cooked) default definitions, so we can use those instead of building them again. - cleaned up some includes I had added by mistake, and moved some tests around between the two commits A while back I also posted a fix for the issue of DROP EXPRESSION not working with subpartitions [1], this patch isn't ajusted yet to match, I would do that if the bugfix would be committed first. I am still hoping to get a reviewer for the in-person commitfest at the upcoming pgconf.dev :) It's my first contribution, but the change is pretty self-contained and hopefully not terribly complex to review. I'm trying to address a real world use case, it would be fantastic to make some progress with this patch. Anyone's motivated? :) Regards, Alberto [1] https://www.postgresql.org/message-id/DHMT78XOD8BK.341V3H87KZ7NO%40gmail.com -- Alberto Piai Sensational AG Zürich, Switzerland
-
Re: Adding a stored generated column without long-lived locks
Laurenz Albe <laurenz.albe@cybertec.at> — 2026-05-26T15:23:31Z
On Thu, 2026-05-14 at 15:46 -0700, Alberto Piai wrote: > > > On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote: > > > > > > > I recently needed to add a stored generated column to a table of > > > > nontrivial size, and realized that currently there is no way to do > > > > that without rewriting the table under an AccessExclusiveLock. > > The attached v4 is a rebase against current master I understand the need that the patch fulfills, and I agree that it would be a nice feature. I have a few thoughts about this that don't concern the implementation: 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...) and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision for ALTER TABLE ... ADD GENERATED ALWAYS AS (...). So this patch adds non-standard syntax that may one day conflict with a new version of the standard. I think we can still do it, and the proposed syntax looks right, but I thought I should mention it. 2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to change the generation expression of a column. This command always rewrites the table, according to the documentation. I think that if the present patch adds support to skip rewriting the table when a generation expression is added and the expression matches a check constraint, changing the generation expression should also be possible without a rewrite. If not, I would consider that a violation of the principle of least astonishment. Would it be difficult to extend the patch to support that? 3) We already have a couple of tricks to avoid blocking for a long time: - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there is a check constraint that makes sure that the column is NOT NULL - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition if there is a check constraint matching the partition constraint It would be great to document these little tricks in the documentation, probably on the ALTER TABLE page. This is not necessarily the job of this patch, but it would also not be off-topic for the patch. Comments on the patch: ---------------------- The patch applies and builds cleanly and passes the regression tests. Missing parts: - There is no documentation. At least ALTER TABLE needs a description of the new syntax, and would ideally mention the trick with the check constraint. - There should be support for command line completion for the new syntax. Bugs: - The patch doesn't test if the column is an identity column: CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY); INSERT INTO tab VALUES (DEFAULT); ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED; The ALTER TABLE should fail, but doesn't. - Strange behavior with sequences owned by the column: CREATE TABLE tab (id bigserial); INSERT INTO tab VALUES (DEFAULT); ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED; \ds tab_id_seq List of sequences Schema | Name | Type | Owner --------+------------+----------+---------- public | tab_id_seq | sequence | postgres (1 row) I think that any sequence owned by the column should be dropped. Alternatively, you could throw an error. - Incorrect handling of NULL values: CREATE TABLE tab (col1 integer, col2 integer); INSERT INTO tab VALUES (2, NULL); -- works, because NULL results from the check are accepted ALTER TABLE tab ADD CHECK (col2 = col1); SELECT pg_relation_filenode('tab'); pg_relation_filenode ---------------------- 19920 (1 row) ALTER TABLE tab ALTER col2 ADD GENERATED ALWAYS AS (col1) STORED; SELECT pg_relation_filenode('tab'); pg_relation_filenode ---------------------- 19920 (1 row) TABLE tab; col1 | col2 ------+------ 2 | ∅ (1 row) I am not sure what the correct approach would be. The simple approach would be to only skip the rewrite if the column has a NOT NULL constraint or an equivalent check constraint, but perhaps you can think of a way to do better. Comments on the code: > --- a/src/backend/commands/tablecmds.c > +++ b/src/backend/commands/tablecmds.c > @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, > ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context); > pass = AT_PASS_SET_EXPRESSION; > break; > + case AT_AddGeneratedAsExprStored: You should add a comment, same as for the other branches. > @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype) > return "ALTER COLUMN ... SET NOT NULL"; > case AT_SetExpression: > return "ALTER COLUMN ... SET EXPRESSION"; > + case AT_AddGeneratedAsExprStored: > + return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED"; Keep it short, like "ALTER COLUMN ... ADD GENERATED". > --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c > +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c > @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS) > case AT_SetNotNull: > strtype = "SET NOT NULL"; > break; > + case AT_AddGeneratedAsExprStored: > + strtype = "ADD GENERATED ALWAYS AS (...) STORED"; > + break; I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist. Yours, Laurenz Albe -
Re: Adding a stored generated column without long-lived locks
Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:43:53Z
Hi, On Fri May 15, 2026 at 12:46 AM CEST, Alberto Piai wrote: > On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote: >> On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote: >>> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote: >>> >>>> I recently needed to add a stored generated column to a table of >>>> nontrivial size, and realized that currently there is no way to do >>>> that without rewriting the table under an AccessExclusiveLock. here's a not-so-brief summary of the conversations around this topic at pgconf.dev, and a new proposal at the end. I had the chance to bring this up with other attendees, and many recognized the use case as a useful one, addressing a real operational issue. In particular, I had great feedback from Staš Kotarac Guček, who pointed out a major flaw in my current proposal: a constraint of the form CHECK (c = expr) would not work correctly when expr evaluates to null for some rows. Thank you Staš, in the next iteration I will change the constraint to use IS NOT DISTINCT FROM, instead. I briefly mentioned this topic to Tom Lane, who quickly replied with the question: should this not fail when it can't use the constraint, instead of overwriting the contents of the column? Thanks Tom, I will get to this later in this mail. I had registered this patch for the in-person commitfest at pgconf.dev, and Álvaro Herrera picked it up for review. Thank you Álvaro, and thank you Peter for organizing the event. We managed to find some time on the very last day of the conference, and went through the current design and code. The open items (which I will address in the next iteration of this patch) are: * missing user documentation I will work on this next. I think it's a good way to explain the feature even early during development. I just didn't want to do it _too_ early, without having had any feedback. * try to minimize command counter increments There might be one call to CommandCounterIncrement() which is not necessary, I'll try remove it. * comment on why it is necessary to clear missing values when rewriting the table ATExecAlterColumnType() and ATExecSetExpression() both do this explicitly when requesting a table rewrite. I'll extend the comment, and also look into whether this is something that should be done any time a table rewrite happens. In that case, it might be worth moving this into the rewriting code rather than having each caller do it. * interactions with other subcommands in the same alter table statement My reasoning regarding this was: if I do this in AT_PASS_SET_EXPRESSION, it should be safe. I will invest some more time into this and add tests, too. We also looked at the overall design of the new command, and we agreed that it is a fitting addition to our current SET EXPRESSION and DROP EXPRESSION. Regarding the question of whether it should be SET or ADD, we agreed that ADD (i.e. the current proposal) is clearer, especially for its similarity to ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY. Regarding the question of "should this fail or rewrite the table when a usable constraint isn't found": Álvaro's suggestion here was to use a more ad-hoc command, meant more specifically for this use case of converting into a stored generated column without rewriting it. If the command would be dedicated specifically to this, it would make sense to have it fail when a usable constraint isn't found. Last but not least, I also discussed this with Laurenz Albe, and he wrote a very useful review in this thread. I will address that separately and reply directly to that mail, but one point I can already merge in this discussion is about the syntax of the command: > 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...) > and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision > for ALTER TABLE ... ADD GENERATED ALWAYS AS (...). > So this patch adds non-standard syntax that may one day conflict with > a new version of the standard. I think we can still do it, and the > proposed syntax looks right, but I thought I should mention it. I'd like to take his point, together with the question from Tom and the suggestion by Álvaro, and make a new proposal for the design of this command. Design iteration 2 ------------------ Syntax: ALTER TABLE t ALTER COLUMN c ADD GENERATED ALWAYS AS (expr) STORED USING CONSTRAINT check_name check_name must be a valid constraint of the form CHECK (c IS NOT DISTINCT FROM (expr)) This fails if: - a check constraint named check_name is not found for table c - the constraint is not valid - the constraint does not match exactly the expr the user intends to use as a stored default expression On success, the table c is now a stored generated column with the given default expression, and the check_name constraint has been removed. This addresses Tom's remark, we can now fail instead of just rewriting the column. It improves slightly upon the issue of a potential conflict with a future edition of the SQL standard, by being more specific. I don't see a way to be completely sure we won't have conflicts. We could improve more by making the syntax more "alien" and very unlinkely to be picked up by the standard, but at a usability cost for Postgres. I'm open to suggestions. It improves upon another question raised by Álvaro: does the user have to clean up the constraint? In v1 I felt it was better to have the user remove it after the migration. Since here it's explicitly mentioned as the constraint to use to migrate the column, I think it's OK to remove it. We are conceptually moving it from being a constraint to being the new default expression. The implementation should also be simpler, since there will never be a table rewrite. Any thoughts about this? Best regards, Alberto -- Alberto Piai Sensational AG Zürich, Switzerland -
Re: Adding a stored generated column without long-lived locks
Alberto Piai <alberto.piai@gmail.com> — 2026-05-27T17:44:23Z
On Tue May 26, 2026 at 5:23 PM CEST, Laurenz Albe wrote: > > 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...) > and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision > for ALTER TABLE ... ADD GENERATED ALWAYS AS (...). > So this patch adds non-standard syntax that may one day conflict with > a new version of the standard. I think we can still do it, and the > proposed syntax looks right, but I thought I should mention it. Thank you for bringing this up. I don't have access to the standard, but the chance of a possible conflict with future editions was at the back of my mind. I don't see a way to exclude it completely. In a sibling mail in this thread (you should be in CC), I have made a new iteration on this proposal, which also tries to make the command more specific to avoid future conflicts. > 2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to > change the generation expression of a column. This command always > rewrites the table, according to the documentation. > I think that if the present patch adds support to skip rewriting the table > when a generation expression is added and the expression matches a check > constraint, changing the generation expression should also be possible > without a rewrite. If not, I would consider that a violation of the > principle of least astonishment. > Would it be difficult to extend the patch to support that? Yes, I don't see a way to make that work. Since we're talking only about stored values, a rewrite will always be necessary. However, using this new command, a user could add a column with the new expression, then atomically drop the old one and rename. All without holding onto an AccessExclusiveLock for a long time :) > 3) We already have a couple of tricks to avoid blocking for a long time: > > - ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there > is a check constraint that makes sure that the column is NOT NULL > > - ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition > if there is a check constraint matching the partition constraint > > It would be great to document these little tricks in the documentation, > probably on the ALTER TABLE page. This is not necessarily the job of > this patch, but it would also not be off-topic for the patch. The SET NOT NULL one and the ATTACH PARTITION one are documented in the section specific to the command. However or, if an equivalent index already exists, it will be attached to the target table's index, as if ALTER INDEX ATTACH PARTITION had been executed is not very explicit about the advantages this has for online migrations. In the NOTES section of the ALTER TABLE page, there is a paragraph about NOT VALID / VALIDATE, which is another operation in the same spirit as this. Maybe we could group them all in a new section dedicated to online schema migrations? (However, even if it's definitely on-topic with this patch, I would work on this in a separate patch / email thread.) > Missing parts: > > - There is no documentation. At least ALTER TABLE needs a description of the > new syntax, and would ideally mention the trick with the check constraint. Yes, I will work on this next. I also believe it's a great way to show a feature, even early during development. I just wanted to avoid doing it _too_ early, before having had any feedback about the idea. > - There should be support for command line completion for the new syntax. Great idea, I'll add this too. > Bugs: > > - The patch doesn't test if the column is an identity column: > > CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY); > INSERT INTO tab VALUES (DEFAULT); > ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED; > > The ALTER TABLE should fail, but doesn't. > > - Strange behavior with sequences owned by the column: > > CREATE TABLE tab (id bigserial); > INSERT INTO tab VALUES (DEFAULT); > ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED; > \ds tab_id_seq > List of sequences > Schema | Name | Type | Owner > --------+------------+----------+---------- > public | tab_id_seq | sequence | postgres > (1 row) > > I think that any sequence owned by the column should be dropped. > Alternatively, you could throw an error. Thanks for testing this! I have reused RememberAllDependentForRebuilding() which does some validation, but was originally meant for ALTER COLUMN TYPE. I will add checks and tests for these cases, but to be consistent with how the other dependencies are handled, I think it's better to throw an error here (this is what happens for example if trying to ALTER TYPE of a column used by a function). > > - Incorrect handling of NULL values: See sibling mail, in the next iteration the constraint will have to use IS NOT DISTINCT FROM. I think that should cover all cases. > > Comments on the code: > >> --- a/src/backend/commands/tablecmds.c >> +++ b/src/backend/commands/tablecmds.c >> @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, >> ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context); >> pass = AT_PASS_SET_EXPRESSION; >> break; >> + case AT_AddGeneratedAsExprStored: > > You should add a comment, same as for the other branches. > >> @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype) >> return "ALTER COLUMN ... SET NOT NULL"; >> case AT_SetExpression: >> return "ALTER COLUMN ... SET EXPRESSION"; >> + case AT_AddGeneratedAsExprStored: >> + return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED"; > > Keep it short, like "ALTER COLUMN ... ADD GENERATED". > >> --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c >> +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c >> @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS) >> case AT_SetNotNull: >> strtype = "SET NOT NULL"; >> break; >> + case AT_AddGeneratedAsExprStored: >> + strtype = "ADD GENERATED ALWAYS AS (...) STORED"; >> + break; > > I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist. Agreed, will fix all these in the next version of the patch. Thank you again for the review! Best regards, Alberto -- Alberto Piai Sensational AG Zürich, Switzerland