Thread
-
Re: Support EXCEPT for TABLES IN SCHEMA publications
Peter Smith <smithpb2250@gmail.com> — 2026-05-15T07:19:03Z
Hi Nisha. Some review comments for patch v5-0001. ====== doc/src/sgml/ref/create_publication.sgml (FOR TABLES IN SCHEMA) 1. + Tables listed in <literal>EXCEPT</literal> for a given schema are + excluded from the publication. /Tables listed in EXCEPT.../Tables listed in the EXCEPT clause.../ There is a similar problem with the existing *head* docs in the "FOR ALL TABLES" part. This should be fixed, too. /Tables listed in EXCEPT clause.../Tables listed in the EXCEPT clause.../ ====== src/backend/catalog/pg_publication.c publication_add_relation: 2. + HeapTuple existing; Not sure if this is the best name. How about "tup"? ~~~ 3. + bool is_except = existing_form->prexcept; This variable is used only once. Not sure if that vindicates having it. ~~~ 4. + ereport(ERROR, + (errcode(ERRCODE_DUPLICATE_OBJECT), + errmsg("table \"%s.%s\" cannot be added because it is listed in EXCEPT clause of publication \"%s\"", + get_namespace_name(RelationGetNamespace(targetrel)), + RelationGetRelationName(targetrel), + pub->name))); 4a. There is a recently committed function which will give the fully-qualified rel name, so you can use "%s" instead of "%s.%s". ~ 4b. The message seems a bit wordy. BEFORE "table \"%s.%s\" cannot be added because it is listed in EXCEPT clause of publication \"%s\"" SUGGESTION "table \"%s\" cannot be added because it is excluded from publication \"%s\"" ~~~ 5. + * For ALTER PUBLICATION, invalidation is needed when adding an EXCEPT + * table to either: - a FOR ALL TABLES publication (pub->alltables is + * true), or - a FOR TABLES IN SCHEMA publication (is_schema_publication + * is true). * - * For ALTER PUBLICATION, invalidation is needed only when adding an - * EXCEPT table to a publication already marked as ALL TABLES. For - * publications that were originally empty or defined as ALL SEQUENCES and - * are being converted to ALL TABLES, invalidation is skipped here, as - * AlterPublicationAllFlags() function invalidates all relations while - * marking the publication as ALL TABLES publication. + * The exception: when a publication is being converted to FOR ALL TABLES + * (pub->alltables is still false at this point), + * AlterPublicationAllFlags() will perform a full invalidation, so we skip + * it here. */ - inval_except_table = (alter_stmt != NULL) && pub->alltables && - (alter_stmt->for_all_tables && pri->except); + inval_except_table = (alter_stmt != NULL) && pri->except && + (pub->alltables + ? alter_stmt->for_all_tables + : is_schema_publication(pubid)); Is all this comment and logic about "ALTER PUBLICATION" a bit premature? AFAIK, patch 0001 is not yet doing anything for ALTER PUBLICATION, so it looks like all this belongs in a later patch. ~~~ GetIncludedPublicationRelations: 6. - * This should only be used FOR ALL TABLES publications. + * This is used for FOR ALL TABLES and FOR TABLES IN SCHEMA publications, + * both of which support EXCEPT TABLE. So, because it might come from 2 places, shouldn't the assert in this function be modified? Also, since the assert was not yet modified, how does this even pass the tests if 'alltables' was false? ~~~ GetAllSchemaPublicationRelations: 7. + /* get the list of tables excluded via EXCEPT TABLE for this publication */ + if (pubschemalist != NIL) + exceptlist = get_publication_relations(pubid, pub_partopt, true); + Should this be calling 'GetExcludedPublicationTables' instead of directly calling get_publication_relations? ~~~ 8. + list_free(schemaRels); + } + else + result = list_concat(result, schemaRels); Why is 'schemaRels' only being freed when there is an EXCEPT? ====== src/backend/commands/publicationcmds.c CreatePublication: 9. List *rels; + ListCell *lc; Looks like this can be declared at a lower scope. ~~~ 10. + foreach(lc, rels) + { + PublicationRelInfo *pri = (PublicationRelInfo *) lfirst(lc); + + explicitrelids = lappend_oid(explicitrelids, + RelationGetRelid(pri->relation)); + } Maybe this loop can be made neater using foreach_ptr(). ~~~ 11. + foreach(lc, rels) + { + PublicationRelInfo *pri = (PublicationRelInfo *) lfirst(lc); + Oid relid = RelationGetRelid(pri->relation); + Oid relns = RelationGetNamespace(pri->relation); 11a. Maybe this loop can be made neater using foreach_ptr(). ~ 11b. Maybe 'except_relid' or 'excluded_relid' is a more meaningful varname. I also felt that some other names are a bit confusing. - e.g. 'exceptrelations' vs 'rels' ... IIUC, both of these are lists of excepted relations/tables, although 1st is a list of PublicationTable and 2nd is a list of PublicationRelInfo ~~~ 12. + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("table \"%s.%s\" cannot appear in both the table list and the EXCEPT clause", + get_namespace_name(relns), + RelationGetRelationName(pri->relation))); This can make use of a recently committed function that returns the schem-qualifier relname so you can use "%s" instead of "%s.%s". ====== src/backend/parser/gram.y 13. - * TABLES IN SCHEMA schema [, ...] + * TABLES IN SCHEMA schema [EXCEPT ( table [, ...] )] [, ...] In the comment just before this one, the FOR ALL TABLES EXCEPT was written as "[EXCEPT (TABLE table [, ...] )]". So, I think this should be the same for consistency. ~~~ 14. - | ColId opt_column_list OptWhereClause + | ColId opt_column_list OptWhereClause opt_pub_except_clause { $$ = makeNode(PublicationObjSpec); $$->pubobjtype = PUBLICATIONOBJ_CONTINUATION; + $$->except_tables = $4; This seems suspicious. You cannot have an EXCEPT clause when there is a column list or a WHERE clause, so what is this scenario? Maybe the "$$->except_tables = $4;" needs to be moved to the 'else' block? ~~~ preprocess_pubobj_list: 15. + /* Flatten EXCEPT entries into the top-level list */ + foreach(lc, pubobj->except_tables) Maybe the loop can be made neater by using foreach_ptr. ~~~ 16. + if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLES_IN_SCHEMA) + { + if (eobj->pubtable->relation->schemaname == NULL) + eobj->pubtable->relation->schemaname = pubobj->name; + else if (strcmp(eobj->pubtable->relation->schemaname, + pubobj->name) != 0) + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("table \"%s.%s\" in EXCEPT clause does not belong to schema \"%s\"", + eobj->pubtable->relation->schemaname, + eobj->pubtable->relation->relname, + pubobj->name), + parser_errposition(eobj->location)); 16a. Introducing some more variables (like 'eobj_schemaname' and 'eobj_relname') would simplify this code quite a bit. ~ 16b. Should make use of the recently committed function that gets fully-qualified rel names so you can use "%s" instead of "%s.%s". ====== src/backend/replication/pgoutput/pgoutput.c get_rel_sync_entry: 17. + if (am_partition) + { + List *pancestore = get_partition_ancestors(relid); + + schemaExceptPubids = + GetRelationExcludedPublications(llast_oid(pancestore)); + list_free(pancestore); + } + else + schemaExceptPubids = GetRelationExcludedPublications(relid); 17a. Typo? "pancestore" -- what's the 'e'? ~ 17b. Anyway, would it be simpler to just get the Oid up-front? e.g. Oid oid_root = llast_oid(get_partition_ancestors(relid)); ~~~ 18. + /* + * The ancestor is only considered published if it is not + * in the EXCEPT clause of this schema publication. + * GetTopMostAncestorInPublication checks schema + * membership but does not account for the EXCEPT list, so + * we must filter that out here. + */ So, then maybe the logic to account for EXCEPT TABLE *should* be encapsulated within the GetTopMostAncestorInPublication, instead of the tack-on extra filtering needed here? ====== src/bin/psql/tab-complete.in.c 19. + if (strchr(previous_words[3], ',') == NULL) + { + set_completion_reference(previous_words[3]); + COMPLETE_WITH_QUERY_VERBATIM(Query_for_list_of_tables_in_schema); + } IIUC, you are not supposed to reference 'previous_words' directly like this, so maybe this should be 'prev4_wd'? ~~~ 20. + char *schema_word = previous_words[previous_words_count - 8]; This looks suspicious. I don't see similar code to this anywhere else. (??) ====== src/include/nodes/parsenodes.h 21. + List *except_tables; /* tables to exclude (for TABLES IN SCHEMA) */ ParseLoc location; /* token location, or -1 if unknown */ } PublicationObjSpec; Maybe the wording should be worded more like the existing one in PublicationAllObjSpec. SUGGESTION tables specified in the EXCEPT clause (for TABLES IN SCHEMA) ====== src/test/regress/sql/publication.sql 22. +-- Exclude multiple tables using unqualified names (implicitly qualified with the schema) +CREATE PUBLICATION testpub_schema_except2 + FOR TABLES IN SCHEMA pub_test EXCEPT (TABLE testpub_nopk, testpub_tbl_s1); +\dRp+ testpub_schema_except2 A slightly more sophisticated test might have those same excluded table names in the 'public' schema as well as in the schema with the EXCEPT clause, so then you can verify it is not getting them mixed up. ~~~ 23. +-- fail: EXCEPT is not allowed for FOR TABLE publications +CREATE PUBLICATION testpub_except_err + FOR TABLE pub_test.testpub_tbl_s1, testpub_tbl_s2 EXCEPT (TABLE pub_test.testpub_nopk); Hmm. Doesn't this test belong elsewhere instead of in the "EXCEPT tests for TABLES IN SCHEMA" group? ~~~ 24. MORE TEST CASES (below are in no particular order) 24a. fail - when a table in the schema EXCEPT clause is also specified by FOR TABLE ~ 24b. fail - when a non-existing table is specified in the EXCEPT clause. ~ 24c. fail - when trying to exclude partitions. ~ 24d. pass - publication with multiple schemas and multiple EXCEPT clauses ~ 24e. fail - when the TABLE keyword is omitted. ====== src/test/subscription/t/037_except.pl 25. +my $schema_pub = + "CREATE PUBLICATION tap_pub_part FOR TABLES IN SCHEMA public EXCEPT (TABLE public.root1)"; +test_except_root_partition('false', + "$schema_pub WITH (publish_via_partition_root = false)"); +test_except_root_partition('true', + "$schema_pub WITH (publish_via_partition_root = true)"); The "WITH (publish_via_partition_root = true/false)" part should be automatically appended within the subroutine, based on the other boolean parameter. ~~~ 26. + CREATE TABLE sch1.tab_pub AS SELECT generate_series(1,5) AS a; + CREATE TABLE sch1.tab_exc AS SELECT generate_series(1,5) AS a; + CREATE TABLE sch1.par (a int); + CREATE TABLE sch1.chi (b int) INHERITS (sch1.par); The abbreviated table names like 'tab_exc', 'par', 'chi' (instead of 'tab_excluded', 'parent', 'child') are not saving much at all, but they are obfuscating the meaning. Descriptive names would be better, ~~~ 27. +# Truncate chi on the publisher so the next test starts with a clean slate. +# (The previous test inserted rows into chi that would otherwise be copied by +# the initial table sync of the next subscription.) The comment explains the setup, but does not actually say what the test does -- e.g. you are testing excluding ONLY the parent, so in this case the child is expected to be replicated. ~~~ 28. +$node_subscriber->safe_psql('postgres', 'DROP SUBSCRIPTION sch_sub'); +$node_publisher->safe_psql('postgres', 'DROP PUBLICATION sch_pub'); +$node_publisher->safe_psql('postgres', + 'TRUNCATE sch1.par, sch1.chi, sch1.tab_exc'); +$node_subscriber->safe_psql('postgres', + 'TRUNCATE sch1.par, sch1.chi, sch1.tab_pub, sch1.tab_exc'); Are those TRUNCATEs needed? Won't the tables get trashed anyway by CASCADE in the next ("Cleanup schema tables") part? ====== Kind Regards, Peter Smith. Fujitsu Australia