Thread

  1. Re: Skipping schema changes in publication

    Shlok Kyal <shlok.kyal.oss@gmail.com> — 2025-12-16T09:20:41Z

    On Thu, 11 Dec 2025 at 04:10, Peter Smith <smithpb2250@gmail.com> wrote:
    >
    > Hi Shlok -
    >
    > Here are some review comments for v31-0001 (EXCEPT (tablelist))
    >
    > ======
    > Commit message
    >
    > 1.
    > The new syntax allows specifying excluded relations when creating or altering
    > a publication. For example:
    > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE (t1,t2);
    >
    > ~
    >
    > In v30, you removed all the ALTER PUBLICATION changes, so the "or
    > altering" in the message above also needs to be removed.
    >
    > ======
    > doc/src/sgml/logical-replication.sgml
    >
    > 2.
    >    <para>
    > -   To add tables to a publication, the user must have ownership rights on the
    > -   table. To add all tables in schema to a publication, the user must be a
    > -   superuser. To create a publication that publishes all tables, all tables in
    > -   schema, or all sequences automatically, the user must be a superuser.
    > +   To create a publication using <literal>FOR ALL TABLES</literal>,
    > +   <literal>FOR ALL SEQUENCES</literal> or
    > +   <literal>FOR TABLES IN SCHEMA</literal>, the user must be a
    > superuser. To add
    > +   <literal>ALL TABLES</literal> or <literal>TABLES IN SCHEMA</literal> to a
    > +   publication, the user must be a superuser. To add tables to a publication,
    > +   the user must have ownership rights on the table.
    >    </para>
    >
    > This is a good improvement, but I was not sure why it is in this
    > patch. Should it be a separate thread for a docs improvement?
    >
    > ======
    > src/backend/catalog/pg_publication.c
    >
    > GetTopMostAncestorInPublication:
    >
    > 3.
    > {
    >   Oid ancestor = lfirst_oid(lc);
    > - List    *apubids = GetRelationPublications(ancestor);
    > - List    *aschemaPubids = NIL;
    > + List    *apubids = NIL;
    > + List    *aexceptpubids = NIL;
    > + List    *aschemapubids = NIL;
    > + bool set_top = false;
    > +
    > + GetRelationPublications(ancestor, &apubids, &aexceptpubids);
    >
    >   level++;
    >
    > - if (list_member_oid(apubids, puboid))
    > + /* check if member of table publications */
    > + set_top = list_member_oid(apubids, puboid);
    > + if (!set_top)
    >   {
    > - topmost_relid = ancestor;
    > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
    >
    > - if (ancestor_level)
    > - *ancestor_level = level;
    > + /* check if member of schema publications */
    > + set_top = list_member_oid(aschemapubids, puboid);
    > +
    > + /*
    > + * If the publication is all tables publication and the table is
    > + * not part of exception tables.
    > + */
    > + if (!set_top && puballtables)
    > + set_top = !list_member_oid(aexceptpubids, puboid);
    >   }
    > - else
    > +
    > + if (set_top)
    >   {
    > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
    > - if (list_member_oid(aschemaPubids, puboid))
    > - {
    > - topmost_relid = ancestor;
    > + topmost_relid = ancestor;
    >
    > - if (ancestor_level)
    > - *ancestor_level = level;
    > - }
    > + if (ancestor_level)
    > + *ancestor_level = level;
    >   }
    >
    >   list_free(apubids);
    > - list_free(aschemaPubids);
    > + list_free(aschemapubids);
    > + list_free(aexceptpubids);
    >   }
    >
    > That 'aschemapubids' can be declared and freed within the if block.
    >
    > ~~~
    >
    > publication_add_relation:
    >
    > 4.
    > + /*
    > + * Check when a partition is excluded via EXCEPT TABLE while the
    > + * publication has publish_via_partition_root = true.
    > + */
    > + if (pub->alltables && pub->pubviaroot && pri->except &&
    > + targetrel->rd_rel->relispartition)
    > + ereport(WARNING,
    >
    >
    > This comment doesn't sound quite right:
    >
    > SUGGESTION
    > Handle the case where a partition is excluded by EXCEPT TABLE while
    > publish_via_partition_root = true.
    >
    > ~~~
    >
    > 5.
    > + /*
    > + * Check when a partitioned table is excluded via EXCEPT TABLE while the
    > + * publication has publish_via_partition_root = false.
    > + */
    > + if (pub->alltables && !pub->pubviaroot && pri->except &&
    > + targetrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
    > + ereport(WARNING,
    >
    > Ditto. Reword like suggested in the previous review comment.
    >
    > ~~~
    >
    > 6.
    > +/*
    > + * Get the list of publication oids associated with a specified relation.
    > + * pubids is filled with the list of publication oids the relation is part of.
    > + * except_pubids is filled with the list of publication oids the relation is
    > + * excluded from.
    > + *
    > + * This function returns true if the relation is part of any publication.
    > + */
    >
    > Maybe putting 'pubids' and 'except_pubids' in single quotes will help
    > readability of this comment?
    >
    > Also, these are already Lists, so they are not filled with lists.
    >
    > SUGGESTION
    > Parameter 'pubids' returns the OIDs of the publications the relation is part of.
    > Parameter 'except_pubids' returns the OIDs of publications the
    > relation is excluded from.
    >
    > ~~~
    >
    > GetPublicationRelations:
    >
    > 7.
    >  /*
    > - * Gets list of relation oids for a publication.
    > + * Return the list of relation OIDs for a publication.
    > + *
    > + * For a FOR ALL TABLES publication, this returns the list of tables that were
    > + * explicitly excluded via an EXCEPT TABLE clause.
    > + *
    > + * For a FOR TABLE publication, this returns the list of tables explicitly
    > + * included in the publication.
    >   *
    > - * This should only be used FOR TABLE publications, the FOR ALL
    > TABLES/SEQUENCES
    > - * should use GetAllPublicationRelations().
    > + * Publications declared with FOR ALL TABLES or FOR ALL SEQUENCES should use
    > + * GetAllPublicationRelations() to obtain the complete set of tables covered by
    > + * the publication.
    >   */
    >  List *
    >  GetPublicationRelations(Oid pubid, PublicationPartOpt pub_partopt)
    >
    > 7a.
    > The function is called 'GetPublicationRelations', so it seems
    > unintuitive that it sometimes returns the list of all the tables that
    > are *excluded* from the publication. If you are going to have one
    > single function that does everything, then IMO it might be better to
    > hide that behind some wrapper functions like:
    > GetPublicationMemberRelations
    > GetPublicationExcludedRelations
    >
    > Consider also that all these assumptions might be OK today but they
    > won't be OK in the future. e.g. One day, when named FOR SEQUENCE
    > sq1,sq2 are supported then you will be alble to write a command like
    > FOR ALL TABLES EXCEPT (t1), FOR SEQUENCE sq1,sq2. That's going to be a
    > muddle of some included and some excluded relations. So, it is better
    > to cater for that scenario now, rather than have to rewrite all of
    > this function again in the future. e.g. Maybe instead of this function
    > returning one list it is better to return included/excluded Lists or
    > relations as output parameters?
    >
    > ~
    >
    > 7b.
    > Also, comments like "Publications declared with FOR ALL TABLES or FOR
    > ALL SEQUENCES should use..." seems like too many assumptions are being
    > made. It would be better to enforce the calling requirements using
    > parameter checking and Asserts instead instead of hoping that callers
    > are going to abide by the comments.
    >
    > ~~~
    >
    > GetAllPublicationRelations:
    >
    > 8.
    > + exceptlist = GetPublicationRelations(pubid, pubviaroot ?
    > + PUBLICATION_PART_ALL :
    > + PUBLICATION_PART_ROOT);
    >
    > This is similar to the above review comment. I'm not sure how you can
    > just assume that this must be the "except list" -- AFAICT this assumes
    > that 'GetAllPublicationRelations' can only be called by FOR ALL TABLES
    > (???). Seems like a lot of assumptions, that would be much better to
    > be enforced by Asserts in the code.
    >
    I agree with comments 7 and 8. I have added two functions
    'GetPublicationIncludedRelations' and
    'GetPublicationExcludedRelations'. To get Relations which are included
    or excluded in a publication.
    Both functions will call 'GetPublicationRelationsInternal' function. I
    have also reintroduced the 'except_flag' variable
    
    > ======
    > src/backend/commands/publicationcmds.c
    >
    > pub_rf_contains_invalid_column:
    >
    > 9.
    >  bool
    >  pub_rf_contains_invalid_column(Oid pubid, Relation relation, List *ancestors,
    > -    bool pubviaroot)
    > +    bool pubviaroot, bool puballtables)
    >
    > I felt that 'puballtables' is more "important" than 'pubviaroot' so
    > maybe it should come earlier in the parameter list. (e.g. make it more
    > similar to 'pub_contains_invalid_column')
    >
    > ======
    > src/backend/parser/gram.y
    >
    > 10.
    > + pub_except_obj_list opt_except_clause
    >
    > I felt that 'opt_except_clause' should better be called
    > 'opt_pub_except_clause' or 'pub_opt_except_clause' because without
    > 'pub' it is a bit vague.
    >
    I agree. I prefer 'opt_pub_except_clause'. By looking at other
    variables it better make sense to start the variable name with 'opt_'
    as it indicates that it is optional.
    Made changes for the same.
    
    > ~~~
    >
    > 11.
    > +/*
    > + * ALL TABLES EXCEPT ( table_name [, ...] ) specification
    > + */
    >
    > 11a
    > This comment should be up where all the other CREATE PUBLICATION
    > syntax is commented.
    >
    > ~
    >
    > 11b.
    > Also, there is a missing optional "[TABLE]" part.
    >
    > ~~~
    >
    > 12.
    > +pub_except_obj_list: PublicationExceptObjSpec
    > + { $$ = list_make1($1); }
    > + | pub_except_obj_list ',' PublicationExceptObjSpec
    > + { $$ = lappend($1, $3); }
    > + ;
    > +
    > +opt_except_clause:
    > + EXCEPT opt_table '(' pub_except_obj_list ')' { $$ = $4; }
    > + | /*EMPTY*/ { $$ = NIL; }
    > + ;
    >
    > I felt the clause should be defined before the obj list because that
    > seems the natural order to read these.
    >
    > ======
    > src/bin/pg_dump/pg_dump.c
    >
    > 13.
    > +static SimplePtrList exceptinfo = {NULL, NULL};
    >
    > Having this as global seems a bit hacky. It has nothing in common with
    > all the other nearby lists, which are commented as being based on
    > "patterns given by command-line switches"
    >
    I agree, I have added it in the PublicationInfo struct and made the
    corresponding code changes.
    
    > ~~~
    >
    > dumpPublication:
    >
    > 14.
    > + /* Include exception tables if the publication has EXCEPT TABLEs */
    > + for (SimplePtrListCell *cell = exceptinfo.head; cell; cell = cell->next)
    > + {
    > + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr;
    > + TableInfo  *tbinfo;
    > +
    > + if (pubinfo == pubrinfo->publication)
    > + {
    > + tbinfo = pubrinfo->pubtable;
    >
    > That 'tbinfo' can be declared within the "if".
    >
    > ~~~
    >
    > 15.
    > + appendPQExpBuffer(query, "ONLY %s", fmtQualifiedDumpable(tbinfo));
    >
    > ONLY is not the default. How did you decide that "ONLY" is the correct
    > thing to do here?
    >
    For pg_dump for publication we use "ONLY" by default while specifying the table
    
    For Alter publication we use similar thing:
    ```
      appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
                fmtId(pubinfo->dobj.name));
    ```
    
    Also if we specify a parent table in a publication(without ONLY) all
    its child tables are also added to the pg_publication_rel table.
    So when we dump such a publication we get something like:
    .... EXCEPT TABLE(ONLY parent_table, ONLY child_table)...
    
    > ~~~
    >
    > getPublicationTables:
    >
    > 16.
    > - pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
    > + if (prexcept)
    > + pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL;
    > + else
    > + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL;
    > +
    >
    > Would a single assignment (ternary) make this code simpler and easier to read?
    >
    > SUGGESTION
    > pubrinfo[j].dobj.objType = prexcept ?
    >   DO_PUBLICATION_EXCEPT_REL :
    >   DO_PUBLICATION_REL;
    >
    > ======
    > src/bin/pg_dump/t/002_pg_dump.pl
    >
    > 17.
    > + 'CREATE PUBLICATION pub10' => {
    > + create_order => 50,
    > + create_sql =>
    > +   'CREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE
    > (dump_test.test_table_generated);',
    > + regexp => qr/^
    > + \QCREATE PUBLICATION pub10 FOR ALL TABLES EXCEPT TABLE (ONLY
    > dump_test.test_table_generated, ONLY
    > dump_test.test_table_generated_child2, ONLY
    > dump_test.test_table_generated_child1) WITH (publish = 'insert,
    > update, delete, truncate');\E
    > + /xm,
    > + like => { %full_runs, section_post_data => 1, },
    > + },
    > +
    >
    > These "generated" names seem unusual. I saw there are some other
    > tables like 'dump_test.test_inheritance_child' and
    > 'dump_test.test_inheritance_parent'. Can you use those more normal
    > table names instead?
    >
    > Also curious - does the order of the tests matter? I saw that the
    > CREATE TABLE tests seem to be coming after the CREATE PUBLICATION
    > tests that are using them.
    >
    I looked into it and came to the conclusion that this is controlled
    using 'create_order' while specifying the tests.
    Tests with a lower create_order value are executed earlier.
    So to ensure 'CREATE PUBLICATION' runs correctly we have to make sure
    the 'create_order' of these statements is higher than that of the
    respective 'CREATE TABLE' statement.
    
    > ~~~
    > 18.
    > - if (!defined($tests{$test}->{all_runs})
    > + if (   !defined($tests{$test}->{all_runs})
    >
    > Why add this whitespace?
    >
    pg_perltidy makes this change. I have reverted it.
    
    > ======
    > src/include/nodes/parsenodes.h
    >
    > 19.
    >   AP_SetObjects, /* set list of objects */
    > + AP_Reset, /* reset the publication */
    >  } AlterPublicationAction;
    >
    > AFAIK, you removed all ALTER command changes from v30-0001. So this
    > should not be here.
    >
    > ~~~
    >
    > 20.
    > + bool for_all_tables; /* Special publication for all tables in db */
    >   AlterPublicationAction action; /* What action to perform with the given
    >   * objects */
    >  } AlterPublicationStmt;
    >
    > AFAIK, you removed all ALTER command changes from v30-0001. So this
    > should not be here.
    >
    > ======
    > src/test/regress/sql/publication.sql
    >
    > 21.
    > +SET client_min_messages = 'ERROR';
    > +CREATE PUBLICATION testpub_foralltables_excepttable FOR ALL TABLES
    > EXCEPT TABLE (testpub_tbl1, testpub_tbl2);
    > +-- specify EXCEPT without TABLE
    > +CREATE PUBLICATION testpub_foralltables_excepttable1 FOR ALL TABLES
    > EXCEPT (testpub_tbl1);
    >
    > Should be 2 comments here for the 2x CREATE:
    >
    > # Exclude tables using FOR ALL TABLES EXCEPT TABLE (tablelist)
    >
    > # Exclude tables using FOR ALL TABLES EXCEPT (tablelist)
    >
    > ~~~
    >
    > 22.
    >  CREATE TABLE testpub_tbl3 (a int);
    >  CREATE TABLE testpub_tbl3a (b text) INHERITS (testpub_tbl3);
    >
    > If you rename these tables like 'testpub_tbl_parent' and
    > 'testpub_tbl_child', it will be much easier to see what is going on.
    >
    > ~~~
    >
    > 23.
    > +CREATE PUBLICATION testpub5 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3);
    >
    > Missing comment -- something like:
    > # Exclude parent table, omitting both of 'ONLY' and '*'
    >
    > ~~~
    >
    > 24.
    > +-- EXCEPT with wildcard: exclude table and all descendants
    > +CREATE PUBLICATION testpub6 FOR ALL TABLES EXCEPT TABLE (testpub_tbl3*);
    >
    > 24a.
    > TBH, I don't think this is a "wildcard" -- it is not doing any pattern
    > matching. IMO just call it an "asterisk" or a "star".
    >
    > ~
    >
    > 24b.
    > And put a space before the '*' here.
    >
    > ======
    > .../t/037_rep_changes_except_table.pl
    >
    > 25.
    > +# ============================================
    > +# EXCEPT TABLE test cases for partition tables
    > +# ============================================
    > +# Check behavior of EXCEPT TABLE together with publish_via_partition_root
    > +# when applied to a partitioned table and its partitions.
    >
    >
    > Really, that "Check behavior" sentence is generic for all of the
    > following tests, so it should also be (within the "=======" of the
    > previous comment)
    >
    > ~~~
    >
    > 26.
    > +$node_publisher->safe_psql(
    > + 'postgres', qq(
    > + CREATE TABLE sch1.t1(a int) PARTITION BY RANGE(a);
    > + CREATE TABLE sch1.part1 PARTITION OF sch1.t1 FOR VALUES FROM (0) TO (5);
    > + CREATE TABLE sch1.part2 PARTITION OF sch1.t1 FOR VALUES FROM (6) TO (10);
    > + INSERT INTO sch1.t1 VALUES (1), (6);
    > +));
    > +
    > +$node_subscriber->safe_psql(
    > + 'postgres', qq(
    > + CREATE TABLE sch1.t1(a int);
    > + CREATE TABLE sch1.part1(a int);
    > + CREATE TABLE sch1.part2(a int);
    > +));
    >
    > 26a.
    > There should be a comment for this part that just says something like
    > "Setup partition table and partitions on the publisher that map to
    > normal tables on the subscriber"
    >
    > ~
    >
    > 26b.
    > The INSERT should be done later, after the CREATE PUBLICATION but
    > before the CREATE SUBSCRIPTION. The pattern will be the same for all
    > the test cases.
    >
    > ~~~
    >
    > 27.
    > +$node_publisher->safe_psql('postgres',
    > + "CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.part1)"
    > +);
    >
    > Even though the publish_via_partition_root is 'false' by default, I
    > think you should spell it out explicitly here for clarity.
    >
    > ~~~
    >
    > 28.
    > +# EXCEPT TABLE (sch1.t1) with publish_via_partition_root = false
    > +# Excluding the partitioned table while publish_via_partition_root = false
    > +# still allows rows inserted into its partitions to be replicated.
    >
    > I felt you should word this differently. I don't think you should say
    > "inserted into its partitions" because actually, you inserted into the
    > partition table, and the data just ends up in the partitions.
    >
    > ~~~
    >
    > 29.
    > +$node_publisher->safe_psql(
    > + 'postgres', qq(
    > + CREATE PUBLICATION tap_pub_part FOR ALL TABLES EXCEPT TABLE (sch1.t1);
    > + INSERT INTO sch1.t1 VALUES (1), (6);
    > +));
    >
    > Ditto earlier comment. Better to explicitly say
    > "publish_via_partition_root=false".
    >
    I have also addressed the remaining comments and attached the latest patch.
    
    Thanks,
    Shlok Kyal