Thread

  1. Re: Proposal: Conflict log history table for Logical Replication

    vignesh C <vignesh21@gmail.com> — 2025-12-12T05:57:18Z

    On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut@gmail.com> wrote:
    >
    > On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
    > >
    > > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
    > > >
    > > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
    > > >
    > > > > 2)
    > > > > When we do below:
    > > > > alter subscription sub1 SET (conflict_log_table=clt2);
    > > > >
    > > > > the previous conflict log table is dropped. Is this behavior
    > > > > intentional and discussed/concluded earlier? It’s possible that a user
    > > > > may want to create a new conflict log table for future events while
    > > > > still retaining the old one for analysis. If the subscription itself
    > > > > is dropped, then dropping the CLT makes sense, but I’m not sure this
    > > > > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
    > > > > once we unlink CLT from subscription, later even DROP subscription
    > > > > cannot drop it, but user can always drop it when not needed.
    > > > >
    > > > > If we plan to keep existing behavior, it should be clearly documented
    > > > > in a CAUTION section, and the command should explicitly log the table
    > > > > drop.
    > > >
    > > > Yeah we discussed this behavior and the conclusion was we would
    > > > document this behavior and its user's responsibility to take necessary
    > > > backup of the conflict log table data if they are setting a new log
    > > > table or NONE for the subscription.
    > > >
    > >
    > > +1. If we don't do this then it will be difficult to track for
    > > postgres or users the previous conflict history tables.
    >
    > Right, it makes sense.
    >
    > Attached patch fixed most of the open comments
    > 1) \dRs+ now show the schema qualified name
    > 2) Now key_tuple and replica_identify tuple both are add in conflict
    > log tuple wherever applicable
    > 3) Refactored the code so that we can define the conflict log table
    > schema only once in the header file and both create_conflict_log_table
    > and ValidateConflictLogTable use it.
    >
    > I was considering the interdependence between the subscription and the
    > conflict log table (CLT). IMHO, it would be logical to establish the
    > subscription as dependent on the CLT. This way, if someone attempts to
    > drop the CLT, the system would recognize the dependency of the
    > subscription and prevent the drop unless the subscription is removed
    > first or the CASCADE option is used.
    >
    > However, while investigating this, I encountered an error [1] stating
    > that global objects are not supported in this context. This indicates
    > that global objects cannot be made dependent on local objects.
    > Although making an object dependent on global/shared objects is
    > possible for certain types of shared objects [2], this is not our main
    > objective.
    >
    > We do not need to make the CLT dependent on the subscription because
    > the table can be dropped when the subscription is dropped anyway and
    > we are already doing it as part of drop subscription as well as alter
    > subscription when CLT is set to NONE or a different table. Therefore,
    > extending the functionality of shared dependency is unnecessary for
    > this purpose.
    
    I noticed an inconsistency in the checks that prevent adding a
    conflict log table to a publication.  At creation time, we explicitly
    reject attempts to publish a conflict log table:
    /* Can't be conflict log table */
    if (IsConflictLogTable(RelationGetRelid(targetrel)))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("cannot add relation \"%s.%s\" to publication",
                        get_namespace_name(RelationGetNamespace(targetrel)),
                        RelationGetRelationName(targetrel)),
                 errdetail("This operation is not supported for conflict
    log tables.")));
    
    However, the restriction can be bypassed through a sequence of table
    renames like below:
    -- Set up logical replication
    CREATE PUBLICATION pub_all;
    CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_all  WITH
    (conflict_log_table = 'conflict');
    
    -- Rename the conflict log table
    ALTER TABLE conflict RENAME TO conflict1;
    
    -- Now this succeeds:
    CREATE PUBLICATION pub1 FOR TABLE conflict1;
    
    -- Rename it back
    ALTER TABLE conflict1 RENAME TO conflict;
    
    \dRp+ pub1
      Publication pub1
      ...
      Tables:
          public.conflict
    
    Thus, although we prohibit publishing the conflict log table directly,
    a publication can still end up referencing it through renaming. This
    is inconsistent with the invariant the code attempts to enforce.
    
    Should we extend the checks to handle renames so that a conflict log
    table can never end up in a publication?
    Alternatively, should the creation-time restriction be relaxed if this
    case is acceptable?
    If the invariant should be enforced, should we also prevent renaming a
    conflict-log table into a published table's name?
    
    Thoughts?
    
    Regards,
    Vignesh