Thread

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T22:06:59Z
On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Rest of the comments were fixed.
> > > > The attached v37 version patch has the changes for the same. Also
> > > > Peter's comments on the documentation patch from [1] and Shveta's
> > > > comments from [2] are addressed in the attached patch.
> > > >
> > >
> > > Here are few comments based on v37 testing:
> > >
> > > 1) Should we consider using TOAST tables for tuple-data columns like
> > > remote_tuple and local_conflicts (the JSON columns)?
> > > This may be a corner case, but if the tuple data becomes too large to
> > > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > > inserting into the CLT with errors like:
> > >
> > >   ERROR: row is too big: size 19496, maximum size 8160
> > >   LOG: background worker "logical replication apply worker" (PID
> > > 41226) exited with exit code 1
> > >
> >
> > In the docs, it is mentioned: "column_value is the column value. The
> > large column values are truncated to 64 bytes." [1], so I wonder, if
> > we follow this why we need toast entries? Did you tried any case where
> > you are getting above ERROR?
>
> But in this case we are talking about the JSON column of the CLT which
> might contain a full local tuple or even multiple local tuples if a
> remote tuple conflicts with multiple local rows.  So, IMHO, we need a
> toast table. Nisha, have you already tested the scenario? If yes, can
> you share your test case?

After putting more thought, I think instead of executing a three-step
process i.e. inserting the pg_subscription tuple, creating the table
with its dependency, and then going back to update the tuple with the
new relation ID, it is much cleaner to do it linearly, i.e. we should
create the conflict log table first to get its OID, insert the
subscription tuple pre-populated with that ID, and then record the
dependency. This achieves the exact same state in a single direct
sequence without the redundant catalog update within the same command.
I agree with that code we would have to keep the record dependency
code in CreateSubscription and AlterSubscription functions, but after
putting more thought I think in thoese function we are already
recording subscription dependencies with other object so wouldn't it
be more natural to add this depednecy as well at the same place?

Anyway I am ready to change that if we have strong opinion against
this approach.

Here is the updated patch and changes are
1. 0003 and 0004 are merged on 0001
2. Merged Amit's v41_amit_1.patch.txt to 0002
3. Fix the dependency order issue (i.e. create dependency after
inserting subscription tuple) and merged in 0002

Open Items:
1. Need to create toast table for CLT after testing with larger JSON row
2. Fixed review comments of Shveta on 0004 and 0005
3.  Rebase Vignesh's patch of
"v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we
can do that once we have concensus on whether to create conflict log
table first or insert the subscription row first as based on this
change we would have to rebase this patch again.
4. Once we rebase
"v41-0007-Preserve-conflict-log-destination-and-subscripti" after
dependency order consensus I would rebase doc patch and \dRs+ change
patch of Vignesh.


-- 
Regards,
Dilip Kumar
Google