Thread

  1. Re: Proposal: Conflict log history table for Logical Replication

    Nisha Moond <nisha.moond412@gmail.com> — 2026-05-22T04:51:21Z

    On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
    >
    > Rest of the comments were fixed.
    > The attached v37 version patch has the changes for the same. Also
    > Peter's comments on the documentation patch from [1] and Shveta's
    > comments from [2] are addressed in the attached patch.
    >
    
    Here are a few comments based on v37 testing:
    
    1) Should we consider using TOAST tables for tuple-data columns like
    remote_tuple and local_conflicts (the JSON columns)?
    This may be a corner case, but if the tuple data becomes too large to
    fit in a single 8KB heap page, the apply worker keeps failing while
    inserting into the CLT with errors like:
    
      ERROR: row is too big: size 19496, maximum size 8160
      LOG: background worker "logical replication apply worker" (PID 41226) exited with exit code 1
    
    I noticed that even disable_on_error=true does not disable the
    subscription in this case. We could consider optimizations such as
    deciding when TOAST tables should be created, or avoiding the error by
    trimming/capping the data size before inserting into the CLT if we
    don't want TOAST.
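
    A quick way to check whether a given CLT has a TOAST table is to look
    at pg_class. This is only a sketch: the CLT name below is an
    assumption taken from my test setup (subscription OID 16398), so
    adjust it as needed.

    ```sql
    -- reltoastrelid is 0 when the relation has no TOAST table.
    SELECT c.relname,
           c.reltoastrelid,
           c.reltoastrelid <> 0 AS has_toast
    FROM pg_class c
    WHERE c.relname = 'pg_conflict_log_for_subid_16398';
    ```

    If has_toast is false, any conflict row that does not fit in one heap
    page would be expected to hit the "row is too big" error above.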
    ~~~
    
    2) Currently, parallel apply workers do not seem to insert conflicts
    into the CLT. The parallel worker logs the conflict to the logfile and
    then exits with an error without handling CLT insertion.
    A small test to reproduce this, with a subscription on table 't1' that uses a CLT:
    -- on publisher
    ALTER SYSTEM SET logical_decoding_work_mem = '64kB';
    SELECT pg_reload_conf();
    
    -- Create a conflict scenario on subscriber: pre-insert a row that will conflict
    INSERT INTO t1 VALUES (99999, 11);
    
    -- on publisher: big transaction that hits the conflict
    BEGIN;
    INSERT INTO t1 SELECT i, i FROM generate_series(1, 10000) i;
    INSERT INTO t1 VALUES (99999, 99); -- this conflicts
    COMMIT;
    
    logfile:
    ERROR: conflict detected on relation "public.t1": conflict=insert_exists
    DETAIL: Could not apply remote change: remote row (99999, 99).
    Key already exists in unique index "t1_pkey", modified locally in
    transaction 842 at 2026-05-21 21:10:51.497681+05:30: key (a)=(99999),
    local row (99999, 42).
    ...
    ERROR: logical replication parallel apply worker exited due to error
    CONTEXT: processing remote data for replication origin "pg_16398"
    during message type "INSERT" for replication target relation
    "public.t1" in transaction 720
    logical replication parallel apply worker
    processing remote data for replication origin "pg_16398" during
    message type "STREAM COMMIT" in transaction 720, finished at
    0/01AC9758
    LOG: subscription "sub1" has been disabled because of an error
    ERROR: lost connection to the logical replication parallel apply worker
    LOG: background worker "logical replication parallel worker" (PID
    66271) exited with exit code 1
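
    One way to confirm the missing entry after running the test, assuming
    the CLT for this subscription is pg_conflict_log_for_subid_16398
    (derived from the origin name "pg_16398") and that 720 in the log
    context is the remote xid:

    ```sql
    -- Look for a CLT row for the failed streamed transaction.
    SELECT relname, conflict_type, remote_xid
    FROM pg_conflict_log_for_subid_16398
    WHERE remote_xid = 720;
    ```

    If the parallel apply worker skipped the CLT insertion, this should
    return no row for the insert_exists conflict.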
    ~~~
    
    3) I think somewhere in patch-0005, the remote_tuple and
    replica_identity columns may have been swapped.
    The replica identity key seems to be written into the remote_tuple
    column, while the remote slot row is written into replica_identity,
    for example:
    
    postgres=# select relname, conflict_type, remote_xid, remote_tuple,
    replica_identity from pg_conflict_log_for_subid_16398;
     relname |     conflict_type     | remote_xid | remote_tuple | replica_identity
    ---------+-----------------------+------------+--------------+------------------
     t1      | insert_exists         |        699 |              | {"a":3,"b":11}
     t1      | update_origin_differs |        700 | {"a":3}      | {"a":3,"b":111}
    (2 rows)
    
    --
    Thanks,
    Nisha