Thread

  1. Re: Proposal: Conflict log history table for Logical Replication

    shveta malik <shveta.malik@gmail.com> — 2026-05-12T06:29:58Z

    On Tue, May 12, 2026 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
    >
    > On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote:
    > >
    > > Few comments on 001:
    > > 3)
    > > Currently the structure of CLT is:
    > >
    > > +const ConflictLogColumnDef ConflictLogSchema[] = {
    > > + { .attname = "relid",            .atttypid = OIDOID },
    > > + { .attname = "schemaname",       .atttypid = TEXTOID },
    > > + { .attname = "relname",          .atttypid = TEXTOID },
    > > + { .attname = "conflict_type",    .atttypid = TEXTOID },
    > > + { .attname = "remote_xid",       .atttypid = XIDOID },
    > > + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
    > > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
    > > + { .attname = "remote_origin",    .atttypid = TEXTOID },
    > > + { .attname = "replica_identity", .atttypid = JSONOID },
    > > + { .attname = "remote_tuple",     .atttypid = JSONOID },
    > > + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
    > > +};
    > >
    > > So if user has to delete a conflict from CLT after resolving it, then
    > > what is the user-friendly way to do it? IMO, it will be cumbersome
    > > (and perhaps error-prone) to write a query with remote_commit_lsn,
    > > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others)
    > > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS
    > > AS IDENTITY). This provides a simple, unique identifier so the user
    > > can easily target a single row (WHERE log_id = 105) or purge a batch
    > > of old conflicts (WHERE log_id < 1000).
    >
    > I agree with this. I could think of a few other possible approaches as well.
    > The following options seem possible to make row identification/deletion easier:
    > a) Use existing remote_commit_ts
    > ex:
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
    > '2026-05-12 10:25:46.483899+05:30';
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
    > now() - interval '100 minutes';
    > b) Use existing system column ctid
    > ex:
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
    > c) Add a dedicated identifier conflict_id column as Shveta said
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
    > d) Add a local conflict_logged_at timestamp
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
    > = '2026-05-12 10:25:46.483899+05:30';
    > DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
    > < now() - interval '100 minutes';
    >
    
    I like c and d. IMO, approach 'a' is cumbersome to write query with.
    Approach 'b' may not be known to all.
    
    I had earlier suggested a timestamp column (pt 3 at [1]) to record
    conflict-occurence time (mainly 'conflict_logged_at' column) in CLT
    but the idea was kept on hold awaiting more feedback. Now we can
    revisit this.
    
    I feel 'conflict_logged_at' could be more beneficial because, going
    forward (based on feedback), we may range-partition this table on that
    field which may form as basis of historical data purge. I also
    suggested this in [2] (see 'That said, irrespective of what we
    decide') . Such a field could be basis of purging mechanism.
    
    [1]: https://www.postgresql.org/message-id/CAJpy0uCMDqcWGepcTwFPH%2BhTDjD8b72KnbL-S%2Bd-qd7ChomOyQ%40mail.gmail.com
    [2]: https://www.postgresql.org/message-id/CAJpy0uAfRZa4axLV_e4gvVdmunb8BOVx%2BYr%3DXecECAVD0KnD%3DA%40mail.gmail.com
    
    thanks
    Shveta