Thread

Commits


Allow logical replication conflicts to be logged to a table.
- a5918fddf10d master landed
Avoid orphaned objects dependencies
- 2fbb21170e90 19 (unreleased) cited

Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-05T12:24:01Z

Currently we log conflicts to the server's log file and updates. This
approach has limitations:
1) Difficult to query and analyze: parsing plain text log files for
conflict details is inefficient.
2) Lack of structured data: key conflict attributes (table, operation,
old/new data, LSN, etc.) are not readily available in a structured,
queryable format.
3) Difficult for external monitoring tools or custom resolution scripts
to consume conflict data directly.

This proposal aims to address these limitations by introducing a
conflict log history table, providing a structured and queryable
record of all logical replication conflicts.  Whether to log into the
conflict log history table, the server log, or both should be a
configurable option.

This proposal has two main design questions:
===================================

1. How do we store conflicting tuples from different tables?
Using a JSON column to store the row data seems like the most flexible
solution, as it can accommodate different table schemas.

2. Should this be a system table or a user table?
a) System Table: Storing this in a system catalog is simple, but
catalogs aren't designed for ever-growing data. While pg_largeobject
is an exception, this is not what we generally do IMHO.
b) User Table: This offers more flexibility. We could allow a user to
specify the table name during CREATE SUBSCRIPTION.  Then we choose to
either create the table internally or let the user create the table
with a predefined schema.

A potential drawback is that a user might drop or alter the table.
However, we could mitigate this risk by simply logging a WARNING if
the table is configured but an insertion fails.
I am currently working on a POC patch for the same, but will post that
once we have some thoughts on design choices.

The schema for the conflict log history table may look like this, although
there is room for discussion on this.

Note:  I think these fields are self explanatory so I haven't
explained them here.

CREATE TABLE conflict_log_table (
    logid               SERIAL PRIMARY KEY,
    subid               OID,
    schema_id           OID,
    table_id            OID,
    conflict_type       TEXT NOT NULL,
    operation_type      TEXT NOT NULL,
    replication_origin  TEXT,
    remote_commit_ts    TIMESTAMPTZ,
    local_commit_ts     TIMESTAMPTZ,
    ri_key              JSON,
    remote_tuple        JSON,
    local_tuple         JSON
);
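
To illustrate design question 1, here is a minimal Python sketch (not
PostgreSQL code) of how tuples from tables with different schemas could
fit the same JSON columns. The helper name make_conflict_row, the OID
values, and the sample rows are made up for illustration; the column
names follow the schema above, with logid, replication_origin, and the
commit timestamps left out for brevity.

```python
import json


def make_conflict_row(subid, schema_id, table_id, conflict_type,
                      operation_type, ri_key, remote_tuple, local_tuple):
    """Build one conflict-log row; tuple-valued columns become JSON text."""
    return {
        "subid": subid,
        "schema_id": schema_id,
        "table_id": table_id,
        "conflict_type": conflict_type,
        "operation_type": operation_type,
        # Rows of any shape serialize to the same column type.
        "ri_key": json.dumps(ri_key, sort_keys=True),
        "remote_tuple": json.dumps(remote_tuple, sort_keys=True),
        "local_tuple": json.dumps(local_tuple, sort_keys=True),
    }


# Two tables with unrelated column sets share one log layout.
# All OIDs and values below are hypothetical.
row_a = make_conflict_row(16384, 2200, 16390, "insert_exists", "INSERT",
                          ri_key={"id": 1},
                          remote_tuple={"id": 1, "name": "alice"},
                          local_tuple={"id": 1, "name": "bob"})
row_b = make_conflict_row(16384, 2200, 16401, "update_missing", "UPDATE",
                          ri_key={"sku": "A-7", "region": "eu"},
                          remote_tuple={"sku": "A-7", "region": "eu", "qty": 3},
                          local_tuple=None)  # no local row was found

print(row_a["remote_tuple"])
print(row_b["local_tuple"])
```

The point of the sketch is only that the log table's layout does not
depend on the layout of the tables whose conflicts it records.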

Credit:  Thanks to Amit Kapila for discussing this offlist and
providing some valuable suggestions.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-08-07T06:55:04Z

On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Currently we log conflicts to the server's log file and updates, this
> approach has limitations, 1) Difficult to query and analyze, parsing
> plain text log files for conflict details is inefficient. 2) Lack of
> structured data, key conflict attributes (table, operation, old/new
> data, LSN, etc.) are not readily available in a structured, queryable
> format. 3) Difficult for external monitoring tools or custom
> resolution scripts to consume conflict data directly.
>
> This proposal aims to address these limitations by introducing a
> conflict log history table, providing a structured, and queryable
> record of all logical replication conflicts.  This should be a
> configurable option whether to log into the conflict log history
> table, server logs or both.
>

+1 for the idea.

> This proposal has two main design questions:
> ===================================
>
> 1. How do we store conflicting tuples from different tables?
> Using a JSON column to store the row data seems like the most flexible
> solution, as it can accommodate different table schemas.

Yes, that is one option. I have not looked into details myself, but
you can also explore 'anyarray' used in pg_statistic to store 'Column
data values of the appropriate kind'.

> 2. Should this be a system table or a user table?
> a) System Table: Storing this in a system catalog is simple, but
> catalogs aren't designed for ever-growing data. While pg_largeobject
> is an exception, this is not what we generally do IMHO.
> b) User Table: This offers more flexibility. We could allow a user to
> specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> either create the table internally or let the user create the table
> with a predefined schema.
>
> A potential drawback is that a user might drop or alter the table.
> However, we could mitigate this risk by simply logging a WARNING if
> the table is configured but an insertion fails.

I believe it makes more sense for this to be a catalog table rather
than a user table. I wanted to check if we already have a large
catalog table of this kind, and I think pg_statistic could be an
example of a sizable catalog table. To get a rough idea of how size
scales with data, I ran a quick experiment: I created 1000 tables,
each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
I inserted 1000 rows into each table and ran ANALYZE to collect
statistics. Here’s what I observed on a fresh database before and
after:

Before:
pg_statistic row count: 412
Table size: ~256 kB

After:
pg_statistic row count: 6,412
Table size: ~5.3 MB

Although it isn’t an exact comparison, this gives us some insight into
how the statistics catalog table size grows with the number of rows.
It doesn’t seem excessively large with 6k rows, given the fact that
pg_statistic itself is a complex table having many 'anyarray'-type
columns.
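
As a back-of-envelope check on the figures above, the average growth
per added pg_statistic row can be worked out as below. This is only a
rough sketch: the reported sizes are rounded, and the result says
nothing about how wide a conflict log row would be.

```python
KB_PER_MB = 1024

rows_before, size_before_kb = 412, 256
rows_after, size_after_kb = 6412, 5.3 * KB_PER_MB

added_rows = rows_after - rows_before
added_kb = size_after_kb - size_before_kb

# Average on-disk growth per new row, in bytes.
bytes_per_row = added_kb * 1024 / added_rows
print(f"{added_rows} new rows, about {bytes_per_row:.0f} bytes per row")
```

That works out to a bit under 1 kB per row on average.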

That said, irrespective of what we decide, it would be ideal to offer
users an option for automatic purging, perhaps via a retention period
parameter like conflict_stats_retention_period (say default to 30
days), or a manual purge API such as purge_conflict_stats('older than
date'). I wasn’t able to find any such purge mechanism for PostgreSQL
stats tables, but Oracle does provide such purging options for some of
their statistics tables (not related to conflicts), see [1], [2].
And to manage it better, it could be range partitioned on timestamp.
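
A rough Python sketch (not PostgreSQL code) of how a retention period
and timestamp range partitioning could work together: if the table is
partitioned by day, purging means dropping whole partitions that fall
outside the retention window. The function name partitions_to_drop and
the one-partition-per-day layout are assumptions made for illustration.

```python
from datetime import date, timedelta


def partitions_to_drop(partition_days, today, retention_days=30):
    """Return daily partitions whose rows are all older than the cutoff.

    A partition for day d holds rows with timestamps in [d, d + 1 day),
    so it is entirely expired exactly when d is before the cutoff date.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_days if d < cutoff)


partition_days = [date(2025, 6, 30), date(2025, 7, 8),
                  date(2025, 7, 9), date(2025, 8, 7)]
expired = partitions_to_drop(partition_days, today=date(2025, 8, 7))
print(expired)
```

Dropping a partition avoids the row-by-row DELETE and the bloat that a
manual purge of one large table would leave behind.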

> I am currently working on a POC patch for the same, but will post that
> once we have some thoughts on design choices.
>
> Schema for the conflict log history table may look like this, although
> there is a room for discussion on this.
>
> Note:  I think these fields are self explanatory so I haven't
> explained them here.
>
> conflict_log_table (
>     logid  SERIAL PRIMARY KEY,
>     subid                OID,
>     schema_id          OID,
>     table_id            OID,
>     conflict_type        TEXT NOT NULL,
>     operation_type       TEXT NOT NULL,

I feel operation_type is not needed when we already have
conflict_type. The name of 'conflict_type' is enough to give us info
on operation-type.

>     replication_origin   TEXT,
>     remote_commit_ts TIMESTAMPTZ,
>     local_commit_ts TIMESTAMPTZ,
>     ri_key                    JSON,
>     remote_tuple         JSON,
>     local_tuple          JSON,
> );
>
> Credit:  Thanks to Amit Kapila for discussing this offlist and
> providing some valuable suggestions.
>

[1]
https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_STATS.html#GUID-8E6413D5-F827-4F57-9FAD-7EC56362A98C

[2]
https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_STATS.html#GUID-A04AE1C0-5DE1-4AFC-91F8-D35D41DF98A2

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-08-07T08:13:40Z

On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Currently we log conflicts to the server's log file and updates, this
> > approach has limitations, 1) Difficult to query and analyze, parsing
> > plain text log files for conflict details is inefficient. 2) Lack of
> > structured data, key conflict attributes (table, operation, old/new
> > data, LSN, etc.) are not readily available in a structured, queryable
> > format. 3) Difficult for external monitoring tools or custom
> > resolution scripts to consume conflict data directly.
> >
> > This proposal aims to address these limitations by introducing a
> > conflict log history table, providing a structured, and queryable
> > record of all logical replication conflicts.  This should be a
> > configurable option whether to log into the conflict log history
> > table, server logs or both.
> >
>
> +1 for the idea.
>
> > This proposal has two main design questions:
> > ===================================
> >
> > 1. How do we store conflicting tuples from different tables?
> > Using a JSON column to store the row data seems like the most flexible
> > solution, as it can accommodate different table schemas.
>
> Yes, that is one option. I have not looked into details myself, but
> you can also explore 'anyarray' used in pg_statistic to store 'Column
> data values of the appropriate kind'.
>
> > 2. Should this be a system table or a user table?
> > a) System Table: Storing this in a system catalog is simple, but
> > catalogs aren't designed for ever-growing data. While pg_largeobject
> > is an exception, this is not what we generally do IMHO.
> > b) User Table: This offers more flexibility. We could allow a user to
> > specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> > either create the table internally or let the user create the table
> > with a predefined schema.
> >
> > A potential drawback is that a user might drop or alter the table.
> > However, we could mitigate this risk by simply logging a WARNING if
> > the table is configured but an insertion fails.
>
> I believe it makes more sense for this to be a catalog table rather
> than a user table. I wanted to check if we already have a large
> catalog table of this kind, and I think pg_statistic could be an
> example of a sizable catalog table. To get a rough idea of how size
> scales with data, I ran a quick experiment: I created 1000 tables,
> each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> I inserted 1000 rows into each table and ran ANALYZE to collect
> statistics. Here’s what I observed on a fresh database before and
> after:
>
> Before:
> pg_statistic row count: 412
> Table size: ~256 kB
>
> After:
> pg_statistic row count: 6,412
> Table size: ~5.3 MB
>
> Although it isn’t an exact comparison, this gives us some insight into
> how the statistics catalog table size grows with the number of rows.
> It doesn’t seem excessively large with 6k rows, given the fact that
> pg_statistic itself is a complex table having many 'anyarray'-type
> columns.
>
> That said, irrespective of what we decide, it would be ideal to offer
> users an option for automatic purging, perhaps via a retention period
> parameter like conflict_stats_retention_period (say default to 30
> days), or a manual purge API such as purge_conflict_stats('older than
> date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> stats tables, but Oracle does provide such purging options for some of
> their statistics tables (not related to conflicts), see [1], [2].
> And to manage it better, it could be range partitioned on timestamp.
>

It seems BDR also has one such conflict-log table which is a catalog
table and is also partitioned on time. It has a default retention
period of 30 days. See 'bdr.conflict_history' mentioned under
'catalogs' in [1]

[1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-07T09:38:20Z

On Thu, Aug 7, 2025 at 1:43 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:

Thanks Shveta for your opinion on the design.

> > On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >

> > > This proposal aims to address these limitations by introducing a
> > > conflict log history table, providing a structured, and queryable
> > > record of all logical replication conflicts.  This should be a
> > > configurable option whether to log into the conflict log history
> > > table, server logs or both.
> > >
> >
> > +1 for the idea.

Thanks

> >
> > > This proposal has two main design questions:
> > > ===================================
> > >
> > > 1. How do we store conflicting tuples from different tables?
> > > Using a JSON column to store the row data seems like the most flexible
> > > solution, as it can accommodate different table schemas.
> >
> > Yes, that is one option. I have not looked into details myself, but
> > you can also explore 'anyarray' used in pg_statistic to store 'Column
> > data values of the appropriate kind'.

I think conversion from row to JSON and JSON to row is convenient, and
other extensions like pgactive/bdr also provide this as JSON.  But we
can explore this alternative option as well, thanks.

> > > 2. Should this be a system table or a user table?
> > > a) System Table: Storing this in a system catalog is simple, but
> > > catalogs aren't designed for ever-growing data. While pg_largeobject
> > > is an exception, this is not what we generally do IMHO.
> > > b) User Table: This offers more flexibility. We could allow a user to
> > > specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> > > either create the table internally or let the user create the table
> > > with a predefined schema.
> > >
> > > A potential drawback is that a user might drop or alter the table.
> > > However, we could mitigate this risk by simply logging a WARNING if
> > > the table is configured but an insertion fails.
> >
> > I believe it makes more sense for this to be a catalog table rather
> > than a user table. I wanted to check if we already have a large
> > catalog table of this kind, and I think pg_statistic could be an
> > example of a sizable catalog table. To get a rough idea of how size
> > scales with data, I ran a quick experiment: I created 1000 tables,
> > each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> > I inserted 1000 rows into each table and ran ANALYZE to collect
> > statistics. Here’s what I observed on a fresh database before and
> > after:
> >
> > Before:
> > pg_statistic row count: 412
> > Table size: ~256 kB
> >
> > After:
> > pg_statistic row count: 6,412
> > Table size: ~5.3 MB
> >
> > Although it isn’t an exact comparison, this gives us some insight into
> > how the statistics catalog table size grows with the number of rows.
> > It doesn’t seem excessively large with 6k rows, given the fact that
> > pg_statistic itself is a complex table having many 'anyarray'-type
> > columns.

Yeah, that's good analysis. Apart from this, pg_largeobject is also a
catalog which grows with each large object, and its growth rate
will be very high because it stores large object data in the catalog.

> >
> > That said, irrespective of what we decide, it would be ideal to offer
> > users an option for automatic purging, perhaps via a retention period
> > parameter like conflict_stats_retention_period (say default to 30
> > days), or a manual purge API such as purge_conflict_stats('older than
> > date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> > stats tables, but Oracle does provide such purging options for some of
> > their statistics tables (not related to conflicts), see [1], [2].
> > And to manage it better, it could be range partitioned on timestamp.

Yeah, timestamp-based partitioning for purging is an interesting
suggestion.

> It seems BDR also has one such conflict-log table which is a catalog
> table and is also partitioned on time. It has a default retention
> period of 30 days. See 'bdr.conflict_history' mentioned under
> 'catalogs' in [1]
>
> [1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views

Actually bdr is an extension and this table is under the extension's
namespace (bdr.conflict_history), so this is not really a catalog but
an extension-managed table.  So logically for PostgreSQL it's a
user table, but yeah, it is created and managed by the extension.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-08-08T03:28:21Z

On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 1:43 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks Shveta for your opinion on the design.
>
> > > On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
>
> > > > This proposal aims to address these limitations by introducing a
> > > > conflict log history table, providing a structured, and queryable
> > > > record of all logical replication conflicts.  This should be a
> > > > configurable option whether to log into the conflict log history
> > > > table, server logs or both.
> > > >
> > >
> > > +1 for the idea.
>
> Thanks
>
> > >
> > > > This proposal has two main design questions:
> > > > ===================================
> > > >
> > > > 1. How do we store conflicting tuples from different tables?
> > > > Using a JSON column to store the row data seems like the most flexible
> > > > solution, as it can accommodate different table schemas.
> > >
> > > Yes, that is one option. I have not looked into details myself, but
> > > you can also explore 'anyarray' used in pg_statistic to store 'Column
> > > data values of the appropriate kind'.
>
> I think conversion from row to json and json to row is convenient and
> also other extensions like pgactive/bdr also provide as JSON.

Okay. Agreed.

> But we
> can explore this alternative options as well, thanks
>
> > > > 2. Should this be a system table or a user table?
> > > > a) System Table: Storing this in a system catalog is simple, but
> > > > catalogs aren't designed for ever-growing data. While pg_largeobject
> > > > is an exception, this is not what we generally do IMHO.
> > > > b) User Table: This offers more flexibility. We could allow a user to
> > > > specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> > > > either create the table internally or let the user create the table
> > > > with a predefined schema.
> > > >
> > > > A potential drawback is that a user might drop or alter the table.
> > > > However, we could mitigate this risk by simply logging a WARNING if
> > > > the table is configured but an insertion fails.
> > >
> > > I believe it makes more sense for this to be a catalog table rather
> > > than a user table. I wanted to check if we already have a large
> > > catalog table of this kind, and I think pg_statistic could be an
> > > example of a sizable catalog table. To get a rough idea of how size
> > > scales with data, I ran a quick experiment: I created 1000 tables,
> > > each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> > > I inserted 1000 rows into each table and ran ANALYZE to collect
> > > statistics. Here’s what I observed on a fresh database before and
> > > after:
> > >
> > > Before:
> > > pg_statistic row count: 412
> > > Table size: ~256 kB
> > >
> > > After:
> > > pg_statistic row count: 6,412
> > > Table size: ~5.3 MB
> > >
> > > Although it isn’t an exact comparison, this gives us some insight into
> > > how the statistics catalog table size grows with the number of rows.
> > > It doesn’t seem excessively large with 6k rows, given the fact that
> > > pg_statistic itself is a complex table having many 'anyarray'-type
> > > columns.
>
> Yeah that's good analysis, apart from this pg_largeobject is also a
> catalog which grows with each large object and growth rate for that
> will be very high because it stores large object data in catalog.
>
> > >
> > > That said, irrespective of what we decide, it would be ideal to offer
> > > users an option for automatic purging, perhaps via a retention period
> > > parameter like conflict_stats_retention_period (say default to 30
> > > days), or a manual purge API such as purge_conflict_stats('older than
> > > date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> > > stats tables, but Oracle does provide such purging options for some of
> > > their statistics tables (not related to conflicts), see [1], [2].
> > > And to manage it better, it could be range partitioned on timestamp.
>
> Yeah that's an interesting suggestion to timestamp based partitioning
> it for purging.
>
> > It seems BDR also has one such conflict-log table which is a catalog
> > table and is also partitioned on time. It has a default retention
> > period of 30 days. See 'bdr.conflict_history' mentioned under
> > 'catalogs' in [1]
> >
> > [1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views
>
> Actually bdr is an extension and this table is under extension
> namespace (bdr.conflict_history) so this is not really a catalog but
> its a extension managed table.

Yes, right. Sorry for the confusion.

> So logically for PostgreSQL its an
> user table but yeah this is created and managed by the extension.
>

Any idea if the user can alter/drop or perform any DML on it? I could
not find any details on this part.

> --
> Regards,
> Dilip Kumar
> Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-08T04:31:03Z

On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > So logically for PostgreSQL its an
> > user table but yeah this is created and managed by the extension.
> >
>
> Any idea if the user can alter/drop or perform any DML on it? I could
> not find any details on this part.

In my experience, for such extension managed tables where we want them
to behave like a catalog, users are generally granted only SELECT
permission.  So although it is not a catalog, in terms of accessibility
for non-admin users it is like a catalog.  IMHO, even if we
choose to create a user table for conflict log history we can also
control the permissions similarly.  What's your opinion on this?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-08-08T08:42:33Z

On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > So logically for PostgreSQL its an
> > > user table but yeah this is created and managed by the extension.
> > >
> >
> > Any idea if the user can alter/drop or perform any DML on it? I could
> > not find any details on this part.
>
> In my experience, for such extension managed tables where we want them
> to behave like catalog, generally users are just granted with SELECT
> permission.  So although it is not a catalog but for accessibility
> wise for non admin users it is like a catalog.  IMHO, even if we
> choose to create a user table for conflict log history we can also
> control the permissions similarly.
>

Yes, it can be done. Technically there is nothing preventing us from
doing it. But in my experience, I have never seen a
system-maintained statistics table be a user table rather than a
catalog table. Extensions are a different case; they typically manage
their own tables, which are not part of the system catalog. But if any
such stats related functionality is part of the core database, it
generally makes more sense to implement it as a catalog table
(provided there are no major obstacles to doing so). But I am curious
to know what others think here.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-08-13T10:08:55Z

On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > So logically for PostgreSQL its an
> > > user table but yeah this is created and managed by the extension.
> > >
> >
> > Any idea if the user can alter/drop or perform any DML on it? I could
> > not find any details on this part.
>
> In my experience, for such extension managed tables where we want them
> to behave like catalog, generally users are just granted with SELECT
> permission.  So although it is not a catalog but for accessibility
> wise for non admin users it is like a catalog.  IMHO, even if we
> choose to create a user table for conflict log history we can also
> control the permissions similarly.  What's your opinion on this?
>

Yes, I think it is important to control permissions on this table even
if it is a user table. How about giving SELECT, DELETE, TRUNCATE
permissions to subscription owner assuming we create one such table
per subscription?

It should be a user table for the following reasons: (a) It is an ever
growing table by definition and we need some level of user control to
manage it (like removing the old data); (b) We may want some sort of
partitioning strategy to manage it; even though we decide to do it
ourselves now, in future we should allow the user to specify it as well;
(c) We may also want the user to specify what exact information she wants
to get stored, considering that in future we want resolutions to also be
stored in it. See a somewhat similar proposal to store errors during
copy by Tom [1]; (d) In a near-by thread, we are discussing storing
errors during copy in a user table [2] and we have some similarity with
that proposal as well.

If we agree on this then the next thing to consider is whether we
allow users to create such a table or do it ourselves. In the long
term, we may want both but for simplicity, we can auto-create
ourselves during CREATE SUBSCRIPTION with some option. BTW, if we
decide to let user create it then we can consider the idea of TYPED
tables as discussed in emails [3][4].

For user tables, we need to consider how to avoid replicating these
tables for publications that use FOR ALL TABLES specifier. One idea is
to use EXCLUDE table functionality as being discussed in thread [5]
but that would also be a bit tricky especially if we decide to create
such a table automatically. One naive idea is that internally we skip
sending changes from this table for "FOR ALL TABLES" publication, and
we shouldn't allow creating publication for this table. OTOH, if we
allow the user to create and specify this table, we can ask her to
specify it with EXCLUDE syntax in the publication. This needs more thought.
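
The naive idea can be sketched in Python (not PostgreSQL code) as two
checks: leave conflict log tables out when working out what a FOR ALL
TABLES publication sends, and refuse to add such a table to a
publication explicitly. The function and table names here are
hypothetical and only illustrate the rule.

```python
def effective_tables(all_tables, conflict_log_tables):
    """Tables a FOR ALL TABLES publication would actually send changes for."""
    hidden = set(conflict_log_tables)
    return [t for t in all_tables if t not in hidden]


def check_can_publish(table, conflict_log_tables):
    """Reject adding a conflict log table to a publication explicitly."""
    if table in conflict_log_tables:
        raise ValueError(
            f"{table} is a conflict log table and cannot be published")


all_tables = ["public.orders", "public.customers", "public.sub1_conflicts"]
conflict_log_tables = {"public.sub1_conflicts"}

published = effective_tables(all_tables, conflict_log_tables)
print(published)
```

Either check needs a cheap way to tell that a given table is a
conflict log table, which is the part that is tricky when the table is
created automatically.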

[1] - https://www.postgresql.org/message-id/flat/752672.1699474336%40sss.pgh.pa.us#b8450be5645c4252d7d02cf7aca1fc7b
[2] - https://www.postgresql.org/message-id/CACJufxH_OJpVra%3D0c4ow8fbxHj7heMcVaTNEPa5vAurSeNA-6Q%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/28c420cf-f25d-44f1-89fd-04ef0b2dd3db%40dunslane.net
[4] - https://www.postgresql.org/message-id/CADrsxdYG%2B%2BK%3DiKjRm35u03q-Nb0tQPJaqjxnA2mGt5O%3DDht7sw%40mail.gmail.com
[5] - https://www.postgresql.org/message-id/CANhcyEW%2BuJB_bvQLEaZCgoRTc1%3Di%2BQnrPPHxZ2%3D0SBSCyj9pkg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Alastair Turner <minion@decodable.me> — 2025-08-14T10:56:26Z

On Wed, 13 Aug 2025 at 11:09, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > So logically for PostgreSQL its an
> > > > user table but yeah this is created and managed by the extension.
> > > >
> > >
> > > Any idea if the user can alter/drop or perform any DML on it? I could
> > > not find any details on this part.
> >
> > In my experience, for such extension managed tables where we want them
> > to behave like catalog, generally users are just granted with SELECT
> > permission.  So although it is not a catalog but for accessibility
> > wise for non admin users it is like a catalog.  IMHO, even if we
> > choose to create a user table for conflict log history we can also
> > control the permissions similarly.  What's your opinion on this?
> >
>
> Yes, I think it is important to control permissions on this table even
> if it is a user table. How about giving SELECT, DELETE, TRUNCATE
> permissions to subscription owner assuming we create one such table
> per subscription?
>
> It should be a user table due to following reasons (a) It is an ever
> growing table by definition and we need some level of user control to
> manage it (like remove the old data); (b) We may want some sort of
> partitioning strategy to manage it, even though, we decide to do it
> ourselves now but in future, we should allow user to also specify it;
> (c) We may also want user to specify what exact information she wants
> to get stored considering in future we want resolutions to also be
> stored in it. See a somewhat similar proposal to store errors during
> copy by Tom [1]; (d) In a near-by thread, we are discussing storing
> errors during copy in user table [2] and we have some similarity with
> that proposal as well.
>
> If we agree on this then the next thing to consider is whether we
> allow users to create such a table or do it ourselves. In the long
> term, we may want both but for simplicity, we can auto-create
> ourselves during CREATE SUBSCRIPTION with some option. BTW, if we
> decide to let user create it then we can consider the idea of TYPED
> tables as discussed in emails [3][4].
>

Having it be a user table, and specifying the table per subscription sounds
good. This is very similar to how the load error tables for CloudBerry
behave, for instance. To have both options for table creation, CREATE ...
IF NOT EXISTS semantics work well - if the option on CREATE SUBSCRIPTION
specifies an existing table of the right type use it, or create one with
the name supplied. This would also give the user control over whether to
have one table per subscription, one central table or anything in between.
Rather than constraining permissions on the table, the CREATE SUBSCRIPTION
command could create a dependency relationship between the table and the
subscription. This would prevent removal of the table, even by a superuser.
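
A toy Python model (not PostgreSQL code) of these semantics: reuse the
named table if it exists with the expected layout, create it
otherwise, and record a dependency that blocks dropping it. The
Catalog class and the function names are invented for illustration,
and the column list is taken from the schema proposed at the start of
the thread.

```python
EXPECTED_COLUMNS = (
    "logid", "subid", "schema_id", "table_id", "conflict_type",
    "operation_type", "replication_origin", "remote_commit_ts",
    "local_commit_ts", "ri_key", "remote_tuple", "local_tuple",
)


class Catalog:
    """Toy stand-in: tables plus table -> subscription dependencies."""

    def __init__(self):
        self.tables = {}      # table name -> tuple of column names
        self.dependents = {}  # table name -> set of subscription names


def attach_conflict_log_table(catalog, subscription, table_name):
    """Use table_name if it exists with the expected layout, else create it."""
    columns = catalog.tables.get(table_name)
    if columns is None:
        catalog.tables[table_name] = EXPECTED_COLUMNS
    elif columns != EXPECTED_COLUMNS:
        raise ValueError(f"{table_name} exists but has an unexpected layout")
    # Record the dependency so the table cannot be dropped under the subscription.
    catalog.dependents.setdefault(table_name, set()).add(subscription)


def drop_table(catalog, table_name):
    """Refuse to drop a table that a subscription still depends on."""
    if catalog.dependents.get(table_name):
        raise RuntimeError(f"{table_name} is still used by a subscription")
    del catalog.tables[table_name]


catalog = Catalog()
attach_conflict_log_table(catalog, "sub1", "central_conflicts")  # created
attach_conflict_log_table(catalog, "sub2", "central_conflicts")  # reused
attach_conflict_log_table(catalog, "sub3", "sub3_conflicts")     # its own table
```

Because the table name is just an option value, the same mechanism
covers one table per subscription, one central table, or a mix.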


> For user tables, we need to consider how to avoid replicating these
> tables for publications that use FOR ALL TABLES specifier. One idea is
> to use EXCLUDE table functionality as being discussed in thread [5]
> but that would also be a bit tricky especially if we decide to create
> such a table automatically. One naive idea is that internally we skip
> sending changes from this table for "FOR ALL TABLES" publication, and
> we shouldn't allow creating publication for this table. OTOH, if we
> allow the user to create and specify this table, we can ask her to
> specify with EXCLUDE syntax in publication. This needs more thoughts.
>

If a dependency relationship is established between the error table and the
subscription, could this be used as a basis for filtering the error tables
from FOR ALL TABLES subscriptions?

Regards

Alastair

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-08-15T06:52:32Z

On Thu, Aug 14, 2025 at 4:26 PM Alastair Turner <minion@decodable.me> wrote:
>
> On Wed, 13 Aug 2025 at 11:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> >
>> > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
>> > >
>> > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> > > >
>> > > > So logically for PostgreSQL its an
>> > > > user table but yeah this is created and managed by the extension.
>> > > >
>> > >
>> > > Any idea if the user can alter/drop or perform any DML on it? I could
>> > > not find any details on this part.
>> >
>> > In my experience, for such extension managed tables where we want them
>> > to behave like catalog, generally users are just granted with SELECT
>> > permission.  So although it is not a catalog but for accessibility
>> > wise for non admin users it is like a catalog.  IMHO, even if we
>> > choose to create a user table for conflict log history we can also
>> > control the permissions similarly.  What's your opinion on this?
>> >
>>
>> Yes, I think it is important to control permissions on this table even
>> if it is a user table. How about giving SELECT, DELETE, TRUNCATE
>> permissions to subscription owner assuming we create one such table
>> per subscription?
>>
>> It should be a user table due to following reasons (a) It is an ever
>> growing table by definition and we need some level of user control to
>> manage it (like remove the old data); (b) We may want some sort of
>> partitioning streategy to manage it, even though, we decide to do it
>> ourselves now but in future, we should allow user to also specify it;
>> (c) We may also want user to specify what exact information she wants
>> to get stored considering in future we want resolutions to also be
>> stored in it. See a somewhat similar proposal to store errors during
>> copy by Tom [1]; (d) In a near-by thread, we are discussing storing
>> errors during copy in user table [2] and we have some similarity with
>> that proposal as well.
>>
>> If we agree on this then the next thing to consider is whether we
>> allow users to create such a table or do it ourselves. In the long
>> term, we may want both but for simplicity, we can auto-create
>> ourselves during CREATE SUBSCRIPTION with some option. BTW, if we
>> decide to let user create it then we can consider the idea of TYPED
>> tables as discussed in emails [3][4].
>
>
> Having it be a user table, and specifying the table per subscription sounds good. This is very similar to how the load error tables for CloudBerry behave, for instance. To have both options for table creation, CREATE ... IF NOT EXISTS semantics work well - if the option on CREATE SUBSCRIPTION specifies an existing table of the right type use it, or create one with the name supplied. This would also give the user control over whether to have one table per subscription, one central table or anything in between.
>

Sounds reasonable. I think in the first version we can let such a table
be created automatically via some option(s) on the subscription. Then,
in subsequent versions, we can extend the functionality to allow
existing tables.

>
> Rather than constraining permissions on the table, the CREATE SUBSCRIPTION command could create a dependency relationship between the table and the subscription.This would prevent removal of the table, even by a superuser.
>

Okay, that makes sense. But, we still probably want to disallow users
from inserting or updating rows in the conflict table.
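
To make the permission model concrete, here is a minimal sketch in plain
SQL. The table name sub1_conflict_log and the role sub_owner are
hypothetical, and this is not what any patch does internally; it only
shows which privileges would be granted and which withheld.

```sql
-- Hypothetical names: sub1_conflict_log (conflict table), sub_owner (role).
-- Assumes the table is owned by a role other than sub_owner.
REVOKE ALL ON TABLE sub1_conflict_log FROM PUBLIC;
REVOKE INSERT, UPDATE ON TABLE sub1_conflict_log FROM sub_owner;

-- Read access plus the ability to prune old conflict rows.
GRANT SELECT, DELETE, TRUNCATE ON TABLE sub1_conflict_log TO sub_owner;
```

Note that this only helps if sub_owner does not own the table: an owner
can always grant the withheld privileges back to itself.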

>>
>> For user tables, we need to consider how to avoid replicating these
>> tables for publications that use FOR ALL TABLES specifier. One idea is
>> to use EXCLUDE table functionality as being discussed in thread [5]
>> but that would also be a bit tricky especially if we decide to create
>> such a table automatically. One naive idea is that internally we skip
>> sending changes from this table for "FOR ALL TABLES" publication, and
>> we shouldn't allow creating publication for this table. OTOH, if we
>> allow the user to create and specify this table, we can ask her to
>> specify with EXCLUDE syntax in publication. This needs more thoughts.
>
>
> If a dependency relationship is established between the error table and the subscription, could this be used as a basis for filtering the error tables from FOR ALL TABLES subscriptions?
>

Yeah, that is worth considering.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-15T09:00:48Z

On Wed, Aug 13, 2025 at 3:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > So logically for PostgreSQL its an
> > > > user table but yeah this is created and managed by the extension.
> > > >
> > >
> > > Any idea if the user can alter/drop or perform any DML on it? I could
> > > not find any details on this part.
> >
> > In my experience, for such extension managed tables where we want them
> > to behave like catalog, generally users are just granted with SELECT
> > permission.  So although it is not a catalog but for accessibility
> > wise for non admin users it is like a catalog.  IMHO, even if we
> > choose to create a user table for conflict log history we can also
> > control the permissions similarly.  What's your opinion on this?
> >
>
> Yes, I think it is important to control permissions on this table even
> if it is a user table. How about giving SELECT, DELETE, TRUNCATE
> permissions to subscription owner assuming we create one such table
> per subscription?

Right, we need to control the permissions.  I am not sure whether we
want a per-subscription table or a common one. Earlier I was thinking
of a single table, but I think per subscription is not a bad idea,
especially for managing the permissions.  And there cannot be such a
huge number of subscriptions that we need to worry about creating many
conflict log history tables; besides, we will only create such tables
when users pass that subscription option.


> It should be a user table due to following reasons (a) It is an ever
> growing table by definition and we need some level of user control to
> manage it (like remove the old data); (b) We may want some sort of
> partitioning streategy to manage it, even though, we decide to do it
> ourselves now but in future, we should allow user to also specify it;

Maybe we can partition by range on date (when the entry is inserted).
That way it would be easy for users to get rid of older partitions.
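
As a rough illustration of that idea, the sketch below uses declarative
range partitioning on an insertion timestamp. The table name, the column
names and the monthly granularity are all assumptions made for the
example, not a schema from any patch.

```sql
-- Hypothetical table and column names.
CREATE TABLE conflict_log_history (
    logged_at    timestamptz NOT NULL DEFAULT now(),
    conflicttype text,
    remote_tuple json
) PARTITION BY RANGE (logged_at);

-- One partition per month; the upper bound is exclusive.
CREATE TABLE conflict_log_history_2025_08
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-08-01') TO ('2025-09-01');

CREATE TABLE conflict_log_history_2025_09
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-09-01') TO ('2025-10-01');

-- Removing a month of history drops one partition instead of
-- running a bulk DELETE against the whole table.
DROP TABLE conflict_log_history_2025_08;
```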

> (c) We may also want user to specify what exact information she wants
> to get stored considering in future we want resolutions to also be
> stored in it. See a somewhat similar proposal to store errors during
> copy by Tom [1]; (d) In a near-by thread, we are discussing storing
> errors during copy in user table [2] and we have some similarity with
> that proposal as well.

Right, we may consider that as well.

> If we agree on this then the next thing to consider is whether we
> allow users to create such a table or do it ourselves. In the long
> term, we may want both but for simplicity, we can auto-create
> ourselves during CREATE SUBSCRIPTION with some option. BTW, if we
> decide to let user create it then we can consider the idea of TYPED
> tables as discussed in emails [3][4].

Yeah that's an interesting option.

>
> For user tables, we need to consider how to avoid replicating these
> tables for publications that use FOR ALL TABLES specifier. One idea is
> to use EXCLUDE table functionality as being discussed in thread [5]
> but that would also be a bit tricky especially if we decide to create
> such a table automatically. One naive idea is that internally we skip
> sending changes from this table for "FOR ALL TABLES" publication, and
> we shouldn't allow creating publication for this table. OTOH, if we
> allow the user to create and specify this table, we can ask her to
> specify with EXCLUDE syntax in publication. This needs more thoughts.

Yes this needs more thought, I will think more on this point and respond.

Yet another question is about table names, whether we keep some
standard name like conflict_log_history_$subid or let users pass the
name.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-08-18T06:55:05Z

On Fri, Aug 15, 2025 at 2:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Yet another question is about table names, whether we keep some
> standard name like conflict_log_history_$subid or let users pass the
> name.
>

It would be good if we can let the user specify the table_name and, if
she doesn't specify one, use an internally generated name. I think it
will be somewhat similar to slot_name. However, in this case, there is
one challenge: how can we decide whether the schema of the
user-provided table is correct or not? Do we compare it with the
standard schema we are planning to use?

One idea to keep things simple for the first version is that we allow
users to specify the table_name for storing conflicts, but the table
is created internally, and if a table with the same name already
exists, we give an ERROR. Then we can later extend the
functionality to allow storing conflicts in pre-created tables,
with more checks on their schema.
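
For reference, the "ERROR if it already exists" behaviour is what a plain
CREATE TABLE gives today, so an internally created table would naturally
behave this way. A minimal sketch in user SQL, with a hypothetical table
name and a placeholder column:

```sql
-- to_regclass() returns NULL when no relation of that name exists,
-- so this reports whether the name is already taken.
SELECT to_regclass('public.sub1_conflict_log') IS NOT NULL AS already_exists;

-- The first statement succeeds, the second one fails.
CREATE TABLE public.sub1_conflict_log (conflicttype text);
CREATE TABLE public.sub1_conflict_log (conflicttype text);
-- ERROR:  relation "sub1_conflict_log" already exists
```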

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-20T06:16:55Z

On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 15, 2025 at 2:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Yet another question is about table names, whether we keep some
> > standard name like conflict_log_history_$subid or let users pass the
> > name.
> >
>
> It would be good if we can let the user specify the table_name and if
> she didn't specify then use an internally generated name. I think it
> will be somewhat similar to slot_name. However, in this case, there is
> one challenge which is how can we decide whether the schema of the
> user provided table_name is correct or not? Do we compare it with the
> standard schema we are planning to use?

Ideally we can do that. In this thread [1] there is a patch [2] which
first tries to validate the table schema and, if the table doesn't
exist, creates it on its own.  That seems fine to me.

> One idea to keep things simple for the first version is that we allow
> users to specify the table_name for storing conflicts but the table
> should be created internally and if the same name table already
> exists, we can give an ERROR. Then we can later extend the
> functionality to even allow storing conflicts in pre-created tables
> with more checks about its schema.

That's fair too.  I am wondering what namespace we should create this
user table in. If we are creating it internally, I assume the user should
provide a schema-qualified name, right?


[1] https://www.postgresql.org/message-id/flat/752672.1699474336%40sss.pgh.pa.us#b8450be5645c4252d7d02cf7aca1fc7b
[2] https://www.postgresql.org/message-id/attachment/152792/v8-0001-Add-a-new-COPY-option-SAVE_ERROR.patch


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-08-20T12:16:29Z

On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > One idea to keep things simple for the first version is that we allow
> > users to specify the table_name for storing conflicts but the table
> > should be created internally and if the same name table already
> > exists, we can give an ERROR. Then we can later extend the
> > functionality to even allow storing conflicts in pre-created tables
> > with more checks about its schema.
>
> That's fair too.  I am wondering what namespace we should create this
> user table in. If we are creating internally, I assume the user should
> provide a schema qualified name right?
>

Yeah, but if it is not provided then we should create it based on
search_path, similar to what we do when a user creates a table from
psql.
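
A small sketch of the existing behaviour being referred to: an
unqualified name is created in the first schema of search_path that
exists. The schema and table names here are hypothetical.

```sql
CREATE SCHEMA IF NOT EXISTS repl_meta;
SET search_path = repl_meta, public;

-- Shows the schema that unqualified CREATE statements will use.
SELECT current_schema();

-- Unqualified name: the table is created in repl_meta.
CREATE TABLE sub1_conflict_log (conflicttype text);

-- Confirm which namespace the table ended up in.
SELECT relnamespace::regnamespace
FROM   pg_class
WHERE  oid = 'sub1_conflict_log'::regclass;
```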

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-21T03:47:08Z

On Wed, Aug 20, 2025 at 5:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > One idea to keep things simple for the first version is that we allow
> > > users to specify the table_name for storing conflicts but the table
> > > should be created internally and if the same name table already
> > > exists, we can give an ERROR. Then we can later extend the
> > > functionality to even allow storing conflicts in pre-created tables
> > > with more checks about its schema.
> >
> > That's fair too.  I am wondering what namespace we should create this
> > user table in. If we are creating internally, I assume the user should
> > provide a schema qualified name right?
> >
>
> Yeah, but if not provided then we should create it based on
> search_path similar to what we do when user created the table from
> psql.

Yeah that makes sense.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-06T09:38:00Z

On Thu, Aug 21, 2025 at 9:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Aug 20, 2025 at 5:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > > One idea to keep things simple for the first version is that we allow
> > > > users to specify the table_name for storing conflicts but the table
> > > > should be created internally and if the same name table already
> > > > exists, we can give an ERROR. Then we can later extend the
> > > > functionality to even allow storing conflicts in pre-created tables
> > > > with more checks about its schema.
> > >
> > > That's fair too.  I am wondering what namespace we should create this
> > > user table in. If we are creating internally, I assume the user should
> > > provide a schema qualified name right?
> > >
> >
> > Yeah, but if not provided then we should create it based on
> > search_path similar to what we do when user created the table from
> > psql.

While working on the patch, I see there are some open questions:

1. We decided to pass the conflict history table name during
subscription creation. And it makes sense to create this table when
the CREATE SUBSCRIPTION command is executed. A potential concern is
that the subscription owner will also own this table, having full
control over it, including the ability to drop or alter its schema.
This might not be an issue. If an INSERT into the conflict table
fails, we can check the table's existence and schema. If they are not
as expected, the conflict log history option can be disabled and
re-enabled later via ALTER SUBSCRIPTION.

2. A further challenge is how to exclude these tables from publishing
changes. If we support a subscription-level log history table and the
user publishes ALL TABLES, the output plugin uses
is_publishable_relation() to check if a table is publishable. However,
applying the same logic here would require checking each subscription
on the node to see if the table is designated as a conflict log
history table for any subscription, which could be costly.

3. And one last thing is whether we should drop this table when we
drop the subscription. I think this makes sense, as we are
internally creating it while creating the subscription.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Alastair Turner <minion@decodable.me> — 2025-09-07T08:12:02Z

Hi Dilip

Thanks for working on this, I think it will make conflict detection a lot
more useful.

On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote:

> While working on the patch, I see there are some open questions
>
> 1. We decided to pass the conflict history table name during
> subscription creation. And it makes sense to create this table when
> the CREATE SUBSCRIPTION command is executed. A potential concern is
> that the subscription owner will also own this table, having full
> control over it, including the ability to drop or alter its schema.

...
>

Typed tables and the dependency framework can address this concern. The
schema of a typed table cannot be changed. If the subscription is marked as
a dependency of the log table, the table cannot be dropped while the
subscription exists.

> 2. A further challenge is how to exclude these tables from publishing
> changes. If we support a subscription-level log history table and the
> user publishes ALL TABLES, the output plugin uses
> is_publishable_relation() to check if a table is publishable. However,
> applying the same logic here would require checking each subscription
> on the node to see if the table is designated as a conflict log
> history table for any subscription, which could be costly.
>

Checking the type of a table and/or whether a subscription object depends
on it in a certain way would be a far less costly operation to add to
is_publishable_relation().
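
To show the kind of lookup meant here, a sketch in user SQL follows. It
assumes, which nothing in this thread confirms, that the dependency from
a subscription to its conflict table would be recorded in pg_depend; the
table name is hypothetical.

```sql
-- Find subscriptions that depend on a given table. pg_depend is
-- indexed on the referenced object, so this is keyed on the table's
-- OID and does not scan every subscription on the node.
SELECT d.classid::regclass AS dependent_catalog,
       d.objid             AS dependent_oid,
       d.deptype
FROM   pg_depend d
WHERE  d.refclassid = 'pg_class'::regclass
AND    d.refobjid   = 'public.sub1_conflict_log'::regclass
AND    d.classid    = 'pg_subscription'::regclass;
```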

> 3. And one last thing is about should we consider dropping this table
> when we drop the subscription, I think this makes sense as we are
> internally creating it while creating the subscription.
>

Having to clean up the log table explicitly is likely to annoy users far
less than having the conflict data destroyed as a side effect of another
operation. I would strongly suggest leaving the table in place when the
subscription is dropped.

Regards
Alastair

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-08T06:31:28Z

On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote:
>
> Hi Dilip
>
> Thanks for working on this, I think it will make conflict detection a lot more useful.

Thanks for the suggestions, please find my reply inline.

> On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote:
>>
>> While working on the patch, I see there are some open questions
>>
>> 1. We decided to pass the conflict history table name during
>> subscription creation. And it makes sense to create this table when
>> the CREATE SUBSCRIPTION command is executed. A potential concern is
>> that the subscription owner will also own this table, having full
>> control over it, including the ability to drop or alter its schema.

>
> Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists.

Yeah, a typed table can be useful here; the only concern is when we
create this type.  One option is to create a catalog relation, say
"conflict_log_history", which will create a type, and then for each
subscription that needs a conflict history table we can create it as
a table of the "conflict_log_history" type. This might not be the
best option, as we would be creating a catalog just to use its type.
The second option is to create a type while creating the table itself,
but then the same problem remains, as subscription owners get
control over altering the schema of the type itself.  So the goal is
for this type to be created such that it cannot be altered, and
IMHO option 1 is more suitable, i.e. creating conflict_log_history as
a catalog and creating each per-subscription table as this type.

>>
>> 2. A further challenge is how to exclude these tables from publishing
>> changes. If we support a subscription-level log history table and the
>> user publishes ALL TABLES, the output plugin uses
>> is_publishable_relation() to check if a table is publishable. However,
>> applying the same logic here would require checking each subscription
>> on the node to see if the table is designated as a conflict log
>> history table for any subscription, which could be costly.
>
>
>  Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation()
+1

>
>>
>> 3. And one last thing is about should we consider dropping this table
>> when we drop the subscription, I think this makes sense as we are
>> internally creating it while creating the subscription.
>
>
> Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped.

Thanks for the input, I would like to hear opinions from others as
well here.  I agree that implicitly getting rid of the conflict
history might be problematic, but we also need to keep in mind that
we are only considering dropping it when the whole subscription is
dropped.  I am not sure users will still be interested in the conflict
history after dropping the subscription; if they are, then they need
to be aware that they must preserve it, don't they?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-10T09:55:37Z

On Mon, Sep 8, 2025 at 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote:
> >
> > Hi Dilip
> >
> > Thanks for working on this, I think it will make conflict detection a lot more useful.
>
> Thanks for the suggestions, please find my reply inline.
>
> > On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote:
> >>
> >> While working on the patch, I see there are some open questions
> >>
> >> 1. We decided to pass the conflict history table name during
> >> subscription creation. And it makes sense to create this table when
> >> the CREATE SUBSCRIPTION command is executed. A potential concern is
> >> that the subscription owner will also own this table, having full
> >> control over it, including the ability to drop or alter its schema.
>
> >
> > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists.
>
> Yeah type table can be useful here, but only concern is when do we
> create this type.
>

How about having this as a built-in type?

>  One option is whenever we can create a catalog
> relation say "conflict_log_history" that will create a type and then
> for each subscription if we need to create the conflict history table
> we can create it as "conflict_log_history" type, but this might not be
> a best option as we are creating catalog just for using this type.
> Second option is to create a type while creating a table itself but
> then again the problem remains the same as subscription owners get
> control over altering the schema of the type itself.  So the goal is
> we want this type to be created such that it can not be altered so
> IMHO option1 is more suitable i.e. creating conflict_log_history as
> catalog and per subscription table can be created as this type.
>

I think having it as a catalog table has drawbacks, like who will clean
this ever-growing table. One thing that is not clear from Alastair's
response is that he said to make the subscription a dependency of the
table. If we do so, won't it be difficult to even drop the
subscription, and doesn't that sound like the reverse of what we want?

> >>
> >> 2. A further challenge is how to exclude these tables from publishing
> >> changes. If we support a subscription-level log history table and the
> >> user publishes ALL TABLES, the output plugin uses
> >> is_publishable_relation() to check if a table is publishable. However,
> >> applying the same logic here would require checking each subscription
> >> on the node to see if the table is designated as a conflict log
> >> history table for any subscription, which could be costly.
> >
> >
> >  Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation()
> +1
>
> >
> >>
> >> 3. And one last thing is about should we consider dropping this table
> >> when we drop the subscription, I think this makes sense as we are
> >> internally creating it while creating the subscription.
> >
> >
> > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped.
>
> Thanks for the input, I would like to hear opinions from others as
> well here.
>

But OTOH, there could be users who want such a table to be dropped.
One possibility is that if the user provided us a pre-created table
then we leave it to the user to remove the table; otherwise, we can
remove it with DROP SUBSCRIPTION. BTW, did we decide whether we want a
conflict-table-per-subscription or one table for all subscriptions? If
the latter, then I guess the problem would be that it has to be a
shared table across databases.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-10T10:15:40Z

On Wed, Sep 10, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 8, 2025 at 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote:
> > >
> > > Hi Dilip
> > >
> > > Thanks for working on this, I think it will make conflict detection a lot more useful.
> >
> > Thanks for the suggestions, please find my reply inline.
> >
> > > On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote:
> > >>
> > >> While working on the patch, I see there are some open questions
> > >>
> > >> 1. We decided to pass the conflict history table name during
> > >> subscription creation. And it makes sense to create this table when
> > >> the CREATE SUBSCRIPTION command is executed. A potential concern is
> > >> that the subscription owner will also own this table, having full
> > >> control over it, including the ability to drop or alter its schema.
> >
> > >
> > > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists.
> >
> > Yeah type table can be useful here, but only concern is when do we
> > create this type.
> >
>
> How about having this as a built-in type?

Here we would have to create a built-in type of the table (row) kind,
which I think is typcategory => 'C', and if we create this type it must
be supplied with a "typrelid", which means there should be a backing
catalog table. At least that's what I think.

> >  One option is whenever we can create a catalog
> > relation say "conflict_log_history" that will create a type and then
> > for each subscription if we need to create the conflict history table
> > we can create it as "conflict_log_history" type, but this might not be
> > a best option as we are creating catalog just for using this type.
> > Second option is to create a type while creating a table itself but
> > then again the problem remains the same as subscription owners get
> > control over altering the schema of the type itself.  So the goal is
> > we want this type to be created such that it can not be altered so
> > IMHO option1 is more suitable i.e. creating conflict_log_history as
> > catalog and per subscription table can be created as this type.
> >
>
> I think having it as a catalog table has drawbacks like who will clean
> this ever growing table.

No, I didn't mean an ever-growing catalog table. I was suggesting the
option of creating a catalog table just to get a built-in type, and
then creating an actual log history table of this built-in type
for each subscription while creating the subscription.  So this
catalog table will exist but nothing will be inserted into it.
Whenever the user supplies a conflict log history table name while
creating a subscription, we will create an actual table whose type
is the catalog table's type.  I agree that creating a catalog table
for this purpose might not be worth it, but I am not yet able to
figure out how to create a built-in type of the table kind without
creating the actual table.

> The one thing is not clear from Alastair's
> response is that he said to make subscription as a dependency of
> table, if we do so, then won't it be difficult to even drop
> subscription and also doesn't that sound reverse of what we want.

I assume he means the subscription will be dependent on the log table,
which means we cannot drop the log table while the subscription
depends on it.

> > >>
> > >> 2. A further challenge is how to exclude these tables from publishing
> > >> changes. If we support a subscription-level log history table and the
> > >> user publishes ALL TABLES, the output plugin uses
> > >> is_publishable_relation() to check if a table is publishable. However,
> > >> applying the same logic here would require checking each subscription
> > >> on the node to see if the table is designated as a conflict log
> > >> history table for any subscription, which could be costly.
> > >
> > >
> > >  Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation()
> > +1
> >
> > >
> > >>
> > >> 3. And one last thing is about should we consider dropping this table
> > >> when we drop the subscription, I think this makes sense as we are
> > >> internally creating it while creating the subscription.
> > >
> > >
> > > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped.
> >
> > Thanks for the input, I would like to hear opinions from others as
> > well here.
> >
>
> But OTOH, there could be users who want such a table to be dropped.
> One possibility is that if we user provided us a pre-created table
> then we leave it to user to remove the table, otherwise, we can remove
> with drop subscription.

Thanks, makes sense.

> BTW, did we decide that we want a
> conflict-table-per-subscription or one table for all subscriptions, if
> later, then I guess the problem would be that it has to be a shared
> table across databases.

Right, and I don't think there is an option to create a user-defined
shared table.  I also don't see any issue with creating a
per-subscription conflict log history table, except that the
subscription owner must have permission to create the table in the
database while creating the subscription. I think this is expected:
the user can either get sufficient privileges or disable the conflict
log history table option.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Alastair Turner <minion@decodable.me> — 2025-09-10T11:01:58Z

On Wed, 10 Sept 2025 at 11:15, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> On Wed, Sep 10, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> >
>
...

> >
> > How about having this as a built-in type?
>
> Here we will have to create a built-in type of type table which is I
> think typcategory => 'C' and if we create this type it should be
> supplied with the "typrelid" that means there should be a backing
> catalog table. At least thats what I think.
>
A composite type can be used to build a table; it's not necessary to
create a table when creating the type. In user SQL:

CREATE TYPE conflict_log_type AS (
  conflictid UUID,
  subid OID,
  tableid OID,
  conflicttype TEXT,
  operationtype TEXT,
  replication_origin TEXT,
  remote_commit_ts TIMESTAMPTZ,
  local_commit_ts TIMESTAMPTZ,
  ri_key JSON,
  remote_tuple JSON,
  local_tuple JSON
);

CREATE TABLE my_subscription_conflicts OF conflict_log_type;

...

>
> > The one thing is not clear from Alastair's
> > response is that he said to make subscription as a dependency of
> > table, if we do so, then won't it be difficult to even drop
> > subscription and also doesn't that sound reverse of what we want.
>
> I assume he means subscription will be dependent on the log table,
> that means we can not drop the log table as subscription is dependent
> on this table.
>

Yes, that's what I was proposing.


> > > >>
> > > >> 2. A further challenge is how to exclude these tables from publishing
> > > >> changes. If we support a subscription-level log history table and the
> > > >> user publishes ALL TABLES, the output plugin uses
> > > >> is_publishable_relation() to check if a table is publishable. However,
> > > >> applying the same logic here would require checking each subscription
> > > >> on the node to see if the table is designated as a conflict log
> > > >> history table for any subscription, which could be costly.
> > > >
> > > >
> > > >  Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation()
> > > +1
> > >
> > > >
> > > >>
> > > >> 3. And one last thing is about should we consider dropping this table
> > > >> when we drop the subscription, I think this makes sense as we are
> > > >> internally creating it while creating the subscription.
> > > >
> > > >
> > > > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped.
> > >
> > > Thanks for the input, I would like to hear opinions from others as
> > > well here.
> > >
> >
> > But OTOH, there could be users who want such a table to be dropped.
> > One possibility is that if we user provided us a pre-created table
> > then we leave it to user to remove the table, otherwise, we can remove
> > with drop subscription.
>
> Thanks make sense.
>
> > BTW, did we decide that we want a
> > conflict-table-per-subscription or one table for all subscriptions, if
> > later, then I guess the problem would be that it has to be a shared
> > table across databases.
>
> Right and I don't think there is an option to create a user defined
> shared table.  And I don't think there is any issue creating per
> subscription conflict log history table, except that the subscription
> owner should have permission to create the table in the database while
> creating the subscription, but I think this is expected, either user
> can get the sufficient privilege or disable the option for conflict
> log history table.
>

Since subscriptions are created in a particular database, it seems
reasonable that error tables would also be created in a particular database.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-10T12:06:41Z

On Wed, Sep 10, 2025 at 4:32 PM Alastair Turner <minion@decodable.me> wrote:
>
>> Here we will have to create a built-in type of type table which is I
>> think typcategory => 'C' and if we create this type it should be
>> supplied with the "typrelid" that means there should be a backing
>> catalog table. At least thats what I think.
>
> A compound type can be used for building a table, it's not necessary to create a table when creating the type. In user SQL:
>
> CREATE TYPE conflict_log_type AS (
>   conflictid UUID,
>   subid OID,
>   tableid OID,
>   conflicttype TEXT,
>   operationtype TEXT,
>   replication_origin   TEXT,
>   remote_commit_ts TIMESTAMPTZ,
>   local_commit_ts TIMESTAMPTZ,
>   ri_key                    JSON,
>   remote_tuple         JSON,
>   local_tuple          JSON
> );
>
> CREATE TABLE my_subscription_conflicts OF conflict_log_type;

The problem is that if you CREATE TYPE just before creating the table,
the subscription owner gets full control over the type as well, which
means they can alter the type itself.  So logically this type should
be a built-in type, so that subscription owners cannot ALTER it but
still have permission to create a table from it.  But whenever you
create a composite type, it needs a corresponding relid in pg_class.
In fact, you can create a type as per your example and see [1] that it
gets a corresponding entry in pg_class.

So if you create a user-defined type, it will be owned by the
subscription owner, which defeats the purpose of not allowing the type
to be altered.  OTOH, if we create a built-in type, it needs a
corresponding entry in pg_class.
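
To make the ownership concern concrete, here is a minimal sketch (not taken from any posted patch). It assumes a user-created composite type cut down to two attributes and a typed table built from it; the names follow the quoted example, and the added attribute extra_note is purely hypothetical. The owner of the type can change the shape of every table created from it:

```sql
-- Composite type owned by whoever runs CREATE TYPE (here: the
-- subscription owner), cut down to two attributes for brevity.
CREATE TYPE conflict_log_type AS (
  conflictid   uuid,
  conflicttype text
);

CREATE TABLE my_subscription_conflicts OF conflict_log_type;

-- The type owner can alter the type; CASCADE propagates the change
-- to typed tables created from it. (extra_note is a hypothetical name.)
ALTER TYPE conflict_log_type ADD ATTRIBUTE extra_note text CASCADE;

-- The typed table now has the extra column as well.
SELECT attname
FROM pg_attribute
WHERE attrelid = 'my_subscription_conflicts'::regclass
  AND attnum > 0
  AND NOT attisdropped
ORDER BY attnum;
```

Without CASCADE the ALTER TYPE is rejected while a typed table exists, but the owner can still add it, so the typed table alone does not protect the schema.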

So what's your proposal, create this type while creating a
subscription or as a built-in type, or anything else?

[1]
postgres[1948123]=# CREATE TYPE conflict_log_type AS (conflictid UUID);
postgres[1948123]=# select oid, typrelid, typcategory from pg_type
where typname='conflict_log_type';

  oid  | typrelid | typcategory
-------+----------+-------------
 16386 |    16384 | C
(1 row)

postgres[1948123]=# select relname from pg_class where oid=16384;
      relname
-------------------
 conflict_log_type

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-10T19:23:03Z

Hi,

On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Currently we log conflicts to the server's log file and updates, this
> approach has limitations, 1) Difficult to query and analyze, parsing
> plain text log files for conflict details is inefficient. 2) Lack of
> structured data, key conflict attributes (table, operation, old/new
> data, LSN, etc.) are not readily available in a structured, queryable
> format. 3) Difficult for external monitoring tools or custom
> resolution scripts to consume conflict data directly.
>
> This proposal aims to address these limitations by introducing a
> conflict log history table, providing a structured, and queryable
> record of all logical replication conflicts.  This should be a
> configurable option whether to log into the conflict log history
> table, server logs or both.

+1 for the overall idea. Having an option to separate out the
conflicts helps analyze the data correctness issues and understand the
behavior of conflicts.

Parsing server log files for analysis and debugging is a typical
requirement that is met in different ways: with tools like log_fdw, by
capturing server logs in CSV format for parsing, or by doing text
search and analysis.

> This proposal has two main design questions:
> ===================================
>
> 1. How do we store conflicting tuples from different tables?
> Using a JSON column to store the row data seems like the most flexible
> solution, as it can accommodate different table schemas.

How good is storing conflicts on the table? Is it okay to generate WAL
traffic? Is it okay to physically replicate this log table to all
replicas? Is it okay to logically replicate this log table to all
subscribers and logical decoding clients? How does this table get
truncated? If truncation gets delayed, won't it unnecessarily fill up
storage?

> 2. Should this be a system table or a user table?
> a) System Table: Storing this in a system catalog is simple, but
> catalogs aren't designed for ever-growing data. While pg_large_object
> is an exception, this is not what we generally do IMHO.
> b) User Table: This offers more flexibility. We could allow a user to
> specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> either create the table internally or let the user create the table
> with a predefined schema.

-1 for the system table for sure.

> A potential drawback is that a user might drop or alter the table.
> However, we could mitigate this risk by simply logging a WARNING if
> the table is configured but an insertion fails.
> I am currently working on a POC patch for the same, but will post that
> once we have some thoughts on design choices.

How about streaming the conflicts in a fixed format to a separate log
file, distinct from the regular postgres server log file?  All the
rules/settings that apply to regular postgres server log files would
also apply to the conflict log files (rotation, GUCs, format
CSV/JSON/TEXT etc.). This way there's no additional WAL, and we don't
have to worry about drop/alter, truncate, delete, update/insert, the
permission model, physical replication, logical replication, storage
space etc.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-11T03:13:43Z

On Thu, Sep 11, 2025 at 12:53 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Currently we log conflicts to the server's log file and updates, this
> > approach has limitations, 1) Difficult to query and analyze, parsing
> > plain text log files for conflict details is inefficient. 2) Lack of
> > structured data, key conflict attributes (table, operation, old/new
> > data, LSN, etc.) are not readily available in a structured, queryable
> > format. 3) Difficult for external monitoring tools or custom
> > resolution scripts to consume conflict data directly.
> >
> > This proposal aims to address these limitations by introducing a
> > conflict log history table, providing a structured, and queryable
> > record of all logical replication conflicts.  This should be a
> > configurable option whether to log into the conflict log history
> > table, server logs or both.
>
> +1 for the overall idea. Having an option to separate out the
> conflicts helps analyze the data correctness issues and understand the
> behavior of conflicts.
>
> Parsing server logs file for analysis and debugging is a typical
> requirement differently met with tools like log_fdw or capture server
> logs in CSV format for parsing or do text search and analyze etc.
>
> > This proposal has two main design questions:
> > ===================================
> >
> > 1. How do we store conflicting tuples from different tables?
> > Using a JSON column to store the row data seems like the most flexible
> > solution, as it can accommodate different table schemas.
>
> How good is storing conflicts on the table? Is it okay to generate WAL
> traffic?
>

Yes, I think so. One would like to query conflicts and resolutions
for those conflicts at a later point to ensure consistency. BTW, if
you are worried about WAL traffic, please note that conflicts shouldn't
be a very frequent event, so the additional WAL should be okay. OTOH,
if conflicts are frequent, the performance won't be that great anyway,
as that means there is a kind of ERROR which we have to deal with by
having a resolution for it.

> Is it okay to physically replicate this log table to all
> replicas?
>

Yes, that should be okay as we want the conflict_tables to be present
after failover.

> Is it okay to logically replicate this log table to all
> subscribers and logical decoding clients?
>

I think we should avoid this.

> How does this table get
> truncated? If truncation gets delayed, won't it unnecessarily fill up
> storage?
>

I think it should be the user's responsibility to clean this table, as
they know best when the data in the table is obsolete. Eventually, we
can also have some policies via options or some other way to get it
truncated. IIRC, we also discussed having these as partitioned tables
so that it is easy to discard data. However, for the initial version,
we may want something simpler.
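
As an illustration of why partitioning makes discarding data easy, here is a minimal sketch using a hypothetical, cut-down conflict log table (the table name, columns, and monthly ranges are assumptions, not a proposed schema):

```sql
-- Hypothetical, cut-down conflict log table partitioned by commit time.
CREATE TABLE conflict_log (
  conflict_type    text,
  remote_commit_ts timestamptz NOT NULL,
  remote_tuple     json
) PARTITION BY RANGE (remote_commit_ts);

CREATE TABLE conflict_log_2025_09 PARTITION OF conflict_log
  FOR VALUES FROM ('2025-09-01') TO ('2025-10-01');
CREATE TABLE conflict_log_2025_10 PARTITION OF conflict_log
  FOR VALUES FROM ('2025-10-01') TO ('2025-11-01');

-- Discarding a month of obsolete conflicts means dropping one
-- partition instead of running a row-by-row DELETE.
ALTER TABLE conflict_log DETACH PARTITION conflict_log_2025_09;
DROP TABLE conflict_log_2025_09;
```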

> > 2. Should this be a system table or a user table?
> > a) System Table: Storing this in a system catalog is simple, but
> > catalogs aren't designed for ever-growing data. While pg_large_object
> > is an exception, this is not what we generally do IMHO.
> > b) User Table: This offers more flexibility. We could allow a user to
> > specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> > either create the table internally or let the user create the table
> > with a predefined schema.
>
> -1 for the system table for sure.
>
> > A potential drawback is that a user might drop or alter the table.
> > However, we could mitigate this risk by simply logging a WARNING if
> > the table is configured but an insertion fails.
> > I am currently working on a POC patch for the same, but will post that
> > once we have some thoughts on design choices.
>
> How about streaming the conflicts in fixed format to a separate log
> file other than regular postgres server log file?
>

I would prefer this info to be stored in tables as it would be easy to
query them. If we use separate LOGs then we should provide some views
to query the LOG.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-12T10:13:21Z

On Thu, Sep 11, 2025 at 8:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 11, 2025 at 12:53 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > Currently we log conflicts to the server's log file and updates, this
> > > approach has limitations, 1) Difficult to query and analyze, parsing
> > > plain text log files for conflict details is inefficient. 2) Lack of
> > > structured data, key conflict attributes (table, operation, old/new
> > > data, LSN, etc.) are not readily available in a structured, queryable
> > > format. 3) Difficult for external monitoring tools or custom
> > > resolution scripts to consume conflict data directly.
> > >
> > > This proposal aims to address these limitations by introducing a
> > > conflict log history table, providing a structured, and queryable
> > > record of all logical replication conflicts.  This should be a
> > > configurable option whether to log into the conflict log history
> > > table, server logs or both.
> >
> > +1 for the overall idea. Having an option to separate out the
> > conflicts helps analyze the data correctness issues and understand the
> > behavior of conflicts.
> >
> > Parsing server logs file for analysis and debugging is a typical
> > requirement differently met with tools like log_fdw or capture server
> > logs in CSV format for parsing or do text search and analyze etc.
> >
> > > This proposal has two main design questions:
> > > ===================================
> > >
> > > 1. How do we store conflicting tuples from different tables?
> > > Using a JSON column to store the row data seems like the most flexible
> > > solution, as it can accommodate different table schemas.
> >
> > How good is storing conflicts on the table? Is it okay to generate WAL
> > traffic?
> >
>
> Yesh, I think so. One would like to query conflicts and resolutions
> for those conflicts at a later point to ensure consistency. BTW, if
> you are worried about WAL traffic, please note conflicts shouldn't be
> a very often event, so additional WAL should be okay. OTOH, if the
> conflicts are frequent, anyway, the performance won't be that great as
> that means there is a kind of ERROR which we have to deal by having
> resolution for it.
>
> > Is it okay to physically replicate this log table to all
> > replicas?
> >
>
> Yes, that should be okay as we want the conflict_tables to be present
> after failover.
>
> > Is it okay to logically replicate this log table to all
> > subscribers and logical decoding clients?
> >
>
> I think we should avoid this.
>
> > How does this table get
> > truncated? If truncation gets delayed, won't it unnecessarily fill up
> > storage?
> >
>
> I think it should be users responsibility to clean this table as they
> better know when the data in the table is obsolete. Eventually, we can
> also have some policies via options or some other way to get it
> truncated. IIRC, we also discussed having these as partition tables so
> that it is easy to discard data. However, for initial version, we may
> want something simpler.
>
> > > 2. Should this be a system table or a user table?
> > > a) System Table: Storing this in a system catalog is simple, but
> > > catalogs aren't designed for ever-growing data. While pg_large_object
> > > is an exception, this is not what we generally do IMHO.
> > > b) User Table: This offers more flexibility. We could allow a user to
> > > specify the table name during CREATE SUBSCRIPTION.  Then we choose to
> > > either create the table internally or let the user create the table
> > > with a predefined schema.
> >
> > -1 for the system table for sure.
> >
> > > A potential drawback is that a user might drop or alter the table.
> > > However, we could mitigate this risk by simply logging a WARNING if
> > > the table is configured but an insertion fails.
> > > I am currently working on a POC patch for the same, but will post that
> > > once we have some thoughts on design choices.
> >
> > How about streaming the conflicts in fixed format to a separate log
> > file other than regular postgres server log file?
> >
>
> I would prefer this info to be stored in tables as it would be easy to
> query them. If we use separate LOGs then we should provide some views
> to query the LOG.

I was looking into another thread where we provide an error table for
COPY [1]. It requires the user to pre-create the error table, and
inside the COPY command we validate the table. Validation in that
context is a one-time process checking for: (1) table existence, (2)
ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
matching column names and data types. This approach avoids concerns
about the user's DROP or ALTER permissions.
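
For reference, the four checks can be approximated in plain SQL. This is only a sketch of the idea, not how the COPY patch or any conflict-log patch implements it; the table name myschema.conflict_log_history is assumed for illustration:

```sql
-- (1) Does the table exist? to_regclass() returns NULL if it does not.
SELECT to_regclass('myschema.conflict_log_history') IS NOT NULL AS table_exists;

-- (2) Can we take the lock an INSERT needs? NOWAIT errors out instead
-- of waiting if a conflicting lock is held.
BEGIN;
LOCK TABLE myschema.conflict_log_history IN ROW EXCLUSIVE MODE NOWAIT;
COMMIT;

-- (3) Does the current user hold INSERT privilege?
-- (This raises an error if the table does not exist, so run (1) first.)
SELECT has_table_privilege('myschema.conflict_log_history', 'INSERT') AS can_insert;

-- (4) List column names and types to compare against the expected schema.
SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type
FROM pg_attribute a
WHERE a.attrelid = to_regclass('myschema.conflict_log_history')
  AND a.attnum > 0
  AND NOT a.attisdropped
ORDER BY a.attnum;
```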

Our requirement for the logical replication conflict log table
differs, as we must validate the target table upon every conflict
insertion, not just at subscription creation. A more robust
alternative is to perform validation and acquire a lock on the
conflict table whenever the subscription worker starts. This prevents
modifications (like ALTER or DROP) while the worker is active. When
the worker gets restarted, we can re-validate the table and
automatically disable the conflict logging feature if validation
fails.  The feature can then be re-enabled with ALTER SUBSCRIPTION by
setting the option again.

And if we want, in the first version we can expect the user to create
the table as per the expected schema and supply it. This avoids the
need to handle excluding it from publication, as that will be the
user's responsibility. Then, in top-up patches, we can also allow
creating the table internally if the table doesn't exist, and find a
solution to keep it from being published when ALL TABLES are published.
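
As a sketch of what "user's responsibility" could look like with existing syntax: on a node that both publishes and hosts a conflict log table, the user lists the published tables explicitly instead of using FOR ALL TABLES. The publication name pub_without_log and the table public.test are assumptions for illustration:

```sql
-- Publish only the application table, so a conflict log table in the
-- same database is not part of the publication.
CREATE PUBLICATION pub_without_log FOR TABLE public.test;

-- Verify what the publication actually contains.
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'pub_without_log';
```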

Thoughts?

[1] https://www.postgresql.org/message-id/CACJufxEo-rsH5v__S3guUhDdXjakC7m7N5wj%3DmOB5rPiySBoQg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-13T00:44:19Z

Hi,

On Wed, Sep 10, 2025 at 8:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > How about streaming the conflicts in fixed format to a separate log
> > file other than regular postgres server log file?
>
> I would prefer this info to be stored in tables as it would be easy to
> query them. If we use separate LOGs then we should provide some views
> to query the LOG.

Providing views to query the conflicts LOG is easier than having
tables (probably we should provide both: logging conflicts to tables
and to separate LOG files). However, wanting the conflict logs to be
available after failover is something that makes me think the table
approach is better. I'm open to more thoughts here.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-13T00:45:56Z

Hi,

On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I was looking into another thread where we provide an error table for
> COPY [1], it requires the user to pre-create the error table. And
> inside the COPY command we will validate the table, validation in that
> context is a one-time process checking for: (1) table existence, (2)
> ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> matching column names and data types. This approach avoids concerns
> about the user's DROP or ALTER permissions.
>
> Our requirement for the logical replication conflict log table
> differs, as we must validate the target table upon every conflict
> insertion, not just at subscription creation. A more robust
> alternative is to perform validation and acquire a lock on the
> conflict table whenever the subscription worker starts. This prevents
> modifications (like ALTER or DROP) while the worker is active. When
> the worker gets restarted, we can re-validate the table and
> automatically disable the conflict logging feature if validation
> fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> option again.

Having to worry about ALTER/DROP and adding code to protect against it
seems like overkill.

> And if we want in first version we can expect user to create the table
> as per the expected schema and supply it, this will avoid the need of
> handling how to avoid it from publishing as it will be user's
> responsibility and then in top up patches we can also allow to create
> the table internally if tables doesn't exist and then we can find out
> solution to avoid it from being publish when ALL TABLES are published.

This looks much simpler to start with.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-14T06:53:12Z

On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks for the feedback, Bharath.

> On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I was looking into another thread where we provide an error table for
> > COPY [1], it requires the user to pre-create the error table. And
> > inside the COPY command we will validate the table, validation in that
> > context is a one-time process checking for: (1) table existence, (2)
> > ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> > matching column names and data types. This approach avoids concerns
> > about the user's DROP or ALTER permissions.
> >
> > Our requirement for the logical replication conflict log table
> > differs, as we must validate the target table upon every conflict
> > insertion, not just at subscription creation. A more robust
> > alternative is to perform validation and acquire a lock on the
> > conflict table whenever the subscription worker starts. This prevents
> > modifications (like ALTER or DROP) while the worker is active. When
> > the worker gets restarted, we can re-validate the table and
> > automatically disable the conflict logging feature if validation
> > fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> > option again.
>
> Having to worry about ALTER/DROP and adding code to protect seems like
> an overkill.

IMHO, if we can eventually control that, it is a good goal to have, so
that we can avoid failures during conflict insertion.  We may argue
that it's the user's responsibility not to alter the table, and that we
can just check validity during create/alter subscription.

> > And if we want in first version we can expect user to create the table
> > as per the expected schema and supply it, this will avoid the need of
> > handling how to avoid it from publishing as it will be user's
> > responsibility and then in top up patches we can also allow to create
> > the table internally if tables doesn't exist and then we can find out
> > solution to avoid it from being publish when ALL TABLES are published.
>
> This looks much more simple to start with.

Right.

PFA the WIP patches. 0001 allows user-created tables to be provided as
input for conflict history tables, and we validate the table during
create/alter subscription.  0002 adds an option to internally create
the table if it does not exist.

TODO:
- The patches are still WIP and need more work and testing for different failure cases
- Need to explore an option to create a built-in type (I will start a
separate thread for the same)
- Need to add test cases
- Need to explore options to avoid the table getting published, but maybe we
only need to avoid this when we internally create the table?

Here is a basic test I tried:

psql -d postgres -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "CREATE SCHEMA myschema"
psql -d postgres -p 5433 -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "GRANT INSERT, UPDATE, SELECT, DELETE ON test TO dk;"
psql -d postgres -c "CREATE PUBLICATION pub FOR ALL TABLES ;"

psql -d postgres -p 5433 -c "CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=5432' PUBLICATION pub WITH(conflict_log_table=myschema.conflict_log_history);"
psql -d postgres -p 5432 -c "INSERT INTO test VALUES(1,2);"
psql -d postgres -p 5433 -c "UPDATE test SET b=10 WHERE a=1;"
psql -d postgres -p 5432 -c "UPDATE test SET b=20 WHERE a=1;"

postgres[1202034]=# select * from myschema.conflict_log_history ;
-[ RECORD 1 ]-----+------------------------------
relid             | 16385
local_xid         | 763
remote_xid        | 757
local_lsn         | 0/00000000
remote_commit_lsn | 0/0174AB30
local_commit_ts   | 2025-09-14 06:45:00.828874+00
remote_commit_ts  | 2025-09-14 06:45:05.845614+00
table_schema      | public
table_name        | test
conflict_type     | update_origin_differs
local_origin      |
remote_origin     | pg_16396
key_tuple         | {"a":1,"b":20}
local_tuple       | {"a":1,"b":10}
remote_tuple      | {"a":1,"b":20}
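
Once conflicts land in a table like this, they can be analyzed with ordinary queries. The following is a minimal sketch that assumes the column names shown in the record above; the types of the tuple columns (json vs jsonb) are an assumption, hence the explicit casts:

```sql
-- Count logged conflicts per table and conflict type.
SELECT table_schema, table_name, conflict_type, count(*) AS conflicts
FROM myschema.conflict_log_history
GROUP BY table_schema, table_name, conflict_type
ORDER BY conflicts DESC;

-- Compare one column of the local and remote tuples for a given
-- conflict type. The cast to jsonb works whether the column is
-- stored as json, jsonb, or text.
SELECT remote_commit_ts,
       local_tuple::jsonb  ->> 'b' AS local_b,
       remote_tuple::jsonb ->> 'b' AS remote_b
FROM myschema.conflict_log_history
WHERE conflict_type = 'update_origin_differs';
```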


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-18T08:33:33Z

On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Thanks for the feedback Bharath
>
> > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I was looking into another thread where we provide an error table for
> > > COPY [1], it requires the user to pre-create the error table. And
> > > inside the COPY command we will validate the table, validation in that
> > > context is a one-time process checking for: (1) table existence, (2)
> > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> > > matching column names and data types. This approach avoids concerns
> > > about the user's DROP or ALTER permissions.
> > >
> > > Our requirement for the logical replication conflict log table
> > > differs, as we must validate the target table upon every conflict
> > > insertion, not just at subscription creation. A more robust
> > > alternative is to perform validation and acquire a lock on the
> > > conflict table whenever the subscription worker starts. This prevents
> > > modifications (like ALTER or DROP) while the worker is active. When
> > > the worker gets restarted, we can re-validate the table and
> > > automatically disable the conflict logging feature if validation
> > > fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> > > option again.
> >
> > Having to worry about ALTER/DROP and adding code to protect seems like
> > an overkill.
>
> IMHO eventually if we can control that I feel this is a good goal to
> have.  So that we can avoid failure during conflict insertion.  We may
> argue its user's responsibility to not alter the table and we can just
> check the validity during create/alter subscription.
>

If we compare the conflict_history_table with the slot that gets
created with a subscription, one can say the same thing about slots.
Users can drop the slots and the whole replication will stop. I think
this table will be created with the same privileges as the owner of a
subscription, which can be either a superuser or a user with the
privileges of the pg_create_subscription role, so we can rely on such
users.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-18T13:26:30Z

On Thu, Sep 18, 2025 at 2:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Thanks for the feedback Bharath
> >
> > > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > I was looking into another thread where we provide an error table for
> > > > COPY [1], it requires the user to pre-create the error table. And
> > > > inside the COPY command we will validate the table, validation in that
> > > > context is a one-time process checking for: (1) table existence, (2)
> > > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> > > > matching column names and data types. This approach avoids concerns
> > > > about the user's DROP or ALTER permissions.
> > > >
> > > > Our requirement for the logical replication conflict log table
> > > > differs, as we must validate the target table upon every conflict
> > > > insertion, not just at subscription creation. A more robust
> > > > alternative is to perform validation and acquire a lock on the
> > > > conflict table whenever the subscription worker starts. This prevents
> > > > modifications (like ALTER or DROP) while the worker is active. When
> > > > the worker gets restarted, we can re-validate the table and
> > > > automatically disable the conflict logging feature if validation
> > > > fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> > > > option again.
> > >
> > > Having to worry about ALTER/DROP and adding code to protect seems like
> > > an overkill.
> >
> > IMHO eventually if we can control that I feel this is a good goal to
> > have.  So that we can avoid failure during conflict insertion.  We may
> > argue its user's responsibility to not alter the table and we can just
> > check the validity during create/alter subscription.
> >
>
> If we compare conflict_history_table with the slot that gets created
> with subscription, one can say the same thing about slots. Users can
> drop the slots and whole replication will stop. I think this table
> will be created with the same privileges as the owner of a
> subscription which can be either a superuser or a user with the
> privileges of the pg_create_subscription role, so we can rely on such
> users.

Yeah that's a valid point.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-18T18:15:37Z

On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Thanks for the feedback Bharath
> >
> > > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > I was looking into another thread where we provide an error table for
> > > > COPY [1], it requires the user to pre-create the error table. And
> > > > inside the COPY command we will validate the table, validation in that
> > > > context is a one-time process checking for: (1) table existence, (2)
> > > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4)
> > > > matching column names and data types. This approach avoids concerns
> > > > about the user's DROP or ALTER permissions.
> > > >
> > > > Our requirement for the logical replication conflict log table
> > > > differs, as we must validate the target table upon every conflict
> > > > insertion, not just at subscription creation. A more robust
> > > > alternative is to perform validation and acquire a lock on the
> > > > conflict table whenever the subscription worker starts. This prevents
> > > > modifications (like ALTER or DROP) while the worker is active. When
> > > > the worker gets restarted, we can re-validate the table and
> > > > automatically disable the conflict logging feature if validation
> > > > fails.  And this can be enabled by ALTER SUBSCRIPTION by setting the
> > > > option again.
> > >
> > > Having to worry about ALTER/DROP and adding code to protect seems like
> > > an overkill.
> >
> > IMHO eventually if we can control that I feel this is a good goal to
> > have.  So that we can avoid failure during conflict insertion.  We may
> > argue its user's responsibility to not alter the table and we can just
> > check the validity during create/alter subscription.
> >
>
> If we compare conflict_history_table with the slot that gets created
> with subscription, one can say the same thing about slots. Users can
> drop the slots and whole replication will stop. I think this table
> will be created with the same privileges as the owner of a
> subscription which can be either a superuser or a user with the
> privileges of the pg_create_subscription role, so we can rely on such
> users.

We might want to consider which role inserts the conflict info into
the history table. For example, if any table created by a user can be
used as the history table for a subscription and the conflict info
insertion is performed by the subscription owner, we would end up
having the same security issue that was addressed by the run_as_owner
subscription option.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-20T11:59:02Z

On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > If we compare conflict_history_table with the slot that gets created
> > with subscription, one can say the same thing about slots. Users can
> > drop the slots and whole replication will stop. I think this table
> > will be created with the same privileges as the owner of a
> > subscription which can be either a superuser or a user with the
> > privileges of the pg_create_subscription role, so we can rely on such
> > users.
>
> We might want to consider which role inserts the conflict info into
> the history table. For example, if any table created by a user can be
> used as the history table for a subscription and the conflict info
> insertion is performed by the subscription owner, we would end up
> having the same security issue that was addressed by the run_as_owner
> subscription option.
>

Yeah, I don't think we want to open that door. For user created
tables, we should perform actions with table_owner's privilege. In
such a case, if one wants to create a subscription with run_as_owner
option, she should give DML operation permissions to the subscription
owner. OTOH, if we create this table internally (via subscription
owner) then irrespective of run_as_owner, we will always insert as
subscription_owner.

AFAIR, one open point for internally created tables is whether we
should skip changes to the conflict_history table while replicating
changes. Presumably the table would be included in FOR ALL TABLES
publications, if any are defined. Ideally, these tables should behave
like catalog tables, so one option is to mark them as
'user_catalog_table'; the other option is to add some hard-coded
checks during replication. The first option has the advantage that it
won't write additional WAL for these tables, which is otherwise
required under wal_level=logical. What other options do we have?

-- 
With Regards,
Amit Kapila.
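
The role selection described above can be sketched as a small decision
function. This is only one reading of the message, written as a
self-contained C model rather than PostgreSQL code; the enum and the
function demo_conflict_insert_role() are hypothetical names introduced
for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the role that performs the conflict insert. */
typedef enum DemoInsertRole
{
    DEMO_INSERT_AS_TABLE_OWNER,
    DEMO_INSERT_AS_SUBSCRIPTION_OWNER
} DemoInsertRole;

/*
 * Assumed reading of the proposal:
 *  - internally created table: always insert as the subscription owner,
 *    irrespective of run_as_owner;
 *  - user-created table: insert as the table owner, unless run_as_owner
 *    is set, in which case the subscription owner inserts and therefore
 *    needs DML permission on the table.
 */
static DemoInsertRole
demo_conflict_insert_role(bool created_internally, bool run_as_owner)
{
    if (created_internally)
        return DEMO_INSERT_AS_SUBSCRIPTION_OWNER;

    return run_as_owner ? DEMO_INSERT_AS_SUBSCRIPTION_OWNER
                        : DEMO_INSERT_AS_TABLE_OWNER;
}
```

Under this reading, only the combination of a user-created table and
run_as_owner requires an explicit grant to the subscription owner.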

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-23T17:59:12Z

On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > If we compare conflict_history_table with the slot that gets created
> > > with subscription, one can say the same thing about slots. Users can
> > > drop the slots and whole replication will stop. I think this table
> > > will be created with the same privileges as the owner of a
> > > subscription which can be either a superuser or a user with the
> > > privileges of the pg_create_subscription role, so we can rely on such
> > > users.
> >
> > We might want to consider which role inserts the conflict info into
> > the history table. For example, if any table created by a user can be
> > used as the history table for a subscription and the conflict info
> > insertion is performed by the subscription owner, we would end up
> > having the same security issue that was addressed by the run_as_owner
> > subscription option.
> >
>
> Yeah, I don't think we want to open that door. For user created
> tables, we should perform actions with table_owner's privilege. In
> such a case, if one wants to create a subscription with run_as_owner
> option, she should give DML operation permissions to the subscription
> owner. OTOH, if we create this table internally (via subscription
> owner) then irrespective of run_as_owner, we will always insert as
> subscription_owner.

Agreed.

>
> AFAIR, one open point for internally created tables is whether we
> should skip changes to conflict_history table while replicating
> changes? The table will be considered under for ALL TABLES
> publications, if defined? Ideally, these should behave as catalog
> tables, so one option is to mark them as 'user_catalog_table', or the
> other option is we have some hard-code checks during replication. The
> first option has the advantage that it won't write additional WAL for
> these tables which is otherwise required under wal_level=logical. What
> other options do we have?

I think conflict history information is subscriber-local information,
so it doesn't have to be replicated to another subscriber. Also, it
could be problematic in cross-major-version replication cases if we
break the compatibility of the history table definition. I would
expect the history table to work as a catalog table in terms of
logical decoding/replication. It would probably make sense to reuse
the user_catalog_table option for that purpose. If we have a history
table for each subscription that wants to record the conflict history
(I believe so), it would be hard to go with the second option (having
hard-coded checks).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-24T10:30:12Z

On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> >
> > AFAIR, one open point for internally created tables is whether we
> > should skip changes to conflict_history table while replicating
> > changes? The table will be considered under for ALL TABLES
> > publications, if defined? Ideally, these should behave as catalog
> > tables, so one option is to mark them as 'user_catalog_table', or the
> > other option is we have some hard-code checks during replication. The
> > first option has the advantage that it won't write additional WAL for
> > these tables which is otherwise required under wal_level=logical. What
> > other options do we have?
>
> I think conflict history information is subscriber local information
> so doesn't have to be replicated to another subscriber. Also it could
> be problematic in cross-major-version replication cases if we break
> the compatibility of history table definition.
>

Right, this is another reason not to replicate it.

> I would expect that the
> history table works as a catalog table in terms of logical
> decoding/replication. It would probably make sense to reuse the
> user_catalog_table option for that purpose. If we have a history table
> for each subscription that wants to record the conflict history (I
> believe so), it would be hard to go with the second option (having
> hard-code checks).
>

Agreed. Let's wait and see what Dilip or others have to say on this.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-24T11:35:52Z

On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > If we compare conflict_history_table with the slot that gets created
> > > > with subscription, one can say the same thing about slots. Users can
> > > > drop the slots and whole replication will stop. I think this table
> > > > will be created with the same privileges as the owner of a
> > > > subscription which can be either a superuser or a user with the
> > > > privileges of the pg_create_subscription role, so we can rely on such
> > > > users.
> > >
> > > We might want to consider which role inserts the conflict info into
> > > the history table. For example, if any table created by a user can be
> > > used as the history table for a subscription and the conflict info
> > > insertion is performed by the subscription owner, we would end up
> > > having the same security issue that was addressed by the run_as_owner
> > > subscription option.
> > >
> >
> > Yeah, I don't think we want to open that door. For user created
> > tables, we should perform actions with table_owner's privilege. In
> > such a case, if one wants to create a subscription with run_as_owner
> > option, she should give DML operation permissions to the subscription
> > owner. OTOH, if we create this table internally (via subscription
> > owner) then irrespective of run_as_owner, we will always insert as
> > subscription_owner.
>
> Agreed.

Yeah that makes sense to me as well.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-24T11:40:17Z

On Wed, Sep 24, 2025 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >
> > > AFAIR, one open point for internally created tables is whether we
> > > should skip changes to conflict_history table while replicating
> > > changes? The table will be considered under for ALL TABLES
> > > publications, if defined? Ideally, these should behave as catalog
> > > tables, so one option is to mark them as 'user_catalog_table', or the
> > > other option is we have some hard-code checks during replication. The
> > > first option has the advantage that it won't write additional WAL for
> > > these tables which is otherwise required under wal_level=logical. What
> > > other options do we have?
> >
> > I think conflict history information is subscriber local information
> > so doesn't have to be replicated to another subscriber. Also it could
> > be problematic in cross-major-version replication cases if we break
> > the compatibility of history table definition.
> >
>
> Right, this is another reason not to replicate it.
>
> > I would expect that the
> > history table works as a catalog table in terms of logical
> > decoding/replication. It would probably make sense to reuse the
> > user_catalog_table option for that purpose. If we have a history table
> > for each subscription that wants to record the conflict history (I
> > believe so), it would be hard to go with the second option (having
> > hard-code checks).
> >
>
> Agreed. Let's wait and see what Dilip or others have to say on this.

Yeah, I think it makes sense to create these tables as
'user_catalog_table' tables when we create them internally.  However,
IMHO when a user provides their own table, we should not require that
table to be created as a 'user_catalog_table' table. Or do you think
we should enforce that property?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-24T18:35:49Z

On Wed, Sep 24, 2025 at 4:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Sep 24, 2025 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >
> > > > AFAIR, one open point for internally created tables is whether we
> > > > should skip changes to conflict_history table while replicating
> > > > changes? The table will be considered under for ALL TABLES
> > > > publications, if defined? Ideally, these should behave as catalog
> > > > tables, so one option is to mark them as 'user_catalog_table', or the
> > > > other option is we have some hard-code checks during replication. The
> > > > first option has the advantage that it won't write additional WAL for
> > > > these tables which is otherwise required under wal_level=logical. What
> > > > other options do we have?
> > >
> > > I think conflict history information is subscriber local information
> > > so doesn't have to be replicated to another subscriber. Also it could
> > > be problematic in cross-major-version replication cases if we break
> > > the compatibility of history table definition.
> > >
> >
> > Right, this is another reason not to replicate it.
> >
> > > I would expect that the
> > > history table works as a catalog table in terms of logical
> > > decoding/replication. It would probably make sense to reuse the
> > > user_catalog_table option for that purpose. If we have a history table
> > > for each subscription that wants to record the conflict history (I
> > > believe so), it would be hard to go with the second option (having
> > > hard-code checks).
> > >
> >
> > Agreed. Let's wait and see what Dilip or others have to say on this.
>
> Yeah I think this makes sense to create as 'user_catalog_table' tables
> when we internally create them.  However, IMHO when a user provides
> its own table, I believe we should not enforce the restriction for
> that table to be created as a 'user_catalog_table' table, or do you
> think we should enforce that property?

I think that's the user's responsibility, so I would not enforce that
property for user-provided tables.

BTW what is the main use case for supporting the use of user-provided
tables for the history table? I think we basically don't want the
history table to be updated by any other processes than apply workers,
so it would make more sense that such a table is created internally
and tied to the subscription. I'm less convinced that it has enough
upside to warrant the complexity.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T05:39:59Z

On Sat, Sep 20, 2025 at 5:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > If we compare conflict_history_table with the slot that gets created
> > > with subscription, one can say the same thing about slots. Users can
> > > drop the slots and whole replication will stop. I think this table
> > > will be created with the same privileges as the owner of a
> > > subscription which can be either a superuser or a user with the
> > > privileges of the pg_create_subscription role, so we can rely on such
> > > users.
> >
> > We might want to consider which role inserts the conflict info into
> > the history table. For example, if any table created by a user can be
> > used as the history table for a subscription and the conflict info
> > insertion is performed by the subscription owner, we would end up
> > having the same security issue that was addressed by the run_as_owner
> > subscription option.
> >
>
> Yeah, I don't think we want to open that door. For user created
> tables, we should perform actions with table_owner's privilege. In
> such a case, if one wants to create a subscription with run_as_owner
> option, she should give DML operation permissions to the subscription
> owner. OTOH, if we create this table internally (via subscription
> owner) then irrespective of run_as_owner, we will always insert as
> subscription_owner.
>
> AFAIR, one open point for internally created tables is whether we
> should skip changes to conflict_history table while replicating
> changes? The table will be considered under for ALL TABLES
> publications, if defined? Ideally, these should behave as catalog
> tables, so one option is to mark them as 'user_catalog_table', or the
> other option is we have some hard-code checks during replication. The
> first option has the advantage that it won't write additional WAL for
> these tables which is otherwise required under wal_level=logical. What
> other options do we have?

I was doing more analysis and testing of 'user_catalog_table'. What I
found is that when a table is marked as 'user_catalog_table', extra
information is logged for it, i.e. the CID [1], so that these tables
can also be scanned during decoding using a historical snapshot, like
catalog tables.  I have also checked the code and tested that
'user_catalog_table' tables do get streamed with the FOR ALL TABLES
option.  Am I missing something, or are we thinking of changing the
behavior of user_catalog_table so that such tables do not get decoded?
I think that would change the existing behaviour, so it might not be a
good option.  Yet another idea is to invent a new option for this
purpose, called 'conflict_history_purpose', but IMHO this use case
alone may not justify a new option.

[1]
    /*
     * For logical decode we need combo CIDs to properly decode the
     * catalog
     */
    if (RelationIsAccessibleInLogicalDecoding(relation))
        log_heap_new_cid(relation, &tp);


-- 
Regards,
Dilip Kumar
Google
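
The finding above can be pictured as two independent checks: one that
decides whether extra CID information is logged, and one that decides
whether a table is picked up by a FOR ALL TABLES publication. The
following is a simplified, self-contained C model, not PostgreSQL
source. The DemoRel struct and the demo_ functions are hypothetical,
and the conditions are reduced to the flags discussed in this thread
(the real checks, RelationIsAccessibleInLogicalDecoding and
is_publishable_class, look at more than this).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the relation properties
 * discussed above; this is not the real Relation struct. */
typedef struct DemoRel
{
    bool is_system_catalog;   /* a real catalog such as pg_class */
    bool user_catalog_table;  /* reloption user_catalog_table = true */
    bool is_permanent;        /* not temporary or unlogged */
} DemoRel;

/* Models "needs extra CID logging so that it can be scanned with a
 * historical snapshot during decoding". */
static bool
demo_accessible_in_decoding(const DemoRel *rel)
{
    assert(rel != NULL);
    return rel->is_permanent &&
        (rel->is_system_catalog || rel->user_catalog_table);
}

/* Models "is picked up by a FOR ALL TABLES publication".  Note that
 * user_catalog_table plays no role here, which matches the test
 * result reported above. */
static bool
demo_is_publishable(const DemoRel *rel)
{
    assert(rel != NULL);
    return rel->is_permanent && !rel->is_system_catalog;
}
```

In this model a table with user_catalog_table set satisfies both
predicates, which is why that option alone would not keep the history
table out of a FOR ALL TABLES publication.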

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T06:23:45Z

On Thu, Sep 25, 2025 at 11:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Sep 20, 2025 at 5:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > If we compare conflict_history_table with the slot that gets created
> > > > with subscription, one can say the same thing about slots. Users can
> > > > drop the slots and whole replication will stop. I think this table
> > > > will be created with the same privileges as the owner of a
> > > > subscription which can be either a superuser or a user with the
> > > > privileges of the pg_create_subscription role, so we can rely on such
> > > > users.
> > >
> > > We might want to consider which role inserts the conflict info into
> > > the history table. For example, if any table created by a user can be
> > > used as the history table for a subscription and the conflict info
> > > insertion is performed by the subscription owner, we would end up
> > > having the same security issue that was addressed by the run_as_owner
> > > subscription option.
> > >
> >
> > Yeah, I don't think we want to open that door. For user created
> > tables, we should perform actions with table_owner's privilege. In
> > such a case, if one wants to create a subscription with run_as_owner
> > option, she should give DML operation permissions to the subscription
> > owner. OTOH, if we create this table internally (via subscription
> > owner) then irrespective of run_as_owner, we will always insert as
> > subscription_owner.
> >
> > AFAIR, one open point for internally created tables is whether we
> > should skip changes to conflict_history table while replicating
> > changes? The table will be considered under for ALL TABLES
> > publications, if defined? Ideally, these should behave as catalog
> > tables, so one option is to mark them as 'user_catalog_table', or the
> > other option is we have some hard-code checks during replication. The
> > first option has the advantage that it won't write additional WAL for
> > these tables which is otherwise required under wal_level=logical. What
> > other options do we have?
>
> I was doing more analysis and testing for 'use_catalog_table', so what
> I found is when a table is marked as  'use_catalog_table', it will log
> extra information i.e. CID[1] so that these tables can be used for
> scanning as well during decoding like catalog tables using historical
> snapshot.  And I have checked the code and tested as well
> 'use_catalog_table' does get streamed with ALL TABLE options.  Am I
> missing something or are we thinking of changing the behavior of
> use_catalog_table so that they do not get decoded, but I think that
> will change the existing behaviour so might not be a good option, yet
> another idea is to invent some other option for which purpose called
> 'conflict_history_purpose' but maybe that doesn't justify the purpose
> of the new option IMHO.
>
> [1]
> /*
> * For logical decode we need combo CIDs to properly decode the
> * catalog
> */
> if (RelationIsAccessibleInLogicalDecoding(relation))
> log_heap_new_cid(relation, &tp);
>

Meanwhile, I am also exploring an option where we simply run CREATE
TYPE in initialize_data_directory() during initdb. Basically, we would
create this type in template1 so that it is available in all
databases, which would simplify table creation whether we create the
table internally or allow the user to create it.  And while checking
is_publishable_class we could check the type and avoid publishing
those tables.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T10:49:33Z

On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > [1]
> > /*
> > * For logical decode we need combo CIDs to properly decode the
> > * catalog
> > */
> > if (RelationIsAccessibleInLogicalDecoding(relation))
> > log_heap_new_cid(relation, &tp);
> >
>
> Meanwhile I am also exploring the option where we can just CREATE TYPE
> in initialize_data_directory() during initdb, basically we will create
> this type in template1 so that it will be available in all the
> databases, and that would simplify the table creation whether we
> create internally or we allow user to create it.  And while checking
> is_publishable_class we can check the type and avoid publishing those
> tables.
>

Based on my off-list discussion with Amit, one option could be to set
the HEAP_INSERT_NO_LOGICAL option while inserting tuples into the
conflict history table. For that we cannot use the SPI interface to
insert; instead we will have to call heap_insert() directly to pass
this option.  Since we do not want to create any triggers etc. on this
table, direct insert should be fine, but if we plan to make this table
a partitioned table in the future, then direct heap insert might not
work.

--
Regards,
Dilip Kumar
Google
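
The effect of such an insert option can be sketched with a toy model.
This is a self-contained C illustration, not PostgreSQL code:
DEMO_INSERT_NO_LOGICAL merely stands in for HEAP_INSERT_NO_LOGICAL
(its value here is arbitrary), and DemoInsertRecord, demo_heap_insert()
and demo_decode_emits_change() are hypothetical names. The assumption
modelled is that an insert flagged this way carries no tuple data for
logical decoding, so the decoder has nothing to stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option bit standing in for HEAP_INSERT_NO_LOGICAL;
 * the value is arbitrary for this sketch. */
#define DEMO_INSERT_NO_LOGICAL 0x01

/* Hypothetical, simplified WAL record for an insert. */
typedef struct DemoInsertRecord
{
    bool has_logical_tuple;  /* tuple data kept for logical decoding? */
} DemoInsertRecord;

/* Model of a direct heap insert: tuple data for logical decoding is
 * kept only if the relation is logically logged and the caller did
 * not ask to suppress it. */
static DemoInsertRecord
demo_heap_insert(bool rel_logically_logged, int options)
{
    DemoInsertRecord rec;

    assert((options & ~DEMO_INSERT_NO_LOGICAL) == 0);
    rec.has_logical_tuple = rel_logically_logged &&
        (options & DEMO_INSERT_NO_LOGICAL) == 0;
    return rec;
}

/* Model of the decoding side: returns true if the insert would be
 * passed on to the output plugin. */
static bool
demo_decode_emits_change(const DemoInsertRecord *rec)
{
    assert(rec != NULL);
    return rec->has_logical_tuple;
}
```

The sketch also shows why SPI does not fit here: the option is an
argument of the low-level insert call, and a plain SQL INSERT has no
way to pass it.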

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-26T11:12:11Z

On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > [1]
> > > /*
> > > * For logical decode we need combo CIDs to properly decode the
> > > * catalog
> > > */
> > > if (RelationIsAccessibleInLogicalDecoding(relation))
> > > log_heap_new_cid(relation, &tp);
> > >
> >
> > Meanwhile I am also exploring the option where we can just CREATE TYPE
> > in initialize_data_directory() during initdb, basically we will create
> > this type in template1 so that it will be available in all the
> > databases, and that would simplify the table creation whether we
> > create internally or we allow user to create it.  And while checking
> > is_publishable_class we can check the type and avoid publishing those
> > tables.
> >
>
> Based on my off list discussion with Amit, one option could be to set
> HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict
> history table, for that we can not use SPI interface to insert instead
> we will have to directly call the heap_insert() to add this option.
> Since we do not want to create any trigger etc on this table, direct
> insert should be fine, but if we plan to create this table as
> partitioned table in future then direct heap insert might not work.

Upon further reflection, I realized that while this approach avoids
streaming inserts to the conflict log history table, it still requires
that table to exist on the subscriber node upon subscription creation,
which isn't ideal.

We have two main options to address this:

Option1:
When calling pg_get_publication_tables(), if the 'alltables' option is
used, we can scan all subscriptions and explicitly ignore (filter out)
all conflict history tables.  This should not be very costly, because
the subscriptions are scanned only when pg_get_publication_tables() is
called, which happens only during CREATE SUBSCRIPTION/ALTER
SUBSCRIPTION on the remote node.

Option2:
Alternatively, we could introduce a table creation option, like a
'non-publishable' flag, to prevent a table from being streamed
entirely. I believe this would be a valuable, independent feature for
users who want to create certain tables without including them in
logical replication.

I prefer option2, as I feel this can add value independent of this patch.

--
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-27T15:23:28Z

On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > [1]
> > > > /*
> > > > * For logical decode we need combo CIDs to properly decode the
> > > > * catalog
> > > > */
> > > > if (RelationIsAccessibleInLogicalDecoding(relation))
> > > > log_heap_new_cid(relation, &tp);
> > > >
> > >
> > > Meanwhile I am also exploring the option where we can just CREATE TYPE
> > > in initialize_data_directory() during initdb, basically we will create
> > > this type in template1 so that it will be available in all the
> > > databases, and that would simplify the table creation whether we
> > > create internally or we allow user to create it.  And while checking
> > > is_publishable_class we can check the type and avoid publishing those
> > > tables.
> > >
> >
> > Based on my off list discussion with Amit, one option could be to set
> > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict
> > history table, for that we can not use SPI interface to insert instead
> > we will have to directly call the heap_insert() to add this option.
> > Since we do not want to create any trigger etc on this table, direct
> > insert should be fine, but if we plan to create this table as
> > partitioned table in future then direct heap insert might not work.
>
> Upon further reflection, I realized that while this approach avoids
> streaming inserts to the conflict log history table, it still requires
> that table to exist on the subscriber node upon subscription creation,
> which isn't ideal.
>

I am not able to understand what exact problem you are seeing here. I
was thinking that during the CREATE SUBSCRIPTION command, a new table
with user provided name will be created similar to how we create a
slot. The difference would be that we create a slot on the
remote/publisher node but this table will be created locally.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-27T15:54:15Z

On Sat, Sep 27, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I am not able to understand what exact problem you are seeing here. I
> was thinking that during the CREATE SUBSCRIPTION command, a new table
> with user provided name will be created similar to how we create a
> slot. The difference would be that we create a slot on the
> remote/publisher node but this table will be created locally.
>
That's not an issue. The problem we are discussing here is that the
conflict history table, which is created on the subscriber node,
should not be published when this subscriber node creates another
publication with the FOR ALL TABLES option.  We found an option of
inserting into this table with the HEAP_INSERT_NO_LOGICAL flag so that
those inserts will not be decoded. But what about another node
subscribing to this publisher? It would need to have this table,
because when ALL TABLES are published the subscriber node expects all
user tables to be present there, even if their changes are not
published.  Consider the example below:

Node1:
CREATE PUBLICATION pub_node1..

Node2:
CREATE SUBSCRIPTION sub.. PUBLICATION pub_node1
WITH(conflict_history_table='my_conflict_table');
CREATE PUBLICATION pub_node2 FOR ALL TABLES;

Node3:
CREATE SUBSCRIPTION sub1.. PUBLICATION pub_node2; --this will expect
'my_conflict_table' to exist here because when it calls
pg_get_publication_tables() on Node2 it will also get
'my_conflict_table' along with the other user tables.

As a solution, I want this table to be skipped when
pg_get_publication_tables() is called.
Option1: We can check whether the table is listed as a conflict history
table in any of the subscriptions on Node2, and if so ignore it.
Option2: Provide a new table option to mark a table as non-publishable
when the ALL TABLES option is used. I think this option can be
useful independently as well.
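
A minimal sketch of Option1, written as a toy Python model rather than
backend code. The Subscription class, its conflict_history_table field
and the table names "orders" and "customers" are assumptions made for
illustration; only 'my_conflict_table' and the idea of filtering the
ALL TABLES result come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subscription:
    name: str
    # Hypothetical catalog field: this subscription's conflict history table, if any.
    conflict_history_table: Optional[str] = None


def publication_tables_for_all_tables(user_tables, subscriptions):
    """Model of the ALL TABLES case: every user table except conflict history tables."""
    # Collect the conflict history tables of all local subscriptions once, then filter.
    excluded = {
        s.conflict_history_table
        for s in subscriptions
        if s.conflict_history_table is not None
    }
    return [t for t in user_tables if t not in excluded]


# Node2 from the example: one subscription that owns 'my_conflict_table'.
node2_tables = ["orders", "customers", "my_conflict_table"]
node2_subs = [Subscription("sub", conflict_history_table="my_conflict_table")]
published = publication_tables_for_all_tables(node2_tables, node2_subs)
print(published)
```

In this model Node3 would only be told about "orders" and "customers",
so it would no longer expect 'my_conflict_table' to exist locally.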

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-09-27T21:13:39Z

On Sat, Sep 27, 2025 at 9:24 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Sep 27, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I am not able to understand what exact problem you are seeing here. I
> > was thinking that during the CREATE SUBSCRIPTION command, a new table
> > with user provided name will be created similar to how we create a
> > slot. The difference would be that we create a slot on the
> > remote/publisher node but this table will be created locally.
> >
> That's not an issue, the problem here we are discussing is the
> conflict history table which is created on the subscriber node should
> not be published when this node subscription node create another
> publisher with ALL TABLE option.  So we found a option for inserting
> into this table with HEAP_INSERT_NO_LOGICAL flag so that those insert
> will not be decoded, but what about another not subscribing from this
> publisher, they should have this table because when ALL TABLES are
> published subscriber node expect all user table to present there even
> if its changes are not published.  Consider below example
>
> Node1:
> CREATE PUBLICATION pub_node1..
>
> Node2:
> CREATE SUBSCRIPTION sub.. PUBLICATION pub_node1
> WITH(conflict_history_table='my_conflict_table');
> CREATE PUBLICATION pub_node2 FOR ALL TABLE;
>
> Node3:
> CREATE SUBSCRIPTION sub1.. PUBLICATION pub_node2; --this will expect
> 'my_conflict_table' to exist here because when it will call
> pg_get_publication_tables() from Node2 it will also get the
> 'my_conflict_table' along with other user tables.
>
> And as a solution I wanted to avoid this table to be avoided when
> pg_get_publication_tables() is being called.
> Option1: We can see if table name is listed as conflict history table
> in any of the subscribers on Node2 we will ignore this.
> Option2: Provide a new table option to mark table as non publishable
> table when ALL TABLE option is provided, I think this option can be
> useful independently as well.
>

I agree that option-2 is useful and IIUC, we are already working on
something similar in thread [1]. However, it is better to use option-1
here because we are using a non-user-specified mechanism to skip changes
during replication, so following the same approach at other times is
preferable. Once we have that other feature [1], we can probably
optimize this code to use it without taking input from the user. The
other reason for not going with option-2 in the way you are
proposing is that it doesn't seem like a good idea to have multiple
ways to specify skipping tables from publishing. I find the approach
being discussed in thread [1] more generic and better than a new
table-level option.

[1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com
-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-28T11:45:41Z

On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

> I agree that option-2 is useful and IIUC, we are already working on
> something similar in thread [1]. However, it is better to use option-1
> here because we are using non-user specified mechanism to skip changes
> during replication, so following the same during other times is
> preferable. Once we have that other feature [1], we can probably
> optimize this code to use it without taking input from the user. The
> other reason of not going with the option-2 in the way you are
> proposing is that it doesn't seem like a good idea to have multiple
> ways to specify skipping tables from publishing. I find the approach
> being discussed in thread [1] a generic and better than a new
> table-level option.
>
> [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com

I understand the current discussion revolves around using an EXCEPT
clause (for tables/schemas/columns) during publication creation.  But
what we want is to mark certain tables as permanently excluded from
publication, because we cannot expect users to explicitly exclude them
every time they create a publication.

So, I propose we add a "non-publishable" property to the tables
themselves. This is a more valuable option for users who know that
certain tables should never be replicated.

By marking a table as non-publishable, we save users the effort of
repeatedly listing it in the EXCEPT option for every new publication.
Both methods have merit, but the proposed table property addresses the
need for a permanent, system-wide exclusion.

See the test below, done with a quick hack, for what I am referring to.

postgres[2730657]=# CREATE TABLE test(a int) WITH
(NON_PUBLISHABLE_TABLE = true);
CREATE TABLE
postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
CREATE PUBLICATION
postgres[2730657]=# select pg_get_publication_tables('pub');
 pg_get_publication_tables
---------------------------
(0 rows)

But I agree this is an additional table option which might need
consensus, so meanwhile we can proceed with option-1. I will prepare
patches with option-1, and as an add-on patch I will propose option-2.
This option-2 patch can be discussed in a separate thread as well.
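
To make the difference between a table-level property and a
per-publication EXCEPT list concrete, here is a small Python model. It
is only a sketch: the Table class, the non_publishable attribute, the
except_list parameter and the "accounts" table are assumptions for
illustration, and the real behaviour would depend on how both features
are eventually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    # Models the hacked NON_PUBLISHABLE_TABLE option from the test above.
    non_publishable: bool = False


def all_tables_publication(tables, except_list=()):
    """Tables an ALL TABLES publication would expose in this toy model."""
    skipped = set(except_list)
    return [
        t.name
        for t in tables
        if not t.non_publishable and t.name not in skipped
    ]


marked = [Table("test", non_publishable=True), Table("accounts")]

# The table property applies to every publication without being restated.
pub_a = all_tables_publication(marked)
pub_b = all_tables_publication(marked)

# An EXCEPT-style list has to be repeated for each publication.
plain = [Table("test"), Table("accounts")]
pub_c = all_tables_publication(plain, except_list=["test"])
pub_d = all_tables_publication(plain)  # list forgotten: "test" is published
```

Here pub_a, pub_b and pub_c all contain only "accounts", while pub_d
also contains "test" because nothing on the table itself excludes it.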

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-29T09:57:23Z

On Sun, Sep 28, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > I agree that option-2 is useful and IIUC, we are already working on
> > something similar in thread [1]. However, it is better to use option-1
> > here because we are using non-user specified mechanism to skip changes
> > during replication, so following the same during other times is
> > preferable. Once we have that other feature [1], we can probably
> > optimize this code to use it without taking input from the user. The
> > other reason of not going with the option-2 in the way you are
> > proposing is that it doesn't seem like a good idea to have multiple
> > ways to specify skipping tables from publishing. I find the approach
> > being discussed in thread [1] a generic and better than a new
> > table-level option.
> >
> > [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com
>
> I understand the current discussion revolves around using an EXCEPT
> clause (for tables/schemas/columns) during publication creation.  But
> what we want is to mark some table which will be excluded permanently
> from publication, because we can not expect users to explicitly
> exclude them while creating publication.
>
> So, I propose we add a "non-publishable" property to tables
> themselves. This is a more valuable option for users who are certain
> that certain tables should never be replicated.
>
> By marking a table as non-publishable, we save users the effort of
> repeatedly listing it in the EXCEPT option for every new publication.
> Both methods have merit, but the proposed table property addresses the
> need for a permanent, system-wide exclusion.
>
> See below test with a quick hack, what I am referring to.
>
> postgres[2730657]=# CREATE TABLE test(a int) WITH
> (NON_PUBLISHABLE_TABLE = true);
> CREATE TABLE
> postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
> CREATE PUBLICATION
> postgres[2730657]=# select pg_get_publication_tables('pub');
>  pg_get_publication_tables
> ---------------------------
> (0 rows)
>
>
> But I agree this is an additional table option which might need
> consensus, so meanwhile we can proceed with option2, I will prepare
> patches with option-2 and as a add on patch I will propose option-1.
> And this option-1 patch can be discussed in a separate thread as well.

So here is the patch set using option-1. With this, when the ALL TABLES
option is used and pg_get_publication_tables() is called, it checks
the relid against the conflict history tables of the subscriptions, and
those tables are not added to the list.  I will start a separate
thread to propose the patch I sent in the previous email.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-11T10:18:55Z

On Mon, Sep 29, 2025 at 3:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Sep 28, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > I agree that option-2 is useful and IIUC, we are already working on
> > > something similar in thread [1]. However, it is better to use option-1
> > > here because we are using non-user specified mechanism to skip changes
> > > during replication, so following the same during other times is
> > > preferable. Once we have that other feature [1], we can probably
> > > optimize this code to use it without taking input from the user. The
> > > other reason of not going with the option-2 in the way you are
> > > proposing is that it doesn't seem like a good idea to have multiple
> > > ways to specify skipping tables from publishing. I find the approach
> > > being discussed in thread [1] a generic and better than a new
> > > table-level option.
> > >
> > > [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com
> >
> > I understand the current discussion revolves around using an EXCEPT
> > clause (for tables/schemas/columns) during publication creation.  But
> > what we want is to mark some table which will be excluded permanently
> > from publication, because we can not expect users to explicitly
> > exclude them while creating publication.
> >
> > So, I propose we add a "non-publishable" property to tables
> > themselves. This is a more valuable option for users who are certain
> > that certain tables should never be replicated.
> >
> > By marking a table as non-publishable, we save users the effort of
> > repeatedly listing it in the EXCEPT option for every new publication.
> > Both methods have merit, but the proposed table property addresses the
> > need for a permanent, system-wide exclusion.
> >
> > See below test with a quick hack, what I am referring to.
> >
> > postgres[2730657]=# CREATE TABLE test(a int) WITH
> > (NON_PUBLISHABLE_TABLE = true);
> > CREATE TABLE
> > postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
> > CREATE PUBLICATION
> > postgres[2730657]=# select pg_get_publication_tables('pub');
> >  pg_get_publication_tables
> > ---------------------------
> > (0 rows)
> >
> >
> > But I agree this is an additional table option which might need
> > consensus, so meanwhile we can proceed with option2, I will prepare
> > patches with option-2 and as a add on patch I will propose option-1.
> > And this option-1 patch can be discussed in a separate thread as well.
>
> So here is the patch set using option-2, with this when alltable
> option is used and we get pg_get_publication_tables(), this will check
> the relid against the conflict history tables in the subscribers and
> those tables will not be added to the list.  I will start a separate
> thread for proposing the patch I sent in previous email.
>

I have started going through this thread. Is it possible to rebase the
patches and post?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-11T10:26:56Z

On Tue, Nov 11, 2025 at 3:49 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Sep 29, 2025 at 3:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> I have started going through this thread. Is it possible to rebase the
> patches and post?

Thanks Shveta, I will post the rebased patch by tomorrow.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-12T06:50:55Z

On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > [1]
> > > > /*
> > > > * For logical decode we need combo CIDs to properly decode the
> > > > * catalog
> > > > */
> > > > if (RelationIsAccessibleInLogicalDecoding(relation))
> > > > log_heap_new_cid(relation, &tp);
> > > >
> > >
> > > Meanwhile I am also exploring the option where we can just CREATE TYPE
> > > in initialize_data_directory() during initdb, basically we will create
> > > this type in template1 so that it will be available in all the
> > > databases, and that would simplify the table creation whether we
> > > create internally or we allow user to create it.  And while checking
> > > is_publishable_class we can check the type and avoid publishing those
> > > tables.
> > >
> >
> > Based on my off list discussion with Amit, one option could be to set
> > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict
> > history table, for that we can not use SPI interface to insert instead
> > we will have to directly call the heap_insert() to add this option.
> > Since we do not want to create any trigger etc on this table, direct
> > insert should be fine, but if we plan to create this table as
> > partitioned table in future then direct heap insert might not work.
>
> Upon further reflection, I realized that while this approach avoids
> streaming inserts to the conflict log history table, it still requires
> that table to exist on the subscriber node upon subscription creation,
> which isn't ideal.
>
> We have two main options to address this:
>
> Option1:
> When calling pg_get_publication_tables(), if the 'alltables' option is
> used, we can scan all subscriptions and explicitly ignore (filter out)
> all conflict history tables.  This will not be very costly as this
> will scan the subscriber when pg_get_publication_tables() is called,
> which is only called during create subscription/alter subscription on
> the remote node.
>
> Option2:
> Alternatively, we could introduce a table creation option, like a
> 'non-publishable' flag, to prevent a table from being streamed
> entirely. I believe this would be a valuable, independent feature for
> users who want to create certain tables without including them in
> logical replication.
>
> I prefer option2, as I feel this can add value independent of this patch.
>

I agree that marking tables with a flag to easily exclude them during
publishing would be cleaner. In the current patch, for an ALL-TABLES
publication, we scan pg_subscription for each table in pg_class to
check its subconflicttable and decide whether to ignore it. But since
this only happens during create/alter subscription and refresh
publication, the overhead should be acceptable.

Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good
enhancement but since we already have the EXCEPT list built in a
separate thread, that might be sufficient for now. IMO, such
conflict-tables should be marked internally (for example, with a
‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily
identified within the system, without requiring users to explicitly
specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to
see what others think on this.
For the time being, the current implementation looks fine, considering
it runs only during a few publication-related DDL operations.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-12T09:10:28Z

On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >

> I agree that marking tables with a flag to easily exclude them during
> publishing would be cleaner. In the current patch, for an ALL-TABLES
> publication, we scan pg_subscription for each table in pg_class to
> check its subconflicttable and decide whether to ignore it. But since
> this only happens during create/alter subscription and refresh
> publication, the overhead should be acceptable.

Thanks for your opinion.

> Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good
> enhancement but since we already have the EXCEPT list built in a
> separate thread, that might be sufficient for now. IMO, such
> conflict-tables should be marked internally (for example, with a
> ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily
> identified within the system, without requiring users to explicitly
> specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to
> see what others think on this.
> For the time being, the current implementation looks fine, considering
> it runs only during a few publication-related DDL operations.

+1

Here is the rebased patch. Changes apart from the rebase:
1) The conflict history table is now dropped during DROP SUBSCRIPTION.
2) Added test cases for the conflict history table behavior with
CREATE/ALTER/DROP SUBSCRIPTION.

TODO:
1) The table schema needs more thought: do we need to capture more
items, or should we drop some fields if we think they are not
necessary?
2) A logical replication test that generates a conflict and captures it
in the conflict history table.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-12T09:44:06Z

On Wed, Nov 12, 2025 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
>
> > I agree that marking tables with a flag to easily exclude them during
> > publishing would be cleaner. In the current patch, for an ALL-TABLES
> > publication, we scan pg_subscription for each table in pg_class to
> > check its subconflicttable and decide whether to ignore it. But since
> > this only happens during create/alter subscription and refresh
> > publication, the overhead should be acceptable.
>
> Thanks for your opinion.
>
> > Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good
> > enhancement but since we already have the EXCEPT list built in a
> > separate thread, that might be sufficient for now. IMO, such
> > conflict-tables should be marked internally (for example, with a
> > ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily
> > identified within the system, without requiring users to explicitly
> > specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to
> > see what others think on this.
> > For the time being, the current implementation looks fine, considering
> > it runs only during a few publication-related DDL operations.
>
> +1
>
> Here is the rebased patch, changes apart from rebasing it
> 1) Dropped the conflict history table during drop subscription
> 2) Added test cases for testing the conflict history table behavior
> with CREATE/ALTER/DROP subscription

Thanks.

> TODO:
> 1) Need more thoughts on the table schema whether we need to capture
> more items or shall we drop some fields if we think those are not
> necessary.

Yes, this needs some more thoughts. I will review.

Since the design is somewhat agreed upon, I feel we can now move on to
code correction and completion. I have not looked at the rebased patch
yet, but here are a few comments based on the old version.

Few observations related to publication.
------------------------------

(In the below comments, clt/CLT implies Conflict Log Table)

1)
'select pg_relation_is_publishable(clt)' returns true for conflict-log table.

2)
'\d+ clt'   shows all-tables publication name. I feel we should not
show that for clt.

3)
I am able to create a publication for clt table, should it be allowed?

create subscription sub1 connection '...' publication pub1
WITH(conflict_log_table='clt');
create publication pub3 for table clt;

4)
Is there a reason we have not made '!IsConflictHistoryRelid' check as
part of is_publishable_class() itself? If we do so, other code-logics
will also get clt as non-publishable always (and will solve a few of
the above issues I think). IIUC, there is no place where we want to
mark CLT as publishable or is there any?

5) Also, I feel we can add some documentation now to help others to
understand/review the patch better without going through the long
thread.

Few observations related to conflict-logging:
------------------------------
1)
I found that for conflicts which ultimately result in an ERROR, we do
not insert any conflict record in clt.

a)
Example: insert_exists, update_exists
create table tab1 (i int primary key, j int);
sub: insert into tab1 values(30,10);
pub: insert into tab1 values(30,10);
ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
No record in clt.

sub:
<some pre-data needed>
update tab1 set i=40 where i = 30;
pub: update tab1 set i=40 where i = 20;
ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
No record in clt.

b)
Another question related to this: these conflicts (which result in
an error) keep on happening until the user resolves or skips them, or
until 'disable_on_error' is set. Are we going to insert them multiple
times? We do count them in 'confl_insert_exists' and
'confl_update_exists' every time, so it makes sense to log them each
time in clt as well. Thoughts?

2)
For conflicts where the row on the sub is missing, local_ts is inserted
incorrectly. It is '2000-01-01 05:30:00+05:30'. Should it be NULL or
something indicating that it is not applicable for this conflict type?

Example: delete_missing, update_missing
pub:
 insert into tab1 values(10,10);
 insert into tab1 values(20,10);
 sub:  delete from tab1 where i=10;
 pub:  delete from tab1 where i=10;
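
The reported local_ts value is what a zero timestamptz looks like when
displayed in a +05:30 time zone, since PostgreSQL stores timestamptz as
microseconds since 2000-01-01 00:00:00 UTC. That would suggest the
field is being written as 0 rather than NULL, but this reading is an
assumption and not something confirmed against the patch. A short
Python check of the arithmetic (timestamptz_to_local is a helper named
here for illustration):

```python
from datetime import datetime, timedelta, timezone

# PostgreSQL's timestamptz epoch: 2000-01-01 00:00:00 UTC.
POSTGRES_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)
# The time zone the reported value was displayed in (+05:30).
IST = timezone(timedelta(hours=5, minutes=30))


def timestamptz_to_local(raw_microseconds, tz):
    """Convert a raw microsecond count since the PostgreSQL epoch to a local time."""
    return (POSTGRES_EPOCH + timedelta(microseconds=raw_microseconds)).astimezone(tz)


zero_ts = timestamptz_to_local(0, IST)
print(zero_ts.isoformat(sep=" "))  # 2000-01-01 05:30:00+05:30
```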

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-13T09:09:02Z

On Wed, Nov 12, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Nov 12, 2025 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> >
> > > I agree that marking tables with a flag to easily exclude them during
> > > publishing would be cleaner. In the current patch, for an ALL-TABLES
> > > publication, we scan pg_subscription for each table in pg_class to
> > > check its subconflicttable and decide whether to ignore it. But since
> > > this only happens during create/alter subscription and refresh
> > > publication, the overhead should be acceptable.
> >
> > Thanks for your opinion.
> >
> > > Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good
> > > enhancement but since we already have the EXCEPT list built in a
> > > separate thread, that might be sufficient for now. IMO, such
> > > conflict-tables should be marked internally (for example, with a
> > > ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily
> > > identified within the system, without requiring users to explicitly
> > > specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to
> > > see what others think on this.
> > > For the time being, the current implementation looks fine, considering
> > > it runs only during a few publication-related DDL operations.
> >
> > +1
> >
> > Here is the rebased patch, changes apart from rebasing it
> > 1) Dropped the conflict history table during drop subscription
> > 2) Added test cases for testing the conflict history table behavior
> > with CREATE/ALTER/DROP subscription
>
> Thanks.
>
> > TODO:
> > 1) Need more thoughts on the table schema whether we need to capture
> > more items or shall we drop some fields if we think those are not
> > necessary.
>
> Yes, this needs some more thoughts. I will review.
>
> I feel since design is somewhat agreed upon, we may handle
> code-correction/completion. I have not looked at the rebased patch
> yet, but here are a few comments based on old-version.
>
> Few observations related to publication.
> ------------------------------
>
> (In the below comments, clt/CLT implies Conflict Log Table)
>
> 1)
> 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> 2)
> '\d+ clt'   shows all-tables publication name. I feel we should not
> show that for clt.
>
> 3)
> I am able to create a publication for clt table, should it be allowed?
>
> create subscription sub1 connection '...' publication pub1
> WITH(conflict_log_table='clt');
> create publication pub3 for table clt;
>
> 4)
> Is there a reason we have not made '!IsConflictHistoryRelid' check as
> part of is_publishable_class() itself? If we do so, other code-logics
> will also get clt as non-publishable always (and will solve a few of
> the above issues I think). IIUC, there is no place where we want to
> mark CLT as publishable or is there any?
>
> 5) Also, I feel we can add some documentation now to help others to
> understand/review the patch better without going through the long
> thread.
>
>
> Few observations related to conflict-logging:
> ------------------------------
> 1)
> I found that for the conflicts which ultimately result in Error, we do
> not insert any conflict-record in clt.
>
> a)
> Example: insert_exists, update_Exists
> create table tab1 (i int primary key, j int);
> sub: insert into tab1 values(30,10);
> pub: insert into tab1 values(30,10);
> ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> No record in clt.
>
> sub:
> <some pre-data needed>
> update tab1 set i=40 where i = 30;
> pub: update tab1 set i=40 where i = 20;
> ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> No record in clt.
>
> b)
> Another question related to this is, since these conflicts (which
> results in error) keep on happening until user resolves these or skips
> these or 'disable_on_error' is set. Then are we going to insert these
> multiple times? We do count these in 'confl_insert_exists' and
> 'confl_update_exists' everytime, so it makes sense to log those each
> time in clt as well. Thoughts?
>
> 2)
> Conflicts where row on sub is missing, local_ts incorrectly inserted.
> It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> indicating that it is not applicable for this conflict-type?
>
> Example: delete_missing, update_missing
> pub:
>  insert into tab1 values(10,10);
>  insert into tab1 values(20,10);
>  sub:  delete from tab1 where i=10;
>  pub:  delete from tab1 where i=10;
>

3)
We also need to think how we are going to display the info in case of
multiple_unique_conflicts as there could be multiple local and remote
tuples conflicting for one single operation. Example:

create table conf_tab (a int primary key, b int unique, c int unique);

sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);

pub: insert into conf_tab values (2,3,4);

ERROR:  conflict detected on relation "public.conf_tab":
conflict=multiple_unique_conflicts
DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified locally
in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_c_key", modified locally
in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
CONTEXT:  processing remote data for replication origin "pg_16392"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 781, finished at 0/017FDDA0

Currently in clt, we have singular fields such as 'key_tuple',
'local_tuple', 'remote_tuple'.  Shall we insert multiple rows? But it
does not look reasonable to insert multiple rows for a single
conflict. I will think more about this.
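
One possible shape, offered only as a sketch and not as something the
patch does: keep a single row per conflict and put all conflicting
index/key/local-tuple triples into one JSON array. The function
build_conflict_record and the field name 'local_conflicts' are
assumptions for illustration; the data is taken from the
multiple_unique_conflicts example above.

```python
import json


def build_conflict_record(conflict_type, remote_tuple, local_conflicts):
    """Build one log row; every conflicting index/key/local tuple goes into a JSON array."""
    return {
        "conflict_type": conflict_type,
        "remote_tuple": json.dumps(remote_tuple),
        "local_conflicts": json.dumps(local_conflicts),
    }


record = build_conflict_record(
    "multiple_unique_conflicts",
    {"a": 2, "b": 3, "c": 4},
    [
        {"index": "conf_tab_pkey", "key": {"a": 2},
         "local_tuple": {"a": 2, "b": 2, "c": 2}},
        {"index": "conf_tab_b_key", "key": {"b": 3},
         "local_tuple": {"a": 3, "b": 3, "c": 3}},
        {"index": "conf_tab_c_key", "key": {"c": 4},
         "local_tuple": {"a": 4, "b": 4, "c": 4}},
    ],
)

# A consumer can recover each conflicting index from the single row.
conflicting_indexes = [c["index"] for c in json.loads(record["local_conflicts"])]
print(conflicting_indexes)
```

The trade-off is that one conflict stays one row, at the cost of the
per-index details being nested inside a JSON value instead of being
plain columns.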

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-13T15:47:11Z

On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> > Few observations related to publication.
> > ------------------------------

Thanks Shveta, for testing and sharing your thoughts.  IMHO for
conflict log tables it should be good enough to exclude them when the
ALL TABLES option is used. I don't think we need to put in extra effort
to completely restrict them if users want to explicitly list them in a
publication.

> >
> > (In the below comments, clt/CLT implies Conflict Log Table)
> >
> > 1)
> > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.

This function is used while publishing every single change and I don't
think we want to add a cost to check each subscription to identify
whether the table is listed as CLT.

> > 2)
> > '\d+ clt'   shows all-tables publication name. I feel we should not
> > show that for clt.

I think we should fix this.

> > 3)
> > I am able to create a publication for clt table, should it be allowed?

I believe we should not add any specific handling to restrict this, but
I am open to opinions.

> > create subscription sub1 connection '...' publication pub1
> > WITH(conflict_log_table='clt');
> > create publication pub3 for table clt;
> >
> > 4)
> > Is there a reason we have not made '!IsConflictHistoryRelid' check as
> > part of is_publishable_class() itself? If we do so, other code-logics
> > will also get clt as non-publishable always (and will solve a few of
> > the above issues I think). IIUC, there is no place where we want to
> > mark CLT as publishable or is there any?

IMHO the main reason is performance.

> > 5) Also, I feel we can add some documentation now to help others to
> > understand/review the patch better without going through the long
> > thread.

Makes sense, I will do that in the next version.

> >
> > Few observations related to conflict-logging:
> > ------------------------------
> > 1)
> > I found that for the conflicts which ultimately result in Error, we do
> > not insert any conflict-record in clt.
> >
> > a)
> > Example: insert_exists, update_Exists
> > create table tab1 (i int primary key, j int);
> > sub: insert into tab1 values(30,10);
> > pub: insert into tab1 values(30,10);
> > ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> > No record in clt.
> >
> > sub:
> > <some pre-data needed>
> > update tab1 set i=40 where i = 30;
> > pub: update tab1 set i=40 where i = 20;
> > ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> > No record in clt.

Yeah, that's interesting. We need to think about how to commit this
record when the outer transaction is aborted, as we do not have
autonomous transactions, which are generally used for this kind of
logging.  But we can explore more options, like inserting into the
conflict log table outside the outer transaction.
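
The "insert outside the outer transaction" idea can be illustrated with
a generic pattern: remember the conflict in memory, roll back the
failed transaction, then write the log row in a fresh transaction. The
sketch below uses Python's sqlite3 module purely as a stand-in
database; it is not how the apply worker is implemented, and
apply_remote_insert and the two-column clt layout are assumptions for
illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (i INTEGER PRIMARY KEY, j INTEGER)")
conn.execute("CREATE TABLE clt (conflict_type TEXT, remote_tuple TEXT)")
conn.execute("INSERT INTO tab1 VALUES (30, 10)")  # row already present on the sub
conn.commit()


def apply_remote_insert(conn, i, j):
    """Apply a remote insert; on a key conflict, roll back and then log separately."""
    pending = None
    try:
        conn.execute("INSERT INTO tab1 VALUES (?, ?)", (i, j))
        conn.commit()
    except sqlite3.IntegrityError:
        # Keep the conflict details in memory: anything written inside the
        # failed transaction would be lost by the rollback.
        pending = ("insert_exists", f"({i}, {j})")
        conn.rollback()
    if pending is not None:
        # The failed transaction is gone; this insert commits on its own.
        conn.execute("INSERT INTO clt VALUES (?, ?)", pending)
        conn.commit()
    return pending is None


applied = apply_remote_insert(conn, 30, 10)
logged = conn.execute("SELECT conflict_type, remote_tuple FROM clt").fetchall()
print(applied, logged)
```

With this ordering the conflicting insert leaves tab1 unchanged, yet
the conflict record survives because it is committed after the
rollback rather than as part of the aborted work.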

> > b)
> > Another question related to this is, since these conflicts (which
> > results in error) keep on happening until user resolves these or skips
> > these or 'disable_on_error' is set. Then are we going to insert these
> > multiple times? We do count these in 'confl_insert_exists' and
> > 'confl_update_exists' everytime, so it makes sense to log those each
> > time in clt as well. Thoughts?

I think it makes sense to insert every time we see the conflict, but it
would be good to have opinions from others as well.

> > 2)
> > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > indicating that it is not applicable for this conflict-type?
> >
> > Example: delete_missing, update_missing
> > pub:
> >  insert into tab1 values(10,10);
> >  insert into tab1 values(20,10);
> >  sub:  delete from tab1 where i=10;
> >  pub:  delete from tab1 where i=10;

Sure I will test this.

>
> 3)
> We also need to think how we are going to display the info in case of
> multiple_unique_conflicts as there could be multiple local and remote
> tuples conflicting for one single operation. Example:
>
> create table conf_tab (a int primary key, b int unique, c int unique);
>
> sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
>
> pub: insert into conf_tab values (2,3,4);
>
> ERROR:  conflict detected on relation "public.conf_tab":
> conflict=multiple_unique_conflicts
> DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
> locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> Key already exists in unique index "conf_tab_b_key", modified locally
> in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> Key already exists in unique index "conf_tab_c_key", modified locally
> in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> CONTEXT:  processing remote data for replication origin "pg_16392"
> during message type "INSERT" for replication target relation
> "public.conf_tab" in transaction 781, finished at 0/017FDDA0
>
> Currently in clt, we have singular terms such as 'key_tuple',
> 'local_tuple', 'remote_tuple'.  Shall we have multiple rows inserted?
> But it does not look reasonable to have multiple rows inserted for a
> single conflict raised. I will think more about this.

Currently I am inserting multiple records in the conflict history
table, in the same way as each tuple is logged, as I couldn't find any
better way for this. Another option is to use an array of tuples
instead of a single tuple, but I am not sure about it, as this might
make things more complicated for any external tool to process.  But
you are right, this needs more discussion.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-17T06:24:11Z

On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > Few observations related to publication.
> > > ------------------------------
>
> Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> conflict log tables it should be good enough if we restrict it when
> ALL TABLE options are used, I don't think we need to put extra effort
> to completely restrict it even if users want to explicitly list it
> into the publication.
>
> > >
> > > (In the below comments, clt/CLT implies Conflict Log Table)
> > >
> > > 1)
> > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.

After giving this more thought, I have changed this to return false for
clt, as this is just an exposed function not called by the pgoutput layer.

> > > 2)
> > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > show that for clt.
>
Fixed

>
> > > 3)
> > > I am able to create a publication for clt table, should it be allowed?
>
> I believe we should not do any specific handling to restrict this but
> I am open for the opinions.

I am restricting this as well; let's see what others think.


>
> > > 5) Also, I feel we can add some documentation now to help others to
> > > understand/review the patch better without going through the long
> > > thread.
>
> Make sense, I will do that in the next version.
Done, but I have not compiled the docs as I don't currently have the
setup, so I have added it as a WIP patch.


> > > 2)
> > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > > indicating that it is not applicable for this conflict-type?
> > >
> > > Example: delete_missing, update_missing
> > > pub:
> > >  insert into tab1 values(10,10);
> > >  insert into tab1 values(20,10);
> > >  sub:  delete from tab1 where i=10;
> > >  pub:  delete from tab1 where i=10;
>
> Sure I will test this.

I have fixed this.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-18T10:09:46Z

On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > Few observations related to publication.
> > > ------------------------------
>
> Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> conflict log tables it should be good enough if we restrict it when
> ALL TABLE options are used, I don't think we need to put extra effort
> to completely restrict it even if users want to explicitly list it
> into the publication.
>
> > >
> > > (In the below comments, clt/CLT implies Conflict Log Table)
> > >
> > > 1)
> > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> This function is used while publishing every single change and I don't
> think we want to add a cost to check each subscription to identify
> whether the table is listed as CLT.
>
> > > 2)
> > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > show that for clt.
>
> I think we should fix this.
>
> > > 3)
> > > I am able to create a publication for clt table, should it be allowed?
>
> I believe we should not do any specific handling to restrict this but
> I am open for the opinions.
>
> > > create subscription sub1 connection '...' publication pub1
> > > WITH(conflict_log_table='clt');
> > > create publication pub3 for table clt;
> > >
> > > 4)
> > > Is there a reason we have not made '!IsConflictHistoryRelid' check as
> > > part of is_publishable_class() itself? If we do so, other code-logics
> > > will also get clt as non-publishable always (and will solve a few of
> > > the above issues I think). IIUC, there is no place where we want to
> > > mark CLT as publishable or is there any?
>
> IMHO the main reason is performance.
>
> > > 5) Also, I feel we can add some documentation now to help others to
> > > understand/review the patch better without going through the long
> > > thread.
>
> Make sense, I will do that in the next version.
>
> > >
> > > Few observations related to conflict-logging:
> > > ------------------------------
> > > 1)
> > > I found that for the conflicts which ultimately result in Error, we do
> > > not insert any conflict-record in clt.
> > >
> > > a)
> > > Example: insert_exists, update_Exists
> > > create table tab1 (i int primary key, j int);
> > > sub: insert into tab1 values(30,10);
> > > pub: insert into tab1 values(30,10);
> > > ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> > > No record in clt.
> > >
> > > sub:
> > > <some pre-data needed>
> > > update tab1 set i=40 where i = 30;
> > > pub: update tab1 set i=40 where i = 20;
> > > ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> > > No record in clt.
>
> Yeah that interesting need to put thought on how to commit this record
> when an outer transaction is aborted as we do not have autonomous
> transactions which are generally used for this kind of logging.

Right

> But
> we can explore more options like inserting into conflict log tables
> outside the outer transaction.

Yes, that seems the way to me. I could not find any such existing
reference/usage in code though.

>
> > > b)
> > > Another question related to this is, since these conflicts (which
> > > results in error) keep on happening until user resolves these or skips
> > > these or 'disable_on_error' is set. Then are we going to insert these
> > > multiple times? We do count these in 'confl_insert_exists' and
> > > 'confl_update_exists' everytime, so it makes sense to log those each
> > > time in clt as well. Thoughts?
>
> I think it make sense to insert every time we see the conflict, but it
> would be good to have opinion from others as well.
>
> > > 2)
> > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > > indicating that it is not applicable for this conflict-type?
> > >
> > > Example: delete_missing, update_missing
> > > pub:
> > >  insert into tab1 values(10,10);
> > >  insert into tab1 values(20,10);
> > >  sub:  delete from tab1 where i=10;
> > >  pub:  delete from tab1 where i=10;
>
> Sure I will test this.
>
> >
> > 3)
> > We also need to think how we are going to display the info in case of
> > multiple_unique_conflicts as there could be multiple local and remote
> > tuples conflicting for one single operation. Example:
> >
> > create table conf_tab (a int primary key, b int unique, c int unique);
> >
> > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
> >
> > pub: insert into conf_tab values (2,3,4);
> >
> > ERROR:  conflict detected on relation "public.conf_tab":
> > conflict=multiple_unique_conflicts
> > DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
> > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> > Key already exists in unique index "conf_tab_b_key", modified locally
> > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> > Key already exists in unique index "conf_tab_c_key", modified locally
> > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> > CONTEXT:  processing remote data for replication origin "pg_16392"
> > during message type "INSERT" for replication target relation
> > "public.conf_tab" in transaction 781, finished at 0/017FDDA0
> >
> > Currently in clt, we have singular terms such as 'key_tuple',
> > 'local_tuple', 'remote_tuple'.  Shall we have multiple rows inserted?
> > But it does not look reasonable to have multiple rows inserted for a
> > single conflict raised. I will think more about this.
>
> Currently I am inserting multiple records in the conflict history
> table, the same as each tuple is logged, but couldn't find any better
> way for this. Another option is to use an array of tuples instead of a
> single tuple but not sure this might make things more complicated to
> process by any external tool.

It’s arguable and hard to say what the correct behaviour should be.
I’m slightly leaning toward having a single row per conflict. IMO,
overall the confl_* counters in pg_stat_subscription_stats should
align with the number of entries in the conflict history table, which
implies one row even for multiple_unique_conflicts. But I also
understand that this approach could make things complicated for
external tools. For now, we can proceed with logging multiple rows for
a single multiple_unique_conflicts occurrence and wait to hear others’
opinions.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-18T11:17:14Z

On Mon, Nov 17, 2025 at 11:54 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > Few observations related to publication.
> > > > ------------------------------
> >
> > Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> > conflict log tables it should be good enough if we restrict it when
> > ALL TABLE options are used, I don't think we need to put extra effort
> > to completely restrict it even if users want to explicitly list it
> > into the publication.
> >
> > > >
> > > > (In the below comments, clt/CLT implies Conflict Log Table)
> > > >
> > > > 1)
> > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> After putting more thought I have changed this to return false for
> clt, as this is just an exposed function not called by pgoutput layer.
>
> > > > 2)
> > > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > > show that for clt.
> >
> Fixed
>
> >
> > > > 3)
> > > > I am able to create a publication for clt table, should it be allowed?
> >
> > I believe we should not do any specific handling to restrict this but
> > I am open for the opinions.
>
> Restricting this as well, lets see what others think.
>
>
> >
> > > > 5) Also, I feel we can add some documentation now to help others to
> > > > understand/review the patch better without going through the long
> > > > thread.
> >
> > Make sense, I will do that in the next version.
> Done that but not compiled the docs as I don't currently have the
> setup so added as WIP patch.
>
>
> > > > 2)
> > > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > > > indicating that it is not applicable for this conflict-type?
> > > >
> > > > Example: delete_missing, update_missing
> > > > pub:
> > > >  insert into tab1 values(10,10);
> > > >  insert into tab1 values(20,10);
> > > >  sub:  delete from tab1 where i=10;
> > > >  pub:  delete from tab1 where i=10;
> >
> > Sure I will test this.
>
> I have fixed this.

Thanks for the patch.  Some feedback about the clt:

1)
local_origin is always NULL in my tests for all conflict types I tried.

2)
Do we need 'key_tuple' as such or replica_identity is enough/better?
I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing
case where query was 'delete from tab1 where i=10'; here 'i' is PK;
which seems okay.
But it is '{"i":20,"j":200}' for update_origin_differs case where query
was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I
feel 'j' should not be part of the key but let me know if I have
misunderstood. IMO, 'j' being part of remote_tuple should be good
enough.

3)
Do we need to have a timestamp column as well to say when conflict was
recorded? Or local_commit_ts, remote_commit_ts are sufficient?
Thoughts?

4)
Also, it makes sense if we have 'conflict_type' next to 'relid'. I
feel relid and conflict_type are primary columns and rest are related
details.

5)
Do we need table_schema, table_name when we have relid already? If we
want to retain these, we can name them as schemaname and relname to be
consistent with all other stats tables. IMO, then the order can be:
relid, schemaname, relname, conflict_type and then the rest of the
details.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-19T01:30:49Z

Hi Dilip.

I started to look at this thread. Here are some comments for patch v4-0001.


=====
GENERAL

1.
There's some inconsistency in how this new table is called at different times:
a) "conflict table"
b) "conflict log table"
c) "conflict log history table"
d) "conflict history"

My preference was (b). Making this consistent will have impacts on
many macros, variables, comments, function names, etc.

~~~

2.
What about enhancements to description \dRs+ so the subscription
conflict log table is displayed?

~~~

3.
What about enhancements to the tab-complete code?

======
src/backend/commands/subscriptioncmds.c

4.
 #define SUBOPT_MAX_RETENTION_DURATION 0x00008000
 #define SUBOPT_LSN 0x00010000
 #define SUBOPT_ORIGIN 0x00020000
+#define SUBOPT_CONFLICT_TABLE 0x00030000

Bug? Shouldn't that be 0x00040000.
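
To illustrate the concern, here is a minimal standalone sketch (not
taken from the patch). The two existing flag values are copied from the
hunk above; the _V4 and _FIXED macro names and both helper functions
are hypothetical names used only for this illustration. Because
0x00030000 is the same as SUBOPT_LSN | SUBOPT_ORIGIN, a bitmask that
only has those two options set would also look like it has the
conflict table option set, whereas a separate single bit such as
0x00040000 would not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing values, copied from the quoted hunk. */
#define SUBOPT_LSN                  0x00010000
#define SUBOPT_ORIGIN               0x00020000

/* Hypothetical names for the two candidate values under discussion. */
#define SUBOPT_CONFLICT_TABLE_V4    0x00030000	/* value in patch v4 */
#define SUBOPT_CONFLICT_TABLE_FIXED 0x00040000	/* next unused single bit */

/* Hypothetical helper: true if every bit of 'opt' is set in 'specified'. */
static bool
option_is_set(uint32_t specified, uint32_t opt)
{
	return (specified & opt) == opt;
}

/* Hypothetical helper: true if 'flag' occupies exactly one bit. */
static bool
is_single_bit(uint32_t flag)
{
	return flag != 0 && (flag & (flag - 1)) == 0;
}
```

With the v4 value, option_is_set(SUBOPT_LSN | SUBOPT_ORIGIN,
SUBOPT_CONFLICT_TABLE_V4) is true even though the conflict table option
was never given; with the single-bit value it is false.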

~~~

5.
+ char    *conflicttable;
  XLogRecPtr lsn;
 } SubOpts;

IMO 'conflicttable' looks too much like 'conflictable', which may
cause some confusion on first reading.

~~~

6.
+static void CreateConflictLogTable(Oid namespaceId, char *conflictrel);
+static void DropConflictLogTable(Oid namespaceId, char *conflictrel);

AFAIK it is more conventional for the static functions to be
snake_case and the extern functions to use CamelCase. So these would
be:
- create_conflict_log_table
- drop_conflict_log_table

~~~

CreateSubscription:

7.
+ /* If conflict log table name is given than create the table. */
+ if (opts.conflicttable)
+ CreateConflictLogTable(conflict_table_nspid, conflict_table);
+

typo: /If conflict/If a conflict/

typo: "than"

~~~

AlterSubscription:

8.
-   SUBOPT_ORIGIN);
+   SUBOPT_ORIGIN |
+   SUBOPT_CONFLICT_TABLE);

The line wrapping doesn't seem necessary.

~~~

9.
+ replaces[Anum_pg_subscription_subconflictnspid - 1] = true;
+ replaces[Anum_pg_subscription_subconflicttable - 1] = true;
+
+ CreateConflictLogTable(nspid, relname);
+ }
+

What are the rules regarding replacing one log table with a different
log table for the same subscription? I didn't see anything about this
scenario, nor any test cases.

~~~

CreateConflictLogTable:

10.
+ /*
+ * Check if table with same name already present, if so report an error
+ * as currently we do not support user created table as conflict log
+ * table.
+ */

Is the comment about "user-created table" strictly correct? e.g. Won't
you encounter the same problem if there are 2 subscriptions trying to
set the same-named conflict log table?

SUGGESTION
Report an error if the specified conflict log table already exists.

~~~

DropConflictLogTable:

11.
+ /*
+ * Drop conflict log table if exist, use if exists ensures the command
+ * won't error if the table is already gone.
+ */

The reason for EXISTS was already mentioned in the function comment.

SUGGESTION
Drop the conflict log table if it exists.

======
src/backend/replication/logical/conflict.c

12.
+static Datum TupleTableSlotToJsonDatum(TupleTableSlot *slot);
+
+static void InsertConflictLog(Relation rel,
+   TransactionId local_xid,
+   TimestampTz local_ts,
+   ConflictType conflict_type,
+   RepOriginId origin_id,
+   TupleTableSlot *searchslot,
+   TupleTableSlot *localslot,
+   TupleTableSlot *remoteslot);

Same as earlier comment #6 -- isn't it conventional to use snake_case
for the static function names?

~~~

TupleTableSlotToJsonDatum:

13.
+ * This would be a new internal helper function for logical replication
+ * Needs to handle various data types and potentially TOASTed data

What's this comment about? Something doesn't look quite right.

~~~

InsertConflictLog:

14.
+ /* TODO: proper error code */
+ relid = get_relname_relid(relname, nspid);
+ if (!OidIsValid(relid))
+ elog(ERROR, "conflict log history table does not exists");
+ conflictrel = table_open(relid, RowExclusiveLock);
+ if (conflictrel == NULL)
+ elog(ERROR, "could not open conflict log history table");

14a.
What's the TODO comment for? Are you going to replace these elogs?

~

14b.
Typo: "does not exists"

~

14c.
An unnecessary double-blank line follows this code fragment.

~~~

15.
+ /* Populate the values and nulls arrays */
+ attno = 0;
+ values[attno] = ObjectIdGetDatum(RelationGetRelid(rel));
+ attno++;
+
+ if (TransactionIdIsValid(local_xid))
+ values[attno] = TransactionIdGetDatum(local_xid);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (TransactionIdIsValid(remote_xid))
+ values[attno] = TransactionIdGetDatum(remote_xid);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ values[attno] = LSNGetDatum(remote_final_lsn);
+ attno++;
+
+ if (local_ts > 0)
+ values[attno] = TimestampTzGetDatum(local_ts);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (remote_commit_ts > 0)
+ values[attno] = TimestampTzGetDatum(remote_commit_ts);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ values[attno] =
+ CStringGetTextDatum(get_namespace_name(RelationGetNamespace(rel)));
+ attno++;
+
+ values[attno] = CStringGetTextDatum(RelationGetRelationName(rel));
+ attno++;
+
+ values[attno] = CStringGetTextDatum(ConflictTypeNames[conflict_type]);
+ attno++;
+
+ if (origin_id != InvalidRepOriginId)
+ replorigin_by_oid(origin_id, true, &origin);
+
+ if (origin != NULL)
+ values[attno] = CStringGetTextDatum(origin);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (replorigin_session_origin != InvalidRepOriginId)
+ replorigin_by_oid(replorigin_session_origin, true, &remote_origin);
+
+ if (remote_origin != NULL)
+ values[attno] = CStringGetTextDatum(remote_origin);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (searchslot != NULL)
+ values[attno] = TupleTableSlotToJsonDatum(searchslot);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (localslot != NULL)
+ values[attno] = TupleTableSlotToJsonDatum(localslot);
+ else
+ nulls[attno] = true;
+ attno++;
+
+ if (remoteslot != NULL)
+ values[attno] = TupleTableSlotToJsonDatum(remoteslot);
+ else
+ nulls[attno] = true;
+

15a.
It might be simpler to just post-increment that 'attno' in all the
assignments and save a dozen lines of code:
e.g. values[attno++] = ...

~

15b.
Also, put a sanity Assert check at the end, like:
Assert(attno + 1 == MAX_CONFLICT_ATTR_NUM);
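
As a rough standalone sketch of 15a/15b (not the patch code), the
pattern could look like the following. SketchRow, SKETCH_NATTS and
sketch_fill_row() are hypothetical stand-ins for the real
values[]/nulls[] arrays and column count, and an id of 0 is assumed to
mean "invalid" purely for this illustration. Note that if the last
column is post-incremented too, as done here, the sanity check becomes
attno == count; the attno + 1 form above applies when the last
assignment does not increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NATTS 4			/* hypothetical column count */

/* Hypothetical stand-in for the values[] and nulls[] arrays. */
typedef struct SketchRow
{
	long		values[SKETCH_NATTS];
	bool		nulls[SKETCH_NATTS];
} SketchRow;

/*
 * Fill the row using post-increment on every column.  An id of 0 is
 * treated as invalid and stored as NULL.  Returns the number of columns
 * filled.
 */
static int
sketch_fill_row(SketchRow *row, long relid, long local_xid,
				long remote_xid, long remote_lsn)
{
	int			attno = 0;

	memset(row, 0, sizeof(*row));

	row->values[attno++] = relid;

	if (local_xid != 0)
		row->values[attno++] = local_xid;
	else
		row->nulls[attno++] = true;

	if (remote_xid != 0)
		row->values[attno++] = remote_xid;
	else
		row->nulls[attno++] = true;

	row->values[attno++] = remote_lsn;

	/* Every column was post-incremented, so attno is the column count. */
	assert(attno == SKETCH_NATTS);

	return attno;
}
```

The assert catches the case where a column is added to the table
definition but not to the fill routine, or the other way around.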


======
src/backend/utils/cache/lsyscache.c

16.
+ if (isnull)
+ {
+ ReleaseSysCache(tup);
+ return NULL;
+ }
+
+ *nspid = subform->subconflictnspid;
+ relname = pstrdup(TextDatumGetCString(datum));
+
+ ReleaseSysCache(tup);
+
+ return relname;

It would be tidier to have a single release/return by coding this
slightly differently.

SUGGESTION:

char *relname = NULL;
...
if (!isnull)
{
  *nspid = subform->subconflictnspid;
  relname = pstrdup(TextDatumGetCString(datum));
}

ReleaseSysCache(tup);
return relname;

======
src/include/catalog/pg_subscription.h

17.
+ Oid subconflictnspid; /* Namespace Oid in which the conflict history
+ * table is created. */

Would it be better to make these 2 new member names more alike, since
they go together. e.g.
confl_table_nspid
confl_table_name

======
src/include/replication/conflict.h

18.
+#define MAX_CONFLICT_ATTR_NUM 15

I felt this doesn't really belong here. Just define it atop/within the
function InsertConflictLog()

~~~

19.
 extern void InitConflictIndexes(ResultRelInfo *relInfo);
+
 #endif

Spurious whitespace change not needed for this patch.

======
src/test/regress/sql/subscription.sql

20.
How about adding some more test scenarios:
e.g.1. ALTER the conflict log table of some subscription that already has one
e.g.2. Have multiple subscriptions that specify the same conflict log table

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-19T06:29:45Z

Here are some comments for the patch v4-0002.

======
GENERAL

1.
The patch should include test cases:

- to confirm an error happens when attempting to publish clt
- to confirm \dt+ clt is not showing the ALL TABLES publication
- to confirm that SQL function pg_relation_is_publishable gives the
expected result
- etc.

======
Commit Message

1.
When all table option is used with publication don't publish the
conflict history tables.

~

Maybe reword that using uppercase for keywords, like:

SUGGESTION
A conflict log table will not be published by a FOR ALL TABLES publication.

======
src/backend/catalog/pg_publication.c

check_publication_add_relation:

3.
+ /* Can't be created as conflict log table */
+ if (IsConflictLogRelid(RelationGetRelid(targetrel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(targetrel)),
+ errdetail("This operation is not supported for conflict log tables.")));

3a.
Typo in comment.

SUGGESTION
Can't be a conflict log table

~

3b.
I was wondering if this check should be moved to the bottom of the function.

I think IsConflictLogRelid() is the most inefficient of all these
conditions, so it is better to give the other ones a chance to fail
quickly before needing to check for clt.

~~~

pg_relation_is_publishable:

4.
 /*
- * SQL-callable variant of the above
+ * SQL-callable variant of the above and this should not be a conflict log rel
  *
  * This returns null when the relation does not exist.  This is intended to be
  * used for example in psql to avoid gratuitous errors when there are

I felt this new comment should be in the code, instead of in the
function comment.

SUGGESTION
/* subscription conflict log tables are not published */
result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
  !IsConflictLogRelid(relid);

~~~

5.
It seemed strange that function
pg_relation_is_publishable(PG_FUNCTION_ARGS) is checking
IsConflictLogRelid, but function is_publishable_relation(Relation rel)
is not.

~~~

GetAllPublicationRelations:

6.
+ /* conflict history tables are not published. */
  if (is_publishable_class(relid, relForm) &&
+ !IsConflictLogRelid(relid) &&
  !(relForm->relispartition && pubviaroot))
  result = lappend_oid(result, relid);

Inconsistent "history table" terminology.

Maybe this comment should be identical to the other one above. e.g.
/* subscription conflict log tables are not published */

======
src/backend/commands/subscriptioncmds.c

IsConflictLogRelid:

8.
+/*
+ * Is relation used as a conflict log table
+ *
+ * Scan all the subscription and check whether the relation is used as
+ * conflict log table.
+ */

typo: "all the subscription"

Also, the 2nd sentence repeats the purpose of the function;  I don't
think you need to say it twice.

SUGGESTION
Check if the specified relation is used as a conflict log table by any
subscription.

~~~

9.
+ if (relname == NULL)
+ continue;
+ if (relid == get_relname_relid(relname, nspid))
+ {
+ found = true;
+ break;
+ }

It seemed unnecessary to separate out the 'continue' like that.

In passing, consider renaming that generic 'found' to be the proper
meaning of the boolean.

SUGGESTION
if (relname && relid == get_relname_relid(relname, nspid))
{
  is_clt = true;
  break;
}

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-19T07:10:05Z

Hi Dilip,

FYI, patch v4-0003 (docs) needs rebasing due to ada78cd.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-19T10:16:20Z

On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks for the patch.  Some feedback about the clt:
>
> 1)
> local_origin is always NULL in my tests for all conflict types I tried.

You need to set the replication origin as shown below.
On subscriber side:
---------------------------
SELECT pg_replication_origin_create('my_remote_source_2');
SELECT pg_replication_origin_session_setup('my_remote_source_2');
UPDATE test SET b=200 where a=1;

On remote:
---------------
UPDATE test SET b=300 where a=1; -- conflicting operation with local node

On subscriber
------------------
postgres[1514377]=# select local_origin, remote_origin from
myschema.conflict_log_history2 ;
    local_origin    | remote_origin
--------------------+---------------------
 my_remote_source_2 | pg_16396

> 2)
> Do we need 'key_tuple' as such or replica_identity is enough/better?
> I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing
> case where query was 'delete from tab1 where i=10'; here 'i' is PK;
> which seems okay.
> But it is '{"i":20,"j":200}' for update_origin_differ case where query
> was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I
> feel 'j' should not be part of the key but let me know if I have
> misunderstood. IMO, 'j' being part of remote_tuple should be good
> enough.

Yeah, we should display the replica identity only. I assumed that in
ReportApplyConflict() the searchslot would only have the RI tuple, but
it is sending a remote tuple in the searchslot, so we might need to
extract the RI from this slot. I will work on this.
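
To make the intended result concrete, here is a minimal standalone
sketch (not the patch code, and not using the real TupleTableSlot API).
SketchColumn and sketch_ri_to_json() are hypothetical names, columns
are assumed to be plain integers, and column names are assumed not to
need JSON escaping. It only emits the columns flagged as replica
identity, so the update case from the report would produce {"i":20}
rather than {"i":20,"j":200}:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flattened column: name, integer value, RI membership. */
typedef struct SketchColumn
{
	const char *name;
	int			value;
	bool		is_replica_identity;
} SketchColumn;

/*
 * Write a JSON object containing only the replica identity columns into
 * buf.  Returns the number of columns emitted, or -1 if buf is too small.
 * Names are written as-is (no escaping) to keep the sketch short.
 */
static int
sketch_ri_to_json(const SketchColumn *cols, int ncols,
				  char *buf, size_t buflen)
{
	size_t		used;
	int			emitted = 0;
	int			n;

	n = snprintf(buf, buflen, "{");
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	used = (size_t) n;

	for (int i = 0; i < ncols; i++)
	{
		/* Skip columns that are not part of the replica identity. */
		if (!cols[i].is_replica_identity)
			continue;

		n = snprintf(buf + used, buflen - used, "%s\"%s\":%d",
					 emitted > 0 ? "," : "", cols[i].name, cols[i].value);
		if (n < 0 || (size_t) n >= buflen - used)
			return -1;
		used += (size_t) n;
		emitted++;
	}

	n = snprintf(buf + used, buflen - used, "}");
	if (n < 0 || (size_t) n >= buflen - used)
		return -1;

	return emitted;
}
```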

> 3)
> Do we need to have a timestamp column as well to say when conflict was
> recorded? Or local_commit_ts, remote_commit_ts are sufficient?
> Thoughts

You mean we can record the current timestamp while inserting? I am not
sure if it will add more meaningful information than remote_commit_ts,
but let's see what others think.

> 4)
> Also, it makes sense if we have 'conflict_type' next to 'relid'. I
> feel relid and conflict_type are primary columns and rest are related
> details.

Sure

> 5)
> Do we need table_schema, table_name when we have relid already? If we
> want to retain these, we can name them as schemaname and relname to be
> consistent with all other stats tables. IMO, then the order can be:
> relid, schemaname, relname, conflcit_type and then the rest of the
> details.

Yeah, this makes the table denormalized, as we can fetch this
information by joining with pg_class, but I think it might be better
for readability. Let's see what others think; for now I will reorder as
suggested.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-19T11:19:41Z

On Wed, Nov 19, 2025 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Thanks for the patch.  Some feedback about the clt:
> >
> > 1)
> > local_origin is always NULL in my tests for all conflict types I tried.
>
> You need to set the replication origin as shown below
> On subscriber side:
> ---------------------------
> SELECT pg_replication_origin_create('my_remote_source_2');
> SELECT pg_replication_origin_session_setup('my_remote_source_2');
> UPDATE test SET b=200 where a=1;
>
> On remote:
> ---------------
> UPDATE test SET b=300 where a=1; -- conflicting operation with local node
>
> On subscriber
> ------------------
> postgres[1514377]=# select local_origin, remote_origin from
> myschema.conflict_log_history2 ;
>     local_origin    | remote_origin
> --------------------+---------------------
>  my_remote_source_2 | pg_16396

Okay, I see, thanks!

>
> > 2)
> > Do we need 'key_tuple' as such or replica_identity is enough/better?
> > I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing
> > case where query was 'delete from tab1 where i=10'; here 'i' is PK;
> > which seems okay.
> > But it is '{"i":20,"j":200}' for update_origin_differ case where query
> > was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I
> > feel 'j' should not be part of the key but let me know if I have
> > misunderstood. IMO, 'j' being part of remote_tuple should be good
> > enough.
>
> Yeah we should display the replica identity only, I assumed in
> ReportApplyConflict() the searchslot should only have RI tuple but it
> is sending a remote tuple in the searchslot, so might need to extract
> the RI from this slot, I will work on this.

Yeah, we have extracted it already in
errdetail_apply_conflict()->build_tuple_value_details(). See how it
dumps it in the log:

LOG:  conflict detected on relation "public.tab1":
conflict=update_origin_differs
DETAIL:  Updating the row that was modified locally in transaction 768
at 2025-11-18 12:09:19.658502+05:30.
        Existing local row (20, 100); remote row (20, 200); replica
identity (i)=(20).

We somehow need to reuse it.

>
> > 3)
> > Do we need to have a timestamp column as well to say when conflict was
> > recorded? Or local_commit_ts, remote_commit_ts are sufficient?
> > Thoughts
>
> You mean we can record the timestamp now while inserting, not sure if
> it will add some more meaningful information than remote_commit_ts,
> but let's see what others think.
>

On rethinking, we can skip it. The commit-ts of both sides are enough.

> > 4)
> > Also, it makes sense if we have 'conflict_type' next to 'relid'. I
> > feel relid and conflict_type are primary columns and rest are related
> > details.
>
> Sure
>
> > 5)
> > Do we need table_schema, table_name when we have relid already? If we
> > want to retain these, we can name them as schemaname and relname to be
> > consistent with all other stats tables. IMO, then the order can be:
> > relid, schemaname, relname, conflict_type and then the rest of the
> > details.
>
> Yeah this makes the table denormalized as we can fetch this
> information by joining with pg_class, but I think it might be better
> for readability, lets see what others think, for now I will reorder as
> suggested.
>

Okay, works for me if we want to keep these. I see that most of the
other statistics tables (pg_stat_all_indexes, pg_statio_all_tables,
pg_statio_all_sequences etc)  that maintain a relid also retain the
names.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-20T12:08:21Z

On Wed, Nov 19, 2025 at 7:01 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Dilip.
>
> I started to look at this thread. Here are some comments for patch v4-0001.

Thanks Peter for your review, I have worked on most of the comments for 0001.
>
> =====
> GENERAL
>
> 1.
> There's some inconsistency in how this new table is called at different times :
> a) "conflict table"
> b) "conflict log table"
> c) "conflict log history table"
> d) "conflict history"
>
> My preference was (b). Making this consistent will have impacts on
> many macros, variables, comments, function names, etc.

Yeah, my preference is also (b), so I have used it everywhere.

> ~~~
>
> 2.
> What about enhancements to description \dRs+ so the subscription
> conflict log table is displayed?

Done, I have displayed the conflict log table name. I am not sure whether
we should display the complete schema-qualified name; if so, we might need
to join with pg_namespace.

> ~~~
>
> 3.
> What about enhancements to the tab-complete code?

Done

> ======
> src/backend/commands/subscriptioncmds.c
>
> 4.
>  #define SUBOPT_MAX_RETENTION_DURATION 0x00008000
>  #define SUBOPT_LSN 0x00010000
>  #define SUBOPT_ORIGIN 0x00020000
> +#define SUBOPT_CONFLICT_TABLE 0x00030000
>
> Bug? Shouldn't that be 0x00040000.

Yeah, fixed.
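
For illustration, a minimal standalone sketch of why 0x00030000 cannot work as an option flag: it equals SUBOPT_LSN | SUBOPT_ORIGIN, so a bitmap test for it also matches either existing option. The two existing values are copied from the quoted hunk; the _BAD/_FIXED macro names and both helper functions are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted hunk in subscriptioncmds.c. */
#define SUBOPT_LSN                  0x00010000
#define SUBOPT_ORIGIN               0x00020000
/* Hypothetical names for the value in v4 and the value after the fix. */
#define SUBOPT_CONFLICT_TABLE_BAD   0x00030000
#define SUBOPT_CONFLICT_TABLE_FIXED 0x00040000

/* Hypothetical helper: true if exactly one bit is set in 'flag'. */
static bool
is_single_bit(uint32_t flag)
{
    return flag != 0 && (flag & (flag - 1)) == 0;
}

/* Hypothetical helper: true if 'opt' is present in the 'specified' bitmap. */
static bool
is_set(uint32_t specified, uint32_t opt)
{
    return (specified & opt) != 0;
}
```

With the v4 value, a command that specifies only "origin" would also appear to specify the conflict table option; with 0x00040000 the bits no longer overlap.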

> ~~~
>
> 5.
> + char    *conflicttable;
>   XLogRecPtr lsn;
>  } SubOpts;
>
> IMO 'conflicttable' looks too much like 'conflictable', which may
> cause some confusion on first reading.

Changed to conflictlogtable

> ~~~
>
> 6.
> +static void CreateConflictLogTable(Oid namespaceId, char *conflictrel);
> +static void DropConflictLogTable(Oid namespaceId, char *conflictrel);
>
> AFAIK it is more conventional for the static functions to be
> snake_case and the extern functions to use CamelCase. So these would
> be:
> - create_conflict_log_table
> - drop_conflict_log_table

Done

> ~~~
>
> CreateSubscription:
>
> 7.
> + /* If conflict log table name is given than create the table. */
> + if (opts.conflicttable)
> + CreateConflictLogTable(conflict_table_nspid, conflict_table);
> +
>
> typo: /If conflict/If a conflict/
>
> typo: "than"

Fixed

> ~~~
>
> AlterSubscription:
>
> 8.
> -   SUBOPT_ORIGIN);
> +   SUBOPT_ORIGIN |
> +   SUBOPT_CONFLICT_TABLE);
>
> The line wrapping doesn't seem necessary.

Without wrapping it crosses 80 characters per line limit.

> ~~~
>
> 9.
> + replaces[Anum_pg_subscription_subconflictnspid - 1] = true;
> + replaces[Anum_pg_subscription_subconflicttable - 1] = true;
> +
> + CreateConflictLogTable(nspid, relname);
> + }
> +
>
> What are the rules regarding replacing one log table with a different
> log table for the same subscription? I didn't see anything about this
> scenario, nor any test cases.

Added a test and updated the code as well. If a different log table is
set, we drop the old table and create the new one; if the same table is
set again, only a NOTICE is issued and the table is not created again.
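
The rule described here can be sketched as a small decision function. This is only a standalone model under stated assumptions: the enum, the function name, and the "no table set before" case are hypothetical and not taken from the patch, and the sketch leaves out the check that the existing table is still present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of ALTER SUBSCRIPTION ... SET (conflict_log_table = ...). */
typedef enum
{
    CLT_CREATE_ONLY,        /* assumed case: nothing set before, just create */
    CLT_NOTICE_ONLY,        /* same table already set: NOTICE, no creation */
    CLT_DROP_AND_CREATE     /* different table: drop the old, create the new */
} CltAction;

/* Hypothetical helper; old_name is NULL when no conflict log table is set. */
static CltAction
choose_clt_action(unsigned old_nspid, const char *old_name,
                  unsigned new_nspid, const char *new_name)
{
    if (old_name == NULL)
        return CLT_CREATE_ONLY;

    /* A table is "the same" only if both namespace and name match. */
    if (old_nspid == new_nspid && strcmp(old_name, new_name) == 0)
        return CLT_NOTICE_ONLY;

    return CLT_DROP_AND_CREATE;
}
```

Comparing the namespace OID together with the name matters because the same relation name can exist in two schemas.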

> ~~~
>
> CreateConflictLogTable:
>
> 10.
> + /*
> + * Check if table with same name already present, if so report an error
> + * as currently we do not support user created table as conflict log
> + * table.
> + */
>
> Is the comment about "user-created table" strictly correct? e.g. Won't
> you encounter the same problem if there are 2 subscriptions trying to
> set the same-named conflict log table?
>
> SUGGESTION
> Report an error if the specified conflict log table already exists.

Done

> ~~~
>
> DropConflictLogTable:
>
> 11.
> + /*
> + * Drop conflict log table if exist, use if exists ensures the command
> + * won't error if the table is already gone.
> + */
>
> The reason for EXISTS was already mentioned in the function comment.
>
> SUGGESTION
> Drop the conflict log table if it exists.

Done

> ======
> src/backend/replication/logical/conflict.c
>
> 12.
> +static Datum TupleTableSlotToJsonDatum(TupleTableSlot *slot);
> +
> +static void InsertConflictLog(Relation rel,
> +   TransactionId local_xid,
> +   TimestampTz local_ts,
> +   ConflictType conflict_type,
> +   RepOriginId origin_id,
> +   TupleTableSlot *searchslot,
> +   TupleTableSlot *localslot,
> +   TupleTableSlot *remoteslot);
>
> Same as earlier comment #6 -- isn't it conventional to use snake_case
> for the static function names?

Done

> ~~~
>
> TupleTableSlotToJsonDatum:
>
> 13.
> + * This would be a new internal helper function for logical replication
> + * Needs to handle various data types and potentially TOASTed data
>
> What's this comment about? Something doesn't look quite right.

Hmm, that's bad, fixed.

> ~~~
>
> InsertConflictLog:
>
> 14.
> + /* TODO: proper error code */
> + relid = get_relname_relid(relname, nspid);
> + if (!OidIsValid(relid))
> + elog(ERROR, "conflict log history table does not exists");
> + conflictrel = table_open(relid, RowExclusiveLock);
> + if (conflictrel == NULL)
> + elog(ERROR, "could not open conflict log history table");
>
> 14a.
> What's the TODO comment for? Are you going to replace these elogs?

replaced with ereport
> ~
>
> 14b.
> Typo: "does not exists"

fixed

> ~
>
> 14c.
> An unnecessary double-blank line follows this code fragment.

fixed

> ~~~
>
> 15.
> + /* Populate the values and nulls arrays */
> + attno = 0;
> + values[attno] = ObjectIdGetDatum(RelationGetRelid(rel));
> + attno++;
> +
> + if (TransactionIdIsValid(local_xid))
> + values[attno] = TransactionIdGetDatum(local_xid);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (TransactionIdIsValid(remote_xid))
> + values[attno] = TransactionIdGetDatum(remote_xid);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + values[attno] = LSNGetDatum(remote_final_lsn);
> + attno++;
> +
> + if (local_ts > 0)
> + values[attno] = TimestampTzGetDatum(local_ts);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (remote_commit_ts > 0)
> + values[attno] = TimestampTzGetDatum(remote_commit_ts);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + values[attno] =
> + CStringGetTextDatum(get_namespace_name(RelationGetNamespace(rel)));
> + attno++;
> +
> + values[attno] = CStringGetTextDatum(RelationGetRelationName(rel));
> + attno++;
> +
> + values[attno] = CStringGetTextDatum(ConflictTypeNames[conflict_type]);
> + attno++;
> +
> + if (origin_id != InvalidRepOriginId)
> + replorigin_by_oid(origin_id, true, &origin);
> +
> + if (origin != NULL)
> + values[attno] = CStringGetTextDatum(origin);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (replorigin_session_origin != InvalidRepOriginId)
> + replorigin_by_oid(replorigin_session_origin, true, &remote_origin);
> +
> + if (remote_origin != NULL)
> + values[attno] = CStringGetTextDatum(remote_origin);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (searchslot != NULL)
> + values[attno] = TupleTableSlotToJsonDatum(searchslot);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (localslot != NULL)
> + values[attno] = TupleTableSlotToJsonDatum(localslot);
> + else
> + nulls[attno] = true;
> + attno++;
> +
> + if (remoteslot != NULL)
> + values[attno] = TupleTableSlotToJsonDatum(remoteslot);
> + else
> + nulls[attno] = true;
> +
>
> 15a.
> It might be simpler to just post-increment that 'attno' in all the
> assignments and save a dozen lines of code:
> e.g. values[attno++] = ...

Yeah done that

> ~
>
> 15b.
> Also, put a sanity Assert check at the end, like:
> Assert(attno + 1 == MAX_CONFLICT_ATTR_NUM);

Done
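
As a rough standalone sketch of the post-increment style (not the patch code): the column count, the function name, and the plain long values are hypothetical stand-ins for MAX_CONFLICT_ATTR_NUM, the real function, and Datum. One detail worth noting is that when every assignment, including the last, post-increments attno, the sanity check becomes attno == N; the attno + 1 == N form fits only when the last assignment leaves attno untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_ATTRS 3     /* hypothetical; the patch uses MAX_CONFLICT_ATTR_NUM */

/*
 * Fill values[]/nulls[] for a three-column row. An xid <= 0 is stored as
 * NULL. Returns the number of columns filled.
 */
static int
fill_row(long relid, long local_xid, long remote_xid,
         long values[NUM_ATTRS], bool nulls[NUM_ATTRS])
{
    int attno = 0;

    memset(values, 0, sizeof(long) * NUM_ATTRS);
    memset(nulls, 0, sizeof(bool) * NUM_ATTRS);

    values[attno++] = relid;

    if (local_xid > 0)
        values[attno++] = local_xid;
    else
        nulls[attno++] = true;

    if (remote_xid > 0)
        values[attno++] = remote_xid;
    else
        nulls[attno++] = true;

    /* Every column post-increments, so attno now equals the column count. */
    assert(attno == NUM_ATTRS);
    return attno;
}
```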
>
> ======
> src/backend/utils/cache/lsyscache.c
>
> 16.
> + if (isnull)
> + {
> + ReleaseSysCache(tup);
> + return NULL;
> + }
> +
> + *nspid = subform->subconflictnspid;
> + relname = pstrdup(TextDatumGetCString(datum));
> +
> + ReleaseSysCache(tup);
> +
> + return relname;
>
> It would be tidier to have a single release/return by coding this
> slightly differently.
>
> SUGGESTION:
>
> char *relname = NULL;
> ...
> if (!isnull)
> {
>   *nspid = subform->subconflictnspid;
>   relname = pstrdup(TextDatumGetCString(datum));
> }
>
> ReleaseSysCache(tup);
> return relname;

Right, changed it.
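
The single release/return shape can be modelled outside the backend as below. This is a sketch only: CacheEntry, release_entry() and get_log_table() are hypothetical stand-ins for the syscache tuple, ReleaseSysCache() and the real lookup function, and malloc() stands in for pstrdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a syscache entry with a reference count. */
typedef struct
{
    int         refcount;
    unsigned    nspid;
    const char *relname;    /* NULL models an SQL NULL attribute */
} CacheEntry;

static void
release_entry(CacheEntry *e)
{
    e->refcount--;
}

/* Returns a malloc'd copy of the name, or NULL; releases exactly once. */
static char *
get_log_table(CacheEntry *e, unsigned *nspid)
{
    char *relname = NULL;

    e->refcount++;          /* models the cache lookup pinning the tuple */

    if (e->relname != NULL)
    {
        size_t len = strlen(e->relname) + 1;

        *nspid = e->nspid;
        relname = malloc(len);
        if (relname != NULL)
            memcpy(relname, e->relname, len);
    }

    release_entry(e);       /* the only release, reached on every path */
    return relname;
}
```

Because there is one exit, adding another early-out later cannot accidentally skip the release.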

> ======
> src/include/catalog/pg_subscription.h
>
> 17.
> + Oid subconflictnspid; /* Namespace Oid in which the conflict history
> + * table is created. */
>
> Would it be better to make these 2 new member names more alike, since
> they go together. e.g.
> confl_table_nspid
> confl_table_name

In pg_subscription.h all fields follow the same convention without "_", so
I have changed them to

subconflictlognspid
subconflictlogtable


> ======
> src/include/replication/conflict.h
>
> 18.
> +#define MAX_CONFLICT_ATTR_NUM 15
>
> I felt this doesn't really belong here. Just define it atop/within the
> function InsertConflictLog()

Done
> ~~~
>
> 19.
>  extern void InitConflictIndexes(ResultRelInfo *relInfo);
> +
>  #endif
>
> Spurious whitespace change not needed for this patch.

Fixed

> ======
> src/test/regress/sql/subscription.sql
>
> 20.
> How about adding some more test scenarios:
> e.g.1. ALTER the conflict log table of some subscription that already has one
> e.g.2. Have multiple subscriptions that specify the same conflict log table

Added

Pending:
1) fixed review comments of 0002 and 0003
2) Need to add replica identity tuple instead of full tuple - reported by Shveta
3) Keeping the logs in case of outer transaction failure by moving log
insertion outside the main transaction - reported by Shveta

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-21T02:05:43Z

Thanks for addressing all my previous review comments of v4.

Here are some more comments for the latest patch v5-0001.

======
GENERAL

1.
There are still a couple of places remaining where this new table is
not consistently called a "Conflict Log Table" (e.g. search for
"history")

e.g. Subject: [PATCH v5] Add configurable conflict log history table
for Logical Replication
e.g. + /* Insert conflict details to log history table. */
e.g. +-- CONFLICT LOG HISTORY TABLE TESTS

~~~

2.
Is automatically dropping the log tables always what the user might
want to happen? Maybe someone wants them lying around afterwards for
later analysis -- I don't really know the answer; just wondering if
this is (a) good to be tidy or (b) bad to remove user flexibility. Or
maybe the answer is to leave it but make sure to add more documentation
to say "if you are going to want to do some post analysis then be sure
to copy this table data before it gets automatically dropped".

======
Commit message.

3.
User-Defined Table: The conflict log is stored in a user-managed table
rather than a system catalog.

~

I felt "User-defined" makes it sound like the user does CREATE TABLE
themselves and has some control over the schema. Maybe say
"User-Managed Table:" instead?

======
src/backend/commands/subscriptioncmds.c

4.
 #define SUBOPT_LSN 0x00010000
 #define SUBOPT_ORIGIN 0x00020000
+#define SUBOPT_CONFLICT_LOG_TABLE 0x00040000

Whitespace alignment.

~~~

AlterSubscription:

5.
+ values[Anum_pg_subscription_subconflictlognspid - 1] =
+ ObjectIdGetDatum(nspid);
+ values[Anum_pg_subscription_subconflictlogtable - 1] =
+ CStringGetTextDatum(relname);
+
+ replaces[Anum_pg_subscription_subconflictlognspid - 1] = true;
+ replaces[Anum_pg_subscription_subconflictlogtable - 1] = true;

Something feels back-to-front, because if the same clt is being
re-used (like the NOTICE part that follows) then why do you need to
reassign and say replaces[] = true here?

~~~

6.
+ /*
+ * If the subscription already has the conflict log table
+ * set to the exact same name and namespace currently being
+ * specified, and that table exists, just give notice and
+ * skip creation.
+ */

Is there a simpler way to say the same thing?

SUGGESTION
If the subscription already uses this conflict log table and it
exists, just issue a notice.

~~~

7.
+ ereport(NOTICE,
+ (errmsg("skipping table creation because \"%s.%s\" is already set as
conflict log table",
+ nspname, relname)));

I wasn't sure you need to say "skipping table creation because"... it
seems kind of internal details. How about just:

\"%s.%s\" is already in use as the conflict log table for this subscription

~~~

8.
+ /*
+ * Drop the existing conflict log table if we are
+ * setting a new table.
+ */

The comment didn't feel right by implying there is something to drop.

SUGGESTION
Create the conflict log table after dropping any pre-existing one.

~~~

drop_conflict_log_table:

9.
+ /* Drop the conflict log table if it exist. */

typo: /exist./exists./

======
src/backend/replication/logical/conflict.c

10.
+static Datum
+tuple_table_slot_to_json_datum(TupleTableSlot *slot)
+{
+ HeapTuple tuple = ExecCopySlotHeapTuple(slot);
+ Datum datum = heap_copy_tuple_as_datum(tuple, slot->tts_tupleDescriptor);
+ Datum json;
+
+ if (TupIsNull(slot))
+ return 0;
+
+ json = DirectFunctionCall1(row_to_json, datum);
+ heap_freetuple(tuple);
+
+ return json;
+}

Bug? Shouldn't that TupIsNull(slot) check *precede* using that slot
for the tuple/datum assignments?
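
A minimal standalone sketch of the ordering being asked for, with the emptiness test placed before any use of the slot. The Slot struct and both functions are hypothetical stand-ins, not backend code; slot_is_null() mirrors the shape of TupIsNull(), which treats a NULL pointer or an empty slot as null. Returning a boolean instead of a 0 datum also lets the caller decide when to set nulls[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a tuple slot; 'empty' models an empty slot. */
typedef struct
{
    bool empty;
    int  a;
    int  b;
} Slot;

/* Same shape as TupIsNull(): a NULL pointer or an empty slot. */
static bool
slot_is_null(const Slot *slot)
{
    return slot == NULL || slot->empty;
}

/*
 * Write the slot as JSON into buf. Returns false, without reading any
 * slot field, when there is nothing to convert.
 */
static bool
slot_to_json(const Slot *slot, char *buf, size_t buflen)
{
    if (slot_is_null(slot))
        return false;       /* checked before the slot is dereferenced */

    snprintf(buf, buflen, "{\"a\":%d,\"b\":%d}", slot->a, slot->b);
    return true;
}
```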

~~~

insert_conflict_log:

11.
+ Datum values[MAX_CONFLICT_ATTR_NUM];
+ bool nulls[MAX_CONFLICT_ATTR_NUM];
+ Oid nspid;
+ Oid relid;
+ Relation conflictrel = NULL;
+ int attno;
+ int options = HEAP_INSERT_NO_LOGICAL;
+ char    *relname;
+ char    *origin = NULL;
+ char    *remote_origin = NULL;
+ HeapTuple tup;

I felt some of these var names can be confusing:

11A.
e.g. "conflictlogrel" (instead of 'conflictrel') would emphasise this
is the rel of the log file, not the rel that encountered a conflict.

~

11B.
Similarly, maybe 'relname' could be 'conflictlogtable', which is also
what it was called elsewhere.

~

11C.
AFAICT, the 'relid' is really the relid of the conflict log. So, maybe
name it 'confliglogreid', otherwise it seems confusing when
there is already a parameter called 'rel' that is unrelated to this
'relid'.

~~~

12.
+ if (searchslot != NULL)
+ values[attno++] = tuple_table_slot_to_json_datum(searchslot);
+ else
+ nulls[attno++] = true;
+
+ if (localslot != NULL)
+ values[attno++] = tuple_table_slot_to_json_datum(localslot);
+ else
+ nulls[attno++] = true;
+
+ if (remoteslot != NULL)
+ values[attno++] = tuple_table_slot_to_json_datum(remoteslot);
+ else
+ nulls[attno++] = true;

That function tuple_table_slot_to_json_datum() has potential to return
0. Is that something that needs checking, so you can assign nulls[] =
true?

======
src/backend/replication/logical/worker.c

13.
+char *
+get_subscription_conflict_log_table(Oid subid, Oid *nspid)
+{
+ HeapTuple tup;
+ Datum datum;
+ bool isnull;
+ char    *relname = NULL;
+ Form_pg_subscription subform;
+
+ tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+
+ if (!HeapTupleIsValid(tup))
+ return NULL;
+
+ subform = (Form_pg_subscription) GETSTRUCT(tup);
+
+ /* Get conflict log table name. */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subconflictlogtable,
+ &isnull);
+ if (!isnull)
+ {
+ *nspid = subform->subconflictlognspid;
+ relname = pstrdup(TextDatumGetCString(datum));
+ }
+
+ ReleaseSysCache(tup);
+ return relname;
+}

You could consider assigning *nspid = InvalidOid when 'isnull' is
true, so then you don't have to rely on the caller pre-assigning a
default sane value. YMMV.

======
src/bin/psql/tab-complete.in.c

14.
- COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
+ COMPLETE_WITH("binary", "connect", "conflict_log_table",
"copy_data", "create_slot",

'conflict_log_table' comes before 'connect' alphabetically.

======
src/test/regress/sql/subscription.sql

15.
+-- ok - change the conlfict log table name for existing subscription
already had old table
+ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table =
'public.regress_conflict_log3');
+SELECT subname, subconflictlogtable, subconflictlognspid = (SELECT
oid FROM pg_namespace WHERE nspname = 'public') AS is_public_schema
+FROM pg_subscription WHERE subname = 'regress_conflict_test2';
+

typos in comment.
- /conlfict/conflict/
- /for existing subscription already had old table/for an existing
subscription that already had one/

~~~

16.
+-- check new table should be created and old should be dropped

SUGGESTION
check the new table was created and the old table was dropped

~~~

17.
+-- ok (NOTICE) - try to set the conflict log table which is used by
same subscription
+ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table =
'public.regress_conflict_log3');
+
+-- fail - try to use the conflict log table being used by some other
subscription
+ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table =
'public.regress_conflict_log1');

Make those 2 comments more alike:

SUGGESTIONS
-- ok (NOTICE) - set conflict_log_table to one already used by this subscription
...
-- fail - set conflict_log_table to one already used by a different subscription

~~~

18.
Missing tests for describe \dRs+.

e.g. there are already dozens of \dRs+ examples where there is no clt
assigned, but I did not see any tests where the clt *is* assigned.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-24T06:21:40Z

On Thu, Nov 20, 2025 at 5:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
I was working on these pending items and got stuck on one of them. I am
exploring it further, but would like to share the problem.

> 2) Need to add replica identity tuple instead of full tuple - reported by Shveta
I have fixed this along with the other comments by Peter. Now only the RI
tuple is inserted as the key_tuple. IMHO let's keep the name key_tuple, as
it will use the primary key or unique key if no explicit replica identity
is set. Thoughts?

postgres[3048044]=# select * from myschema.conflict_log_history2;
relid             | 16385
schemaname        | public
relname           | test
conflict_type     | update_origin_differs
local_xid         | 765
remote_xid        | 759
remote_commit_lsn | 0/0174F2E8
local_commit_ts   | 2025-11-24 06:16:50.468263+00
remote_commit_ts  | 2025-11-24 06:16:55.483507+00
local_origin      |
remote_origin     | pg_16396
key_tuple         | {"a":1}
local_tuple       | {"a":1,"b":10}
remote_tuple      | {"a":1,"b":20}

Now pending work status
1) fixed review comments of 0002 and 0003 - Pending
2) Need to add replica identity tuple instead of full tuple -- Done
3) Keeping the logs in case of outer transaction failure by moving log
insertion outside the main transaction - reported by Shveta - Pending
4) Run pgindent -- planning to do it after we complete the first level
of review - Pending
5) Subscription test cases for logging the actual conflicts - Pending



--
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-25T03:33:18Z

Hi Dilip.

Here are a couple of review comments for v6-0001.

======
GENERAL.

1.
Firstly, here is one of my "what if" ideas...

The current patch is described as making a "structured, queryable
record of all logical replication conflicts".

What if we go bigger than that? What if this were made a more generic
"structured, queryable record of logical replication activity"?

AFAIK, there don't have to be too many logic changes to achieve this.
e.g. I'm imagining it is mostly:

* Rename the subscription parameter "conflict_log_table" to
"log_table" or similar.
* Remove/modify the "conflict_" name part from many of the variables
and function names.
* Add another 'type' column to the log table -- e.g. everything this
patch writes can be type="CONFL", or type='c', or whatever.
* Maybe tweak/add some of the other columns for more generic future use

Anyway, it might be worth considering this now, before everything
becomes set in stone with a conflict-only focus, making it too
difficult to add more potential/unknown log table enhancements later.

Thoughts?

======
src/backend/replication/logical/conflict.c

2.
+#include "funcapi.h"
+#include "funcapi.h"

double include of the same header.

~~~

3.
+static Datum tuple_table_slot_to_ri_json_datum(EState *estate,
+    Relation localrel,
+    Oid replica_index,
+    TupleTableSlot *slot);
+
+static void insert_conflict_log(EState *estate, Relation rel,
+ TransactionId local_xid,
+ TimestampTz local_ts,
+ ConflictType conflict_type,
+ RepOriginId origin_id,
+ TupleTableSlot *searchslot,
+ TupleTableSlot *localslot,
+ TupleTableSlot *remoteslot);

There were no spaces between any of the other static declarations, so
why is this one different?

~~~

insert_conflict_log:

4.
+#define MAX_CONFLICT_ATTR_NUM 15
+ Datum values[MAX_CONFLICT_ATTR_NUM];
+ bool nulls[MAX_CONFLICT_ATTR_NUM];
+ Oid nspid;
+ Oid confliglogreid;
+ Relation conflictlogrel = NULL;
+ int attno;
+ int options = HEAP_INSERT_NO_LOGICAL;
+ char    *conflictlogtable;
+ char    *origin = NULL;
+ char    *remote_origin = NULL;
+ HeapTuple tup;

Typo: Oops. Looks like that typo originated from my previous review
comment, and you took it as-is.

/confliglogreid/confliglogrelid/

~~~

5.
+ if (searchslot != NULL && !TupIsNull(searchslot))
  {
- tableslot = table_slot_create(localrel, &estate->es_tupleTable);
- tableslot = ExecCopySlot(tableslot, slot);
+ Oid replica_index = GetRelationIdentityOrPK(rel);
+
+ /*
+ * If the table has a valid replica identity index, build the index
+ * json datum from key value. Otherwise, construct it from the complete
+ * tuple in REPLICA IDENTITY FULL cases.
+ */
+ if (OidIsValid(replica_index))
+ values[attno++] = tuple_table_slot_to_ri_json_datum(estate, rel,
+ replica_index,
+ searchslot);
+ else
+ values[attno++] = tuple_table_slot_to_json_datum(searchslot);
  }
+ else
+ nulls[attno++] = true;

- /*
- * Initialize ecxt_scantuple for potential use in FormIndexDatum when
- * index expressions are present.
- */
- GetPerTupleExprContext(estate)->ecxt_scantuple = tableslot;
+ if (localslot != NULL && !TupIsNull(localslot))
+ values[attno++] = tuple_table_slot_to_json_datum(localslot);
+ else
+ nulls[attno++] = true;

- /*
- * The values/nulls arrays passed to BuildIndexValueDescription should be
- * the results of FormIndexDatum, which are the "raw" input to the index
- * AM.
- */
- FormIndexDatum(BuildIndexInfo(indexDesc), tableslot, estate, values, isnull);
+ if (remoteslot != NULL && !TupIsNull(remoteslot))
+ values[attno++] = tuple_table_slot_to_json_datum(remoteslot);
+ else
+ nulls[attno++] = true;

AFAIK, the TupIsNull() already includes the NULL check anyway, so you
don't need to double up those. I saw at least 3 conditions above where
the code could be simpler. e.g.

BEFORE
+ if (remoteslot != NULL && !TupIsNull(remoteslot))

SUGGESTION
if (!TupIsNull(remoteslot))

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T08:29:08Z

On Tue, Nov 25, 2025 at 9:03 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Dilip.
>
> Here are a couple of review comments for v6-0001.
>
> ======
> GENERAL.
>
> 1.
> Firstly, here is one of my "what if" ideas...
>
> The current patch is described as making a "structured, queryable
> record of all logical replication conflicts".
>
> What if we go bigger than that? What if this were made a more generic
> "structured, queryable record of logical replication activity"?
>
> AFAIK, there don't have to be too many logic changes to achieve this.
> e.g. I'm imagining it is mostly:
>
> * Rename the subscription parameter "conflict_log_table" to
> "log_table" or similar.
> * Remove/modify the "conflict_" name part from many of the variables
> and function names.
> * Add another 'type' column to the log table -- e.g. everything this
> patch writes can be type="CONFL", or type='c', or whatever.
> * Maybe tweak/add some of the other columns for more generic future use
>
> Anyway, it might be worth considering this now, before everything
> becomes set in stone with a conflict-only focus, making it too
> difficult to add more potential/unknown log table enhancements later.
>
> Thoughts?

Yeah, that's an interesting thought for sure, but honestly I believe
that keeping the conflict log table only for conflict and conflict
resolution related data is the standard followed across databases that
provide an active-active setup, e.g. Oracle GoldenGate, BDR, pgactive.
So IMHO, to keep the feature clean and focused, we should follow the
same.

I will work on other review comments and post the patch soon.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T10:36:22Z

On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

On a separate note, I've been considering how to manage conflict log
insertions when an error causes the outer transaction to abort, which
seems to be non-trivial.

Here is what I have in mind:
======================
First, prepare_conflict_log() would be executed from
ReportApplyConflict(). This function would handle all preliminary
work, such as preparing the tuple for the conflict log table. Second,
insert_conflict_log() would be executed. If the error level in
ReportApplyConflict() is LOG, the insertion would occur directly.
Otherwise, the log information would be stored in a global variable
and inserted in a separate transaction once we exit start_apply() due
to the error.

@shveta malik @Amit Kapila let me know what you think?  Or do you
think it can be simplified?
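
The flow described above can be modelled in standalone C, using setjmp/longjmp in place of the backend's error handling. This is a sketch under stated assumptions, not the patch: every name below is hypothetical, the "row" is just a string, and whether the deferred insert runs in a catch block or after start_apply() returns is left open here; the model simply runs it in the branch reached after the error.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

enum { LEVEL_LOG, LEVEL_ERROR };

/* Hypothetical state; none of these names come from the patch. */
static jmp_buf     apply_error_ctx;
static const char *pending_conflict = NULL;   /* prepared, not yet inserted */
static int         rows_inserted = 0;
static bool        in_aborted_txn = false;

static void
insert_conflict_row(const char *row)
{
    (void) row;
    /* An insert made inside the aborted transaction would be rolled back. */
    assert(!in_aborted_txn);
    rows_inserted++;
}

/* Models the report step: insert now for LOG, defer for ERROR. */
static void
report_conflict(int elevel, const char *row)
{
    if (elevel == LEVEL_LOG)
    {
        insert_conflict_row(row);
        return;
    }

    pending_conflict = row;         /* stash in a global for later */
    in_aborted_txn = true;
    longjmp(apply_error_ctx, 1);    /* models raising the error */
}

/* Models the apply loop wrapped in a try/catch-like construct. */
static void
apply_with_conflict(int elevel, const char *row)
{
    if (setjmp(apply_error_ctx) == 0)
        report_conflict(elevel, row);
    else
    {
        in_aborted_txn = false;     /* models abort plus a new transaction */
        if (pending_conflict != NULL)
        {
            insert_conflict_row(pending_conflict);
            pending_conflict = NULL;
        }
    }
}
```

Calling apply_with_conflict(LEVEL_LOG, ...) inserts immediately, while apply_with_conflict(LEVEL_ERROR, ...) inserts only after the error has unwound, which is the property the two-step design is meant to provide.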


--
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T14:34:50Z

On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> On a separate note, I've been considering how to manage conflict log
> insertions when an error causes the outer transaction to abort, which
> seems to be a non-trivial.
>
> Here is what I have in mind:
> ======================
> First, prepare_conflict_log() would be executed from
> ReportApplyConflict(). This function would handle all preliminary
> work, such as preparing the tuple for the conflict log table. Second,
> insert_conflict_log() would be executed. If the error level in
> ReportApplyConflict() is LOG, the insertion would occur directly.
> Otherwise, the log information would be stored in a global variable
> and inserted in a separate transaction once we exit start_apply() due
> to the error.
>
> @shveta malik @Amit Kapila let me know what you think?  Or do you
> think it can be simplified?

While digging more into this I am wondering why
CT_MULTIPLE_UNIQUE_CONFLICTS is reported as an error and all other
conflicts as LOG?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-26T08:35:48Z

On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> On a separate note, I've been considering how to manage conflict log
> insertions when an error causes the outer transaction to abort, which
> seems to be a non-trivial.
>
> Here is what I have in mind:
> ======================
> First, prepare_conflict_log() would be executed from
> ReportApplyConflict(). This function would handle all preliminary
> work, such as preparing the tuple for the conflict log table. Second,
> insert_conflict_log() would be executed. If the error level in
> ReportApplyConflict() is LOG, the insertion would occur directly.
> Otherwise, the log information would be stored in a global variable
> and inserted in a separate transaction once we exit start_apply() due
> to the error.
>
> @shveta malik @Amit Kapila let me know what you think?  Or do you
> think it can be simplified?
>

I could not think of a better way. This idea works for me. I had
doubts about whether it is okay to start a new transaction in a
catch-block (if we plan to do it in start_apply's), but then I found a
few other functions doing it (see do_autovacuum, perform_work_item,
_SPI_commit). So IMO, we should be good.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-26T10:45:27Z

On Wed, Nov 26, 2025 at 2:05 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> >
> > On a separate note, I've been considering how to manage conflict log
> > insertions when an error causes the outer transaction to abort, which
> > seems to be a non-trivial.
> >
> > Here is what I have in mind:
> > ======================
> > First, prepare_conflict_log() would be executed from
> > ReportApplyConflict(). This function would handle all preliminary
> > work, such as preparing the tuple for the conflict log table. Second,
> > insert_conflict_log() would be executed. If the error level in
> > ReportApplyConflict() is LOG, the insertion would occur directly.
> > Otherwise, the log information would be stored in a global variable
> > and inserted in a separate transaction once we exit start_apply() due
> > to the error.
> >
> > @shveta malik @Amit Kapila let me know what you think?  Or do you
> > think it can be simplified?
> >
>
> I could not think of a better way. This idea works for me. I had
> doubts if it will be okay to start a new transaction in catch-block
> (if we plan to do it in start_apply's), but then I found few other
> functions doing it (see do_autovacuum, perform_work_item,
> _SPI_commit). So IMO, we should be good.
>

On re-reading, I think you were not suggesting handling it in the
CATCH block. Where exactly do you mean by "once we exit start_apply"?
But since the situation will arise only in case of ERROR, I thought
handling it in the catch-block could be one option.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-26T11:18:50Z

On Wed, Nov 26, 2025 at 4:15 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Nov 26, 2025 at 2:05 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > >
> > > On a separate note, I've been considering how to manage conflict log
> > > insertions when an error causes the outer transaction to abort, which
> > > seems to be a non-trivial.
> > >
> > > Here is what I have in mind:
> > > ======================
> > > First, prepare_conflict_log() would be executed from
> > > ReportApplyConflict(). This function would handle all preliminary
> > > work, such as preparing the tuple for the conflict log table. Second,
> > > insert_conflict_log() would be executed. If the error level in
> > > ReportApplyConflict() is LOG, the insertion would occur directly.
> > > Otherwise, the log information would be stored in a global variable
> > > and inserted in a separate transaction once we exit start_apply() due
> > > to the error.
> > >
> > > @shveta malik @Amit Kapila let me know what you think?  Or do you
> > > think it can be simplified?
> > >
> >
> > I could not think of a better way. This idea works for me. I had
> > doubts if it will be okay to start a new transaction in catch-block
> > (if we plan to do it in start_apply's), but then I found few other
> > functions doing it (see do_autovacuum, perform_work_item,
> > _SPI_commit). So IMO, we should be good.
> >
>
> On re-reading, I think you were not suggesting to handle it in the
> CATCH block. Where exactly once we exit start_apply?
> But since the situation will arise only in case of ERROR, I thought
> handling in catch-block could be one option.

Yeah, it makes sense to handle it in the catch block. I have done that in
the attached patch and also addressed Peter's other comments.

Pending work status:
1) Fix review comments of 0002 and 0003 - Pending
2) Need to add replica identity tuple instead of full tuple -- Done
3) Keeping the logs in case of outer transaction failure by moving log
insertion outside the main transaction - reported by Shveta - Done
(might need more validation and testing)
4) Run pgindent -- planning to do it after we complete the first level
of review - Pending
5) Subscription test cases for logging the actual conflicts - Pending
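
To make item 3 concrete, here is a minimal, self-contained sketch of the catch-block idea. It is not code from the patch: every toy_* name is hypothetical, setjmp/longjmp only stands in for the ERROR path, and a table is modelled as two counters. It shows the pending entry surviving the rollback of the apply transaction because it is written under a new transaction afterwards.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical model: committed rows plus rows written by the open transaction. */
typedef struct ToyLogTable
{
    int committed;
    int uncommitted;
} ToyLogTable;

static jmp_buf toy_error_jmp;
static const char *toy_pending_log = NULL;  /* survives the abort */

static void
toy_commit(ToyLogTable *t)
{
    t->committed += t->uncommitted;
    t->uncommitted = 0;
}

static void
toy_abort(ToyLogTable *t)
{
    t->uncommitted = 0;
}

/* Apply one change; on an ERROR-level conflict, stash the entry and bail out. */
static void
toy_apply_change(ToyLogTable *t, int conflict, int is_error)
{
    if (!conflict)
        return;
    if (!is_error)
    {
        t->uncommitted++;       /* logged inside the apply transaction */
        return;
    }
    toy_pending_log = "insert_exists";
    longjmp(toy_error_jmp, 1);  /* stands in for raising an ERROR */
}

/* Returns 0 if apply succeeded, 1 if it errored out. */
static int
toy_start_apply(ToyLogTable *t, int conflict, int is_error)
{
    if (setjmp(toy_error_jmp) == 0)
    {
        toy_apply_change(t, conflict, is_error);
        toy_commit(t);
        return 0;
    }

    /* "Catch block": the apply transaction is rolled back first ... */
    toy_abort(t);

    /* ... then any pending entry is written under a new transaction. */
    if (toy_pending_log != NULL)
    {
        t->uncommitted++;
        toy_commit(t);
        toy_pending_log = NULL;
    }
    return 1;
}
```

Calling toy_start_apply(&t, 1, 1) on an empty table returns 1 and leaves t.committed at 1, so the conflict entry is kept even though the apply transaction aborted.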

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-27T00:59:59Z

Hi Dilip. Some review comments for v7-0001.

======
src/backend/replication/logical/conflict.c

1.
+ /* Insert conflict details to conflict log table. */
+ if (conflictlogrel)
+ {
+ /*
+ * Prepare the conflict log tuple. If the error level is below
+ * ERROR, insert it immediately. Otherwise, defer the insertion to
+ * a new transaction after the current one aborts, ensuring the log
+ * tuple is not rolled back.
+ */
+ conflictlogtuple = prepare_conflict_log_tuple(estate,
+ relinfo->ri_RelationDesc,
+ conflictlogrel,
+ conflicttuple->xmin,
+ conflicttuple->ts, type,
+ conflicttuple->origin,
+ searchslot, conflicttuple->slot,
+ remoteslot);
+ if (elevel < ERROR)
+ {
+ InsertConflictLogTuple(conflictlogrel, conflictlogtuple);
+ heap_freetuple(conflictlogtuple);
+ }
+ else
+ MyLogicalRepWorker->conflict_log_tuple = conflictlogtuple;
+
+ table_close(conflictlogrel, AccessExclusiveLock);
+ }
+ }
+

IMO, some refactoring would help simplify conflictlogtuple processing. e.g.

i)   You don't need any separate 'conflictlogtuple' var
- Use MyLogicalRepWorker->conflict_log_tuple always for this purpose
ii)  prepare_conflict_log_tuple()
- Change this to a void; it will always side-effect
MyLogicalRepWorker->conflict_log_tuple
- Assert MyLogicalRepWorker->conflict_log_tuple must be NULL on entry
iii) InsertConflictLogTuple()
- The 2nd param is not needed if you always use
MyLogicalRepWorker->conflict_log_tuple
- Asserts MyLogicalRepWorker->conflict_log_tuple is not NULL, then writes it
- BTW, I felt that heap_freetuple could also be done here too
- Finally, sets MyLogicalRepWorker->conflict_log_tuple to NULL
(ready for the next conflict)
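
A rough, self-contained sketch of the ownership pattern suggested in (i) to (iii) follows. It is only an illustration: the toy_* names, ToyTuple, ToyWorker and ToyTable are hypothetical stand-ins, not PostgreSQL definitions, and malloc/free stand in for tuple allocation and heap_freetuple.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef struct ToyTuple
{
    char payload[64];
} ToyTuple;

typedef struct ToyWorker
{
    ToyTuple *conflict_log_tuple;   /* prepared but not yet written */
} ToyWorker;

typedef struct ToyTable
{
    int  nrows;
    char rows[8][64];
} ToyTable;

static ToyWorker toy_worker = {NULL};

/* (ii) Returns void; the result always lands in the worker struct. */
static void
toy_prepare_conflict_log_tuple(const char *details)
{
    ToyTuple *tup;

    assert(toy_worker.conflict_log_tuple == NULL);

    tup = malloc(sizeof(ToyTuple));
    assert(tup != NULL);
    snprintf(tup->payload, sizeof(tup->payload), "%s", details);
    toy_worker.conflict_log_tuple = tup;
}

/* (iii) No tuple parameter: write, free, and reset in one place. */
static void
toy_insert_conflict_log_tuple(ToyTable *table)
{
    assert(toy_worker.conflict_log_tuple != NULL);
    assert(table->nrows < 8);

    strcpy(table->rows[table->nrows++], toy_worker.conflict_log_tuple->payload);
    free(toy_worker.conflict_log_tuple);
    toy_worker.conflict_log_tuple = NULL;
}

/* Caller: insert now below the error threshold, otherwise leave it pending. */
static void
toy_report_conflict(ToyTable *table, const char *details, int is_error)
{
    toy_prepare_conflict_log_tuple(details);
    if (!is_error)
        toy_insert_conflict_log_tuple(table);
}
```

With this shape, the caller no longer carries a local tuple variable, and the deferred path only has to call the insert function once the new transaction has started. Note that preparing a second tuple while one is still pending would trip the entry assertion.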

~~~

InsertConflictLogTuple:

2.
+/*
+ * InsertConflictLogTuple
+ *
+ * Persistently records the input conflict log tuple into the conflict log
+ * table. It uses HEAP_INSERT_NO_LOGICAL to explicitly block logical decoding
+ * of the tuple inserted into the conflict log table.
+ */
+void
+InsertConflictLogTuple(Relation conflictlogrel, HeapTuple tup)
+{
+ int options = HEAP_INSERT_NO_LOGICAL;
+
+ heap_insert(conflictlogrel, tup, GetCurrentCommandId(true), options, NULL);
+}

See the above review comment (iii), for some suggested changes to this function.

~~~

prepare_conflict_log_tuple:

3.
+ * The caller is responsible for explicitly freeing the returned heap tuple
+ * after inserting.
+ */
+static HeapTuple
+prepare_conflict_log_tuple(EState *estate, Relation rel,

As per the above review comment (iii), I thought the Insert function
could handle the freeing.

~~~

4.
+ oldctx = MemoryContextSwitchTo(ApplyContext);
+ tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls);
+ MemoryContextSwitchTo(oldctx);

- return index_value;
+ return tup;

Per the above comment (ii), change this to assign to
MyLogicalRepWorker->conflict_log_tuple.

======
src/backend/replication/logical/worker.c

start_apply:

5.
+ /*
+ * Insert the pending conflict log tuple under a new transaction.
+ */

/Insert the/Insert any/

~~~

6.
+ InsertConflictLogTuple(conflictlogrel,
+    MyLogicalRepWorker->conflict_log_tuple);
+ heap_freetuple(MyLogicalRepWorker->conflict_log_tuple);
+ MyLogicalRepWorker->conflict_log_tuple = NULL;

Per earlier review comment (iii), remove the 2nd param to
InsertConflictLogTuple, and those other 2 statements can also be
handled within InsertConflictLogTuple.

======
src/include/replication/worker_internal.h

7.
+ /* Store conflict log tuple to be inserted before worker exit. */
+ HeapTuple conflict_log_tuple;
+

Per my above suggestions, this member comment becomes something more
like /* A conflict log tuple which is prepared but not yet written. */

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-27T12:20:00Z

On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Dilip. Some review comments for v7-0001.
>
> ======
> src/backend/replication/logical/conflict.c
>
> 1.
> + /* Insert conflict details to conflict log table. */
> + if (conflictlogrel)
> + {
> + /*
> + * Prepare the conflict log tuple. If the error level is below
> + * ERROR, insert it immediately. Otherwise, defer the insertion to
> + * a new transaction after the current one aborts, ensuring the log
> + * tuple is not rolled back.
> + */
> + conflictlogtuple = prepare_conflict_log_tuple(estate,
> + relinfo->ri_RelationDesc,
> + conflictlogrel,
> + conflicttuple->xmin,
> + conflicttuple->ts, type,
> + conflicttuple->origin,
> + searchslot, conflicttuple->slot,
> + remoteslot);
> + if (elevel < ERROR)
> + {
> + InsertConflictLogTuple(conflictlogrel, conflictlogtuple);
> + heap_freetuple(conflictlogtuple);
> + }
> + else
> + MyLogicalRepWorker->conflict_log_tuple = conflictlogtuple;
> +
> + table_close(conflictlogrel, AccessExclusiveLock);
> + }
> + }
> +
>
> IMO, some refactoring would help simplify conflictlogtuple processing. e.g.
>
> i)   You don't need any separate 'conflictlogtuple' var
> - Use MyLogicalRepWorker->conflict_log_tuple always for this purpose
> ii)  prepare_conflict_log_tuple()
> - Change this to a void; it will always side-effect
> MyLogicalRepWorker->conflict_log_tuple
> - Assert MyLogicalRepWorker->conflict_log_tuple must be NULL on entry
> iii) InsertConflictLogTuple()
> - The 2nd param it not needed if you always use
> MyLogicalRepWorker->conflict_log_tuple
> - Asserts MyLogicalRepWorker->conflict_log_tuple is not NULL, then writes it
> - BTW, I felt that heap_freetuple could also be done here too
> - Finally, sets to MyLogicalRepWorker->conflict_log_tuple to NULL
> (ready for the next conflict)
>
> ~~~
>
> InsertConflictLogTuple:
>
> 2.
> +/*
> + * InsertConflictLogTuple
> + *
> + * Persistently records the input conflict log tuple into the conflict log
> + * table. It uses HEAP_INSERT_NO_LOGICAL to explicitly block logical decoding
> + * of the tuple inserted into the conflict log table.
> + */
> +void
> +InsertConflictLogTuple(Relation conflictlogrel, HeapTuple tup)
> +{
> + int options = HEAP_INSERT_NO_LOGICAL;
> +
> + heap_insert(conflictlogrel, tup, GetCurrentCommandId(true), options, NULL);
> +}
>
> See the above review comment (iii), for some suggested changes to this function.
>
> ~~~
>
> prepare_conflict_log_tuple:
>
> 3.
> + * The caller is responsible for explicitly freeing the returned heap tuple
> + * after inserting.
> + */
> +static HeapTuple
> +prepare_conflict_log_tuple(EState *estate, Relation rel,
>
> As per the above review comment (iii), I thought the Insert function
> could handle the freeing.
>
> ~~~
>
> 4.
> + oldctx = MemoryContextSwitchTo(ApplyContext);
> + tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls);
> + MemoryContextSwitchTo(oldctx);
>
> - return index_value;
> + return tup;
>
> Per the above comment (ii), change this to assign to
> MyLogicalRepWorker->conflict_log_tuple.
>
> ======
> src/backend/replication/logical/worker.c
>
> start_apply:
>
> 5.
> + /*
> + * Insert the pending conflict log tuple under a new transaction.
> + */
>
> /Insert the/Insert any/
>
> ~~~
>
> 6.
> + InsertConflictLogTuple(conflictlogrel,
> +    MyLogicalRepWorker->conflict_log_tuple);
> + heap_freetuple(MyLogicalRepWorker->conflict_log_tuple);
> + MyLogicalRepWorker->conflict_log_tuple = NULL;
>
> Per earlier reqview comment (iii), remove the 2nd parm to
> InsertConflictLogTuple, and those other 2 statements can also be
> handled within InsertConflictLogTuple.
>
> ======
> src/include/replication/worker_internal.h
>
> 7.
> + /* Store conflict log tuple to be inserted before worker exit. */
> + HeapTuple conflict_log_tuple;
> +
>
> Per my above suggestions, this member comment becomes something more
> like "A conflict log tuple which is prepared but not yet written. */
>

I have fixed all these comments and also the comments on 0002. I now
feel we can actually merge 0001 and 0002, so I have merged both of
them.

Pending work status:
1) Fix review comments of 0003
2) Run pgindent -- planning to do it after we complete the first level
of review
3) Subscription TAP test for logging the actual conflicts

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-11-28T00:35:59Z

Hi Dilip.

Some review comments for v8-0001.

======
Commit message

1.
When the patches 0001 and 0002 got merged, I think the commit message
should have been updated also to say something along the lines of:

When a publication uses ALL TABLES or TABLES IN SCHEMA, it won't
publish the clt.

======
src/backend/catalog/pg_publication.c

check_publication_add_relation:

2.
+ /* Can't be conflict log table */
+ if (IsConflictLogRelid(RelationGetRelid(targetrel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(targetrel)),
+ errdetail("This operation is not supported for conflict log tables.")));

Should it also show the schema name of the clt in the message?

======
src/backend/commands/subscriptioncmds.c

3.
+/*
+ * Check if the specified relation is used as a conflict log table by any
+ * subscription.
+ */
+bool
+IsConflictLogRelid(Oid relid)

Most places refer to the clt. Wondering if this function ought to be
called 'IsConflictLogTable'.

======
src/backend/replication/logical/conflict.c

InsertConflictLogTuple:

4.
+ /* A valid tuple must be prepared and store into MyLogicalRepWorker. */

typo: /store into/stored in/

~~~

prepare_conflict_log_tuple:

5.
- index_close(indexDesc, NoLock);
+ oldctx = MemoryContextSwitchTo(ApplyContext);
+ tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls);
+ MemoryContextSwitchTo(oldctx);

- return index_value;
+ /* Store conflict_log_tuple into the worker slot for inserting it later. */
+ MyLogicalRepWorker->conflict_log_tuple = tup;

5a.
I don't think you need the 'tup' variable. Just assign directly to
MyLogicalRepWorker->conflict_log_tuple.

~

5b.
"worker slot" -- I don't think this is a "slot".

======
src/backend/replication/logical/worker.c

6.
+ /* Open conflict log table. */
+ conflictlogrel = GetConflictLogTableRel();
+ InsertConflictLogTuple(conflictlogrel);
+ MyLogicalRepWorker->conflict_log_tuple = NULL;
+ table_close(conflictlogrel, AccessExclusiveLock);

Maybe that comment should say:
/* Open conflict log table and write the tuple. */


======
src/include/replication/conflict.h

7.
+ /* A conflict log tuple which is prepared but not yet inserted. */
+ HeapTuple conflict_log_tuple;
+

typo: /which/that/  (sorry, this one is my bad from a previous review comment)


======
src/test/regress/expected/subscription.out

8.
+-- ok - change the conflict log table name for an existing subscription that already had one
+CREATE SCHEMA clt;
+ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table = 'clt.regress_conflict_log3');
+SELECT subname, subconflictlogtable, subconflictlognspid = (SELECT oid FROM pg_namespace WHERE nspname = 'public') AS is_public_schema
+FROM pg_subscription WHERE subname = 'regress_conflict_test2';
+        subname         |  subconflictlogtable  | is_public_schema
+------------------------+-----------------------+------------------
+ regress_conflict_test2 | regress_conflict_log3 | f
+(1 row)
+
+\dRs+
+                                                                                                                                                                 List of subscriptions
+          Name          |           Owner           | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Retain dead tuples | Max retention duration | Retention active | Synchronous commit |          Conninfo           |  Skip LSN  |  Conflict log table
+------------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------+------------------+--------------------+-----------------------------+------------+-----------------------
+ regress_conflict_test1 | regress_subscription_user | f       | {testpub}   | f      | parallel  | d                | f                | any    | t                 | f             | f        | f                  |                      0 | f                | off                | dbname=regress_doesnotexist | 0/00000000 | regress_conflict_log1
+ regress_conflict_test2 | regress_subscription_user | f       | {testpub}   | f      | parallel  | d                | f                | any    | t                 | f             | f        | f                  |                      0 | f                | off                | dbname=regress_doesnotexist | 0/00000000 | regress_conflict_log3
+(2 rows)

~

After going to the trouble of specifying the CLT on a different
schema, that information is lost by the \dRs+. How about also showing
the CLT schema name (at least when it is not "public") in the \dRs+
output.

~~~

9.
+-- ok - conflict_log_table should not be published with ALL TABLE
+CREATE PUBLICATION pub FOR TABLES IN SCHEMA clt;
+SELECT * FROM pg_publication_tables WHERE pubname = 'pub';
+ pubname | schemaname | tablename | attnames | rowfilter
+---------+------------+-----------+----------+-----------
+(0 rows)

Perhaps you should repeat this same test but using FOR ALL TABLES,
instead of only FOR TABLES IN SCHEMA

======
src/test/regress/sql/subscription.sql

10.
In one of the tests, you could call the function
pg_relation_is_publishable(clt) to verify that it returns false.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-11-28T06:54:27Z

On Thu, 27 Nov 2025 at 17:50, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> I have fixed all these comments and also the comments of 0002, now I
> feel we can actually merge 0001 and 0002, so I have merged both of
> them.

I just started to have a look at the patch. While using it, I found that
the lock level used is not correct. I think the reason is that the table
is opened with RowExclusiveLock but closed with AccessExclusiveLock:

+       /* If conflict log table is not set for the subscription just return. */
+       conflictlogtable = get_subscription_conflict_log_table(MyLogicalRepWorker->subid, &nspid);
+       if (conflictlogtable == NULL)
+       {
+               pfree(conflictlogtable);
+               return NULL;
+       }
+
+       conflictlogrelid = get_relname_relid(conflictlogtable, nspid);
+       if (OidIsValid(conflictlogrelid))
+               conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);

....
+                       if (elevel < ERROR)
+                               InsertConflictLogTuple(conflictlogrel);
+
+                       table_close(conflictlogrel, AccessExclusiveLock);
....

2025-11-28 12:17:55.631 IST [504133] WARNING:  you don't own a lock of
type AccessExclusiveLock
2025-11-28 12:17:55.631 IST [504133] CONTEXT:  processing remote data
for replication origin "pg_16402" during message type "INSERT" for
replication target relation "public.t1" in transaction 761, finished
at 0/01789AB8
2025-11-28 12:17:58.033 IST [504133] WARNING:  you don't own a lock of
type AccessExclusiveLock
2025-11-28 12:17:58.033 IST [504133] ERROR:  conflict detected on
relation "public.t1": conflict=insert_exists
2025-11-28 12:17:58.033 IST [504133] DETAIL:  Key already exists in
unique index "t1_pkey", modified in transaction 766.
        Key (c1)=(1); existing local row (1, 1); remote row (1, 1).
2025-11-28 12:17:58.033 IST [504133] CONTEXT:  processing remote data
for replication origin "pg_16402" during message type "INSERT" for
replication target relation "public.t1" in transaction 761, finished
at 0/01789AB8
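
The mismatch can be illustrated with a small, self-contained model. This is only a sketch under assumptions: the toy_* names and the per-mode counters are hypothetical and are not how the lock manager is implemented. It only mirrors the behaviour visible in the log above, where releasing a mode that was never taken produces a warning.

```c
#include <assert.h>

/* Hypothetical lock modes; only the two named in the report are modelled. */
typedef enum ToyLockMode
{
    TOY_ROW_EXCLUSIVE = 0,
    TOY_ACCESS_EXCLUSIVE = 1,
    TOY_NUM_MODES
} ToyLockMode;

typedef struct ToyRelation
{
    int held[TOY_NUM_MODES];    /* per-mode hold counts */
    int warnings;               /* releases of a mode that was never taken */
} ToyRelation;

static void
toy_table_open(ToyRelation *rel, ToyLockMode mode)
{
    rel->held[mode]++;
}

/* Releasing a mode that is not held is reported; the real lock stays held. */
static void
toy_table_close(ToyRelation *rel, ToyLockMode mode)
{
    if (rel->held[mode] == 0)
    {
        rel->warnings++;
        return;
    }
    rel->held[mode]--;
}
```

In this model, opening with TOY_ROW_EXCLUSIVE and closing with TOY_ACCESS_EXCLUSIVE records one warning and leaves the row-exclusive count at 1, while closing with the mode used at open time records none.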

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-11-28T09:01:59Z

On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
>
> I have fixed all these comments and also the comments of 0002, now I
> feel we can actually merge 0001 and 0002, so I have merged both of
> them.
>
> Now pending work status
> 1) fixed review comments of 0003
> 2) Run pgindent -- planning to do it after we complete the first level
> of review
> 3) Subscription TAP test for logging the actual conflicts
>

Thanks for the patch. A few observations:

1)
It seems, as per LOG, 'key' and 'replica-identity' are different when
it comes to insert_exists, update_exists and
multiple_unique_conflicts, while I believe in CLT, key is
replica-identity i.e. there are no 2 separate terms. Please see below:

a)
Update_Exists:
2025-11-28 14:08:56.179 IST [60383] ERROR:  conflict detected on
relation "public.tab1": conflict=update_exists
2025-11-28 14:08:56.179 IST [60383] DETAIL:  Key already exists in
unique index "tab1_pkey", modified locally in transaction 790 at
2025-11-28 14:07:17.578887+05:30.
Key (i)=(40); existing local row (40, 10); remote row (40, 200);
replica identity (i)=(20).

postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple
from clt where conflict_type='update_exists';
 conflict_type | key_tuple |   local_tuple   |   remote_tuple
---------------+-----------+-----------------+------------------
 update_exists | {"i":20}  | {"i":40,"j":10} | {"i":40,"j":200}

b)
insert_Exists:
ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
DETAIL:  Key already exists in unique index "tab1_pkey", modified
locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30.
Key (i)=(30); existing local row (30, 10); remote row (30, 10).

postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt;
 conflict_type  | key_tuple |   local_tuple   |  remote_tuple
----------------+-----------+-----------------+-----------------
 insert_exists  |           | {"i":30,"j":10} | {"i":30,"j":10}

case a) has key_tuple same as replica-identity of LOG
case b) does not have replica-identity and thus key_tuple is NULL.

Does that mean we need to maintain both key_tuple and RI separately in
CLT? Thoughts?

2)
For multiple_unique_conflict (testcase is same as I shared earlier),
it asserts here:
CONTEXT:  processing remote data for replication origin "pg_16390"
during message type "INSERT" for replication target relation
"public.conf_tab" in transaction 778, finished at 0/017E6DE8
TRAP: failed Assert("MyLogicalRepWorker->conflict_log_tuple == NULL"),
File: "conflict.c", Line: 749, PID: 60627

I have not checked it, but maybe
'MyLogicalRepWorker->conflict_log_tuple' is left over from the
previous few tests I tried?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-11-28T12:20:15Z

On Tue, Nov 18, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > 3)
> > > We also need to think how we are going to display the info in case of
> > > multiple_unique_conflicts as there could be multiple local and remote
> > > tuples conflicting for one single operation. Example:
> > >
> > > create table conf_tab (a int primary key, b int unique, c int unique);
> > >
> > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
> > >
> > > pub: insert into conf_tab values (2,3,4);
> > >
> > > ERROR:  conflict detected on relation "public.conf_tab":
> > > conflict=multiple_unique_conflicts
> > > DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
> > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> > > Key already exists in unique index "conf_tab_b_key", modified locally
> > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> > > Key already exists in unique index "conf_tab_c_key", modified locally
> > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> > > CONTEXT:  processing remote data for replication origin "pg_16392"
> > > during message type "INSERT" for replication target relation
> > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0
> > >
> > > Currently in clt, we have singular terms such as 'key_tuple',
> > > 'local_tuple', 'remote_tuple'.  Shall we have multiple rows inserted?
> > > But it does not look reasonable to have multiple rows inserted for a
> > > single conflict raised. I will think more about this.
> >
> > Currently I am inserting multiple records in the conflict history
> > table, the same as each tuple is logged, but couldn't find any better
> > way for this.
> >

The biggest drawback of this approach is data bloat. The incoming data
row will be stored multiple times.

> > Another option is to use an array of tuples instead of a
> > single tuple but not sure this might make things more complicated to
> > process by any external tool.
>
> It’s arguable and hard to say what the correct behaviour should be.
> I’m slightly leaning toward having a single row per conflict.
>

Yeah, it is better to either have a single row per conflict or have
two tables conflict_history and conflict_history_details to avoid data
bloat as pointed above. For example, two-table approach could be:

1. The Header Table (Incoming Data)
This stores the data that tried to be applied.
SQL
CREATE TABLE conflict_header (
    conflict_id     SERIAL PRIMARY KEY,
    source_tx_id    VARCHAR(100),    -- Transaction ID from source
    table_name      VARCHAR(100),
    operation       CHAR(1),         -- 'I' for Insert
    incoming_data   JSONB,           -- Store the incoming row as JSON
...
);

2. The Detail Table (Existing Conflicting Data)
This stores the actual rows currently in the database that caused the
violations.
CREATE TABLE conflict_details (
    detail_id       SERIAL PRIMARY KEY,
    conflict_id     INT REFERENCES conflict_header(conflict_id),
    constraint_name/key_tuple VARCHAR(100),
    conflicting_row_data JSONB       -- The existing row in the DB that blocked the insert
);

Please don't consider these exact columns; you can use something along
the lines of what is proposed in the patch. This is just to show how
the conflict data can be rearranged. Now, one argument against this is
that users need to use a JOIN to query the data, but that is still better
than bloating the table. Alternatively, the single-table idea could be
changed to have columns like:
    violated_constraints TEXT[]   -- e.g., ['uk_email', 'uk_phone']
    error_details        JSONB    -- e.g., [{"const": "uk_email", "val": "a@b.com"}, ...]
If we want to store multiple conflicting tuples in a single column, we
need to ensure it is queryable via a JSONB column. The point in favour of
a single JSONB column to combine multiple conflicting tuples is that we
need this combination only for one kind of conflict.

Both the approaches have their pros and cons. I feel we should dig a
bit deeper for both by laying out details for each method and see what
others think.
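
As a rough way to size the bloat concern, the arithmetic can be sketched as below. This is a back-of-the-envelope model only: the function names and byte sizes are assumptions, and it ignores per-row overhead and the other fixed columns. It compares one log row per violated key against any layout that stores the incoming row once (a header/detail split, or a single row with an array of local tuples).

```c
#include <assert.h>
#include <stddef.h>

/* One row per violated key: the incoming row is repeated in every row. */
static size_t
toy_bytes_row_per_key(size_t nkeys, size_t remote_bytes, size_t local_bytes)
{
    return nkeys * (remote_bytes + local_bytes);
}

/* Incoming row stored once, plus one local tuple per violated key. */
static size_t
toy_bytes_remote_once(size_t nkeys, size_t remote_bytes, size_t local_bytes)
{
    return remote_bytes + nkeys * local_bytes;
}
```

For the three-key conf_tab example with an assumed 100 bytes per tuple, the first layout stores 600 bytes and the second 400. With a single violated key both store 200, so the difference appears only for multiple_unique_conflicts.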

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T04:41:08Z

On Fri, Nov 28, 2025 at 5:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 18, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > 3)
> > > > We also need to think how we are going to display the info in case of
> > > > multiple_unique_conflicts as there could be multiple local and remote
> > > > tuples conflicting for one single operation. Example:
> > > >
> > > > create table conf_tab (a int primary key, b int unique, c int unique);
> > > >
> > > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
> > > >
> > > > pub: insert into conf_tab values (2,3,4);
> > > >
> > > > ERROR:  conflict detected on relation "public.conf_tab":
> > > > conflict=multiple_unique_conflicts
> > > > DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
> > > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> > > > Key already exists in unique index "conf_tab_b_key", modified locally
> > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> > > > Key already exists in unique index "conf_tab_c_key", modified locally
> > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> > > > CONTEXT:  processing remote data for replication origin "pg_16392"
> > > > during message type "INSERT" for replication target relation
> > > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0
> > > >
> > > > Currently in clt, we have singular terms such as 'key_tuple',
> > > > 'local_tuple', 'remote_tuple'.  Shall we have multiple rows inserted?
> > > > But it does not look reasonable to have multiple rows inserted for a
> > > > single conflict raised. I will think more about this.
> > >
> > > Currently I am inserting multiple records in the conflict history
> > > table, the same as each tuple is logged, but couldn't find any better
> > > way for this.
> > >
>
> The biggest drawback of this approach is data bloat. The incoming data
> row will be stored multiple times.
>
> > > Another option is to use an array of tuples instead of a
> > > single tuple but not sure this might make things more complicated to
> > > process by any external tool.
> >
> > It’s arguable and hard to say what the correct behaviour should be.
> > I’m slightly leaning toward having a single row per conflict.
> >
>
> Yeah, it is better to either have a single row per conflict or have
> two tables conflict_history and conflict_history_details to avoid data
> bloat as pointed above. For example, two-table approach could be:
>
> 1. The Header Table (Incoming Data)
> This stores the data that tried to be applied.
> SQL
> CREATE TABLE conflict_header (
>     conflict_id     SERIAL PRIMARY KEY,
>     source_tx_id    VARCHAR(100),    -- Transaction ID from source
>     table_name      VARCHAR(100),
>     operation       CHAR(1),         -- 'I' for Insert
>     incoming_data   JSONB,           -- Store the incoming row as JSON
> ...
> );
>
> 2. The Detail Table (Existing Conflicting Data)
> This stores the actual rows currently in the database that caused the
> violations.
> CREATE TABLE conflict_details (
>     detail_id       SERIAL PRIMARY KEY,
>     conflict_id     INT REFERENCES conflict_header(conflict_id),
>     constraint_name/key_tuple VARCHAR(100),
>     conflicting_row_data JSONB       -- The existing row in the DB
> that blocked the insert
> );
>
> Please don't consider these exact columns; you can use something on
> the lines of what is proposed in the patch. This is just to show how
> the conflict data can be rearranged. Now, one argument against this is
> that users need to use JOIN to query data but still better than
> bloating the table. The idea to store in a single table could be
> changed to have columns like violated_constraints TEXT[],      --
> e.g., ['uk_email', 'uk_phone'], error_details   JSONB  -- e.g.,
> [{"const": "uk_email", "val": "a@b.com"}, ...]. If we want to store
> multiple conflicting tuples in a single column, we need to ensure it
> is queryable via a JSONB column. The point in favour of a single JSONB
> column to combine multiple conflicting tuples is that we need this
> combination only for one kind of conflict.
>
> Both the approaches have their pros and cons. I feel we should dig a
> bit deeper for both by laying out details for each method and see what
> others think.

The specific scenario we are discussing, where a single row from the
publisher attempts to apply an operation that causes a conflict across
multiple unique keys, with each of those unique key violations
conflicting with a different local row on the subscriber, is very
rare.  IMHO this low-frequency scenario does not justify
overcomplicating the design with an array field or a multi-level
table.

Consider the infrequency of the root causes:
- How often does a table have more than 3 to 4 unique keys?
- How frequently would each of these keys conflict with a unique row
on the subscriber side?

If resolving this occasional, synthetic conflict requires inserting
two or three rows instead of a single one, that is an acceptable
trade-off considering how rarely it can occur.  Anyway, this is my
opinion and I am open to opinions from others.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:18:14Z

On Fri, Nov 28, 2025 at 12:24 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 27 Nov 2025 at 17:50, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > I have fixed all these comments and also the comments of 0002, now I
> > feel we can actually merge 0001 and 0002, so I have merged both of
> > them.
>
> I just started to have a look at the patch, while using I found lock
> level used is not correct:
> I felt the reason is that table is opened with RowExclusiveLock but
> closed in AccessExclusiveLock:
>
> +       /* If conflict log table is not set for the subscription just return. */
> +       conflictlogtable = get_subscription_conflict_log_table(
> +
> MyLogicalRepWorker->subid, &nspid);
> +       if (conflictlogtable == NULL)
> +       {
> +               pfree(conflictlogtable);
> +               return NULL;
> +       }
> +
> +       conflictlogrelid = get_relname_relid(conflictlogtable, nspid);
> +       if (OidIsValid(conflictlogrelid))
> +               conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
>
> ....
> +                       if (elevel < ERROR)
> +                               InsertConflictLogTuple(conflictlogrel);
> +
> +                       table_close(conflictlogrel, AccessExclusiveLock);
> ....
>
> 2025-11-28 12:17:55.631 IST [504133] WARNING:  you don't own a lock of
> type AccessExclusiveLock
> 2025-11-28 12:17:55.631 IST [504133] CONTEXT:  processing remote data
> for replication origin "pg_16402" during message type "INSERT" for
> replication target relation "public.t1" in transaction 761, finished
> at 0/01789AB8
> 2025-11-28 12:17:58.033 IST [504133] WARNING:  you don't own a lock of
> type AccessExclusiveLock
> 2025-11-28 12:17:58.033 IST [504133] ERROR:  conflict detected on
> relation "public.t1": conflict=insert_exists
> 2025-11-28 12:17:58.033 IST [504133] DETAIL:  Key already exists in
> unique index "t1_pkey", modified in transaction 766.
>         Key (c1)=(1); existing local row (1, 1); remote row (1, 1).
> 2025-11-28 12:17:58.033 IST [504133] CONTEXT:  processing remote data
> for replication origin "pg_16402" during message type "INSERT" for
> replication target relation "public.t1" in transaction 761, finished
> at 0/01789AB8

Thanks, I will fix this.
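
For readers following along, the warning comes from releasing a lock
mode that was never taken: the table is opened with RowExclusiveLock
but table_close() is asked to release AccessExclusiveLock.  The toy
model below (Python, not PostgreSQL code) only mimics that bookkeeping;
the class and the OID 16500 are made up for illustration.

```python
class RelationLocks:
    """Toy model of per-relation lock bookkeeping (not PostgreSQL code)."""

    def __init__(self):
        self.held = {}      # (relid, lockmode) -> number of holds
        self.warnings = []  # messages a real server would emit as WARNING

    def table_open(self, relid, lockmode):
        key = (relid, lockmode)
        self.held[key] = self.held.get(key, 0) + 1
        return relid

    def table_close(self, relid, lockmode):
        key = (relid, lockmode)
        if self.held.get(key, 0) == 0:
            # Releasing a mode we never acquired: warn and release nothing.
            self.warnings.append(f"you don't own a lock of type {lockmode}")
            return
        self.held[key] -= 1


locks = RelationLocks()
rel = locks.table_open(16500, "RowExclusiveLock")

# Mismatched close, as in the quoted patch fragment: produces the warning
# and leaves the RowExclusiveLock hold in place.
locks.table_close(rel, "AccessExclusiveLock")

# Matching close: releases the mode that was actually taken.
locks.table_close(rel, "RowExclusiveLock")
```

In the real code the close would presumably pass either the same mode
used at open time or NoLock; which one the next patch version uses is
not decided in this mail.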


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:23:51Z

On Fri, Nov 28, 2025 at 2:32 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> >
> > I have fixed all these comments and also the comments of 0002, now I
> > feel we can actually merge 0001 and 0002, so I have merged both of
> > them.
> >
> > Now pending work status
> > 1) fixed review comments of 0003
> > 2) Run pgindent -- planning to do it after we complete the first level
> > of review
> > 3) Subscription TAP test for logging the actual conflicts
> >
>
> Thanks  for the patch. A few observations:
>
> 1)
> It seems, as per LOG, 'key' and 'replica-identity' are different when
> it comes to insert_exists, update_exists and
> multiple_unique_conflicts, while I believe in CLT, key is
> replica-identity i.e. there are no 2 separate terms. Please see below:
>
> a)
> Update_Exists:
> 2025-11-28 14:08:56.179 IST [60383] ERROR:  conflict detected on
> relation "public.tab1": conflict=update_exists
> 2025-11-28 14:08:56.179 IST [60383] DETAIL:  Key already exists in
> unique index "tab1_pkey", modified locally in transaction 790 at
> 2025-11-28 14:07:17.578887+05:30.
> Key (i)=(40); existing local row (40, 10); remote row (40, 200);
> replica identity (i)=(20).
>
> postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple
> from clt where conflict_type='update_exists';
>  conflict_type | key_tuple |   local_tuple   |   remote_tuple
> ---------------+-----------+-----------------+------------------
>  update_exists | {"i":20}  | {"i":40,"j":10} | {"i":40,"j":200}
>
> b)
> insert_Exists:
> ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> DETAIL:  Key already exists in unique index "tab1_pkey", modified
> locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30.
> Key (i)=(30); existing local row (30, 10); remote row (30, 10).
>
> postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt;
>  conflict_type  | key_tuple |   local_tuple   |  remote_tuple
> ----------------+-----------+-----------------+-----------------
>  insert_exists  |               | {"i":30,"j":10} | {"i":30,"j":10}
>
> case a) has key_tuple same as replica-identity of LOG
> case b) does not have replica-identity and thus key_tuple is NULL.
>
> Does that mean we need to maintain both key_tuple and RI separately in
> CLT? Thoughts?

Maybe we should then have a place for both key_tuple and replica
identity, matching what we log.  What do others think about this case?

> 2)
> For multiple_unique_conflict (testcase is same as I shared earlier),
> it asserts here:
> CONTEXT:  processing remote data for replication origin "pg_16390"
> during message type "INSERT" for replication target relation
> "public.conf_tab" in transaction 778, finished at 0/017E6DE8
> TRAP: failed Assert("MyLogicalRepWorker->conflict_log_tuple == NULL"),
> File: "conflict.c", Line: 749, PID: 60627
>
> I have not checked it, but maybe
> 'MyLogicalRepWorker->conflict_log_tuple' is left over from the
> previous few tests I tried?

Yeah, prepare_conflict_log_tuple() is called in a loop, and when there
are multiple tuples we need to collect all of them before inserting
them at worker exit, so the current code has a bug.  I will see how we
can fix it; I think this also depends upon the other discussion we are
having about how to insert multiple unique conflicts.
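
One possible shape for such a fix, sketched in Python purely to show
the idea (the real change would be in C, and the names below other than
prepare_conflict_log_tuple are hypothetical): replace the single
pending-tuple slot with a list that is drained once at the end.

```python
class ApplyWorkerModel:
    """Toy stand-in for the apply worker's conflict-log state."""

    def __init__(self):
        # Hypothetical replacement for the single conflict_log_tuple slot.
        self.pending_conflict_tuples = []

    def prepare_conflict_log_tuple(self, row):
        # Safe to call repeatedly in a loop: each call appends, so no
        # "slot must be empty" assertion is needed.
        self.pending_conflict_tuples.append(row)

    def flush_conflict_log(self, insert):
        """Insert every pending row, then clear the list."""
        count = len(self.pending_conflict_tuples)
        for row in self.pending_conflict_tuples:
            insert(row)
        self.pending_conflict_tuples.clear()
        return count


worker = ApplyWorkerModel()
for index_name in ("conf_tab_pkey", "conf_tab_b_key", "conf_tab_c_key"):
    worker.prepare_conflict_log_tuple({"index": index_name})

table = []  # stands in for the conflict log table
flushed = worker.flush_conflict_log(table.append)
```

Whether the rows end up as several table rows or as one row with an
array column is exactly the open question mentioned above.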

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-01T08:27:40Z

On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > Few observations related to publication.
> > > ------------------------------
>
> Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> conflict log tables it should be good enough if we restrict it when
> ALL TABLE options are used, I don't think we need to put extra effort
> to completely restrict it even if users want to explicitly list it
> into the publication.
>
> > >
> > > (In the below comments, clt/CLT implies Conflict Log Table)
> > >
> > > 1)
> > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> This function is used while publishing every single change and I don't
> think we want to add a cost to check each subscription to identify
> whether the table is listed as CLT.
>
> > > 2)
> > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > show that for clt.
>
> I think we should fix this.
>
> > > 3)
> > > I am able to create a publication for clt table, should it be allowed?
>
> I believe we should not do any specific handling to restrict this but
> I am open for the opinions.
>
> > > create subscription sub1 connection '...' publication pub1
> > > WITH(conflict_log_table='clt');
> > > create publication pub3 for table clt;
> > >
> > > 4)
> > > Is there a reason we have not made '!IsConflictHistoryRelid' check as
> > > part of is_publishable_class() itself? If we do so, other code-logics
> > > will also get clt as non-publishable always (and will solve a few of
> > > the above issues I think). IIUC, there is no place where we want to
> > > mark CLT as publishable or is there any?
>
> IMHO the main reason is performance.
>
> > > 5) Also, I feel we can add some documentation now to help others to
> > > understand/review the patch better without going through the long
> > > thread.
>
> Make sense, I will do that in the next version.
>
> > >
> > > Few observations related to conflict-logging:
> > > ------------------------------
> > > 1)
> > > I found that for the conflicts which ultimately result in Error, we do
> > > not insert any conflict-record in clt.
> > >
> > > a)
> > > Example: insert_exists, update_Exists
> > > create table tab1 (i int primary key, j int);
> > > sub: insert into tab1 values(30,10);
> > > pub: insert into tab1 values(30,10);
> > > ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> > > No record in clt.
> > >
> > > sub:
> > > <some pre-data needed>
> > > update tab1 set i=40 where i = 30;
> > > pub: update tab1 set i=40 where i = 20;
> > > ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> > > No record in clt.
>
> Yeah that interesting need to put thought on how to commit this record
> when an outer transaction is aborted as we do not have autonomous
> transactions which are generally used for this kind of logging.  But
> we can explore more options like inserting into conflict log tables
> outside the outer transaction.
>
> > > b)
> > > Another question related to this is, since these conflicts (which
> > > results in error) keep on happening until user resolves these or skips
> > > these or 'disable_on_error' is set. Then are we going to insert these
> > > multiple times? We do count these in 'confl_insert_exists' and
> > > 'confl_update_exists' everytime, so it makes sense to log those each
> > > time in clt as well. Thoughts?
>
> I think it make sense to insert every time we see the conflict, but it
> would be good to have opinion from others as well.

Since there is a concern that multiple rows for
multiple_unique_conflicts can cause data-bloat, it made me rethink
this point: re-inserting the same conflict on every retry is actually
more prone to causing data-bloat if the conflict is not resolved in
time, as it seems a far more frequent scenario.  So shall we keep
inserting the record, or insert it once and avoid inserting it again
based on LSN?  Thoughts?

>
> > > 2)
> > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something
> > > indicating that it is not applicable for this conflict-type?
> > >
> > > Example: delete_missing, update_missing
> > > pub:
> > >  insert into tab1 values(10,10);
> > >  insert into tab1 values(20,10);
> > >  sub:  delete from tab1 where i=10;
> > >  pub:  delete from tab1 where i=10;
>
> Sure I will test this.
>
> >
> > 3)
> > We also need to think how we are going to display the info in case of
> > multiple_unique_conflicts as there could be multiple local and remote
> > tuples conflicting for one single operation. Example:
> >
> > create table conf_tab (a int primary key, b int unique, c int unique);
> >
> > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
> >
> > pub: insert into conf_tab values (2,3,4);
> >
> > ERROR:  conflict detected on relation "public.conf_tab":
> > conflict=multiple_unique_conflicts
> > DETAIL:  Key already exists in unique index "conf_tab_pkey", modified
> > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> > Key already exists in unique index "conf_tab_b_key", modified locally
> > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> > Key already exists in unique index "conf_tab_c_key", modified locally
> > in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> > CONTEXT:  processing remote data for replication origin "pg_16392"
> > during message type "INSERT" for replication target relation
> > "public.conf_tab" in transaction 781, finished at 0/017FDDA0
> >
> > Currently in clt, we have singular terms such as 'key_tuple',
> > 'local_tuple', 'remote_tuple'.  Shall we have multiple rows inserted?
> > But it does not look reasonable to have multiple rows inserted for a
> > single conflict raised. I will think more about this.
>
> Currently I am inserting multiple records in the conflict history
> table, the same as each tuple is logged, but couldn't find any better
> way for this. Another option is to use an array of tuples instead of a
> single tuple but not sure this might make things more complicated to
> process by any external tool.  But you are right, this needs more
> discussion.
>
> --
> Regards,
> Dilip Kumar
> Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:34:08Z

On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > Few observations related to publication.
> > > > ------------------------------
> >
> > Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> > conflict log tables it should be good enough if we restrict it when
> > ALL TABLE options are used, I don't think we need to put extra effort
> > to completely restrict it even if users want to explicitly list it
> > into the publication.
> >
> > > >
> > > > (In the below comments, clt/CLT implies Conflict Log Table)
> > > >
> > > > 1)
> > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
> >
> > This function is used while publishing every single change and I don't
> > think we want to add a cost to check each subscription to identify
> > whether the table is listed as CLT.
> >
> > > > 2)
> > > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > > show that for clt.
> >
> > I think we should fix this.
> >
> > > > 3)
> > > > I am able to create a publication for clt table, should it be allowed?
> >
> > I believe we should not do any specific handling to restrict this but
> > I am open for the opinions.
> >
> > > > create subscription sub1 connection '...' publication pub1
> > > > WITH(conflict_log_table='clt');
> > > > create publication pub3 for table clt;
> > > >
> > > > 4)
> > > > Is there a reason we have not made '!IsConflictHistoryRelid' check as
> > > > part of is_publishable_class() itself? If we do so, other code-logics
> > > > will also get clt as non-publishable always (and will solve a few of
> > > > the above issues I think). IIUC, there is no place where we want to
> > > > mark CLT as publishable or is there any?
> >
> > IMHO the main reason is performance.
> >
> > > > 5) Also, I feel we can add some documentation now to help others to
> > > > understand/review the patch better without going through the long
> > > > thread.
> >
> > Make sense, I will do that in the next version.
> >
> > > >
> > > > Few observations related to conflict-logging:
> > > > ------------------------------
> > > > 1)
> > > > I found that for the conflicts which ultimately result in Error, we do
> > > > not insert any conflict-record in clt.
> > > >
> > > > a)
> > > > Example: insert_exists, update_Exists
> > > > create table tab1 (i int primary key, j int);
> > > > sub: insert into tab1 values(30,10);
> > > > pub: insert into tab1 values(30,10);
> > > > ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> > > > No record in clt.
> > > >
> > > > sub:
> > > > <some pre-data needed>
> > > > update tab1 set i=40 where i = 30;
> > > > pub: update tab1 set i=40 where i = 20;
> > > > ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> > > > No record in clt.
> >
> > Yeah that interesting need to put thought on how to commit this record
> > when an outer transaction is aborted as we do not have autonomous
> > transactions which are generally used for this kind of logging.  But
> > we can explore more options like inserting into conflict log tables
> > outside the outer transaction.
> >
> > > > b)
> > > > Another question related to this is, since these conflicts (which
> > > > results in error) keep on happening until user resolves these or skips
> > > > these or 'disable_on_error' is set. Then are we going to insert these
> > > > multiple times? We do count these in 'confl_insert_exists' and
> > > > 'confl_update_exists' everytime, so it makes sense to log those each
> > > > time in clt as well. Thoughts?
> >
> > I think it make sense to insert every time we see the conflict, but it
> > would be good to have opinion from others as well.
>
> Since there is a concern that multiple rows for
> multiple_unique_conflicts can cause data-bloat, it made me rethink
> that this is actually more prone to causing data-bloat if it is not
> resolved on time, as it seems a far more frequent scenario. So shall
> we keep inserting the record or insert it once and avoid inserting it
> again based on lsn?  Thoughts?

I agree, this is the real problem related to bloat, so maybe we can
check whether the same tuple already exists and avoid inserting it
again, although I haven't thought about how we would distinguish
between a new conflict on the same row and the same conflict being
inserted multiple times due to worker restart.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-01T09:27:53Z

On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > > Few observations related to publication.
> > > > > ------------------------------
> > >
> > > Thanks Shveta, for testing and sharing your thoughts.  IMHO for
> > > conflict log tables it should be good enough if we restrict it when
> > > ALL TABLE options are used, I don't think we need to put extra effort
> > > to completely restrict it even if users want to explicitly list it
> > > into the publication.
> > >
> > > > >
> > > > > (In the below comments, clt/CLT implies Conflict Log Table)
> > > > >
> > > > > 1)
> > > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
> > >
> > > This function is used while publishing every single change and I don't
> > > think we want to add a cost to check each subscription to identify
> > > whether the table is listed as CLT.
> > >
> > > > > 2)
> > > > > '\d+ clt'   shows all-tables publication name. I feel we should not
> > > > > show that for clt.
> > >
> > > I think we should fix this.
> > >
> > > > > 3)
> > > > > I am able to create a publication for clt table, should it be allowed?
> > >
> > > I believe we should not do any specific handling to restrict this but
> > > I am open for the opinions.
> > >
> > > > > create subscription sub1 connection '...' publication pub1
> > > > > WITH(conflict_log_table='clt');
> > > > > create publication pub3 for table clt;
> > > > >
> > > > > 4)
> > > > > Is there a reason we have not made '!IsConflictHistoryRelid' check as
> > > > > part of is_publishable_class() itself? If we do so, other code-logics
> > > > > will also get clt as non-publishable always (and will solve a few of
> > > > > the above issues I think). IIUC, there is no place where we want to
> > > > > mark CLT as publishable or is there any?
> > >
> > > IMHO the main reason is performance.
> > >
> > > > > 5) Also, I feel we can add some documentation now to help others to
> > > > > understand/review the patch better without going through the long
> > > > > thread.
> > >
> > > Make sense, I will do that in the next version.
> > >
> > > > >
> > > > > Few observations related to conflict-logging:
> > > > > ------------------------------
> > > > > 1)
> > > > > I found that for the conflicts which ultimately result in Error, we do
> > > > > not insert any conflict-record in clt.
> > > > >
> > > > > a)
> > > > > Example: insert_exists, update_Exists
> > > > > create table tab1 (i int primary key, j int);
> > > > > sub: insert into tab1 values(30,10);
> > > > > pub: insert into tab1 values(30,10);
> > > > > ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> > > > > No record in clt.
> > > > >
> > > > > sub:
> > > > > <some pre-data needed>
> > > > > update tab1 set i=40 where i = 30;
> > > > > pub: update tab1 set i=40 where i = 20;
> > > > > ERROR:  conflict detected on relation "public.tab1": conflict=update_exists
> > > > > No record in clt.
> > >
> > > Yeah that interesting need to put thought on how to commit this record
> > > when an outer transaction is aborted as we do not have autonomous
> > > transactions which are generally used for this kind of logging.  But
> > > we can explore more options like inserting into conflict log tables
> > > outside the outer transaction.
> > >
> > > > > b)
> > > > > Another question related to this is, since these conflicts (which
> > > > > results in error) keep on happening until user resolves these or skips
> > > > > these or 'disable_on_error' is set. Then are we going to insert these
> > > > > multiple times? We do count these in 'confl_insert_exists' and
> > > > > 'confl_update_exists' everytime, so it makes sense to log those each
> > > > > time in clt as well. Thoughts?
> > >
> > > I think it make sense to insert every time we see the conflict, but it
> > > would be good to have opinion from others as well.
> >
> > Since there is a concern that multiple rows for
> > multiple_unique_conflicts can cause data-bloat, it made me rethink
> > that this is actually more prone to causing data-bloat if it is not
> > resolved on time, as it seems a far more frequent scenario. So shall
> > we keep inserting the record or insert it once and avoid inserting it
> > again based on lsn?  Thoughts?
>
> I agree, this is the real problem related to bloat so maybe we can see
> if the same tuple exists we can avoid inserting it again, although I
> haven't put thought on how to we distinguish between the new conflict
> on the same row vs the same conflict being inserted multiple times due
> to worker restart.
>

If there is consensus on this approach, IMO, it appears safe to rely
on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for
the given 'conflict_type' before we insert a new record.
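
A minimal sketch of that check, in Python for illustration only.  An
in-memory set stands in for what would be a lookup on the conflict log
table, and the second LSN value is made up.

```python
def conflict_key(record):
    """Comparison key: conflict type plus remote origin and commit LSN."""
    return (
        record["conflict_type"],
        record["remote_origin"],
        record["remote_commit_lsn"],
    )


def insert_if_new(seen_keys, clt, record):
    """Append `record` to `clt` unless an equal key was already logged."""
    key = conflict_key(record)
    if key in seen_keys:
        return False
    seen_keys.add(key)
    clt.append(record)
    return True


seen_keys, clt = set(), []
first = {
    "conflict_type": "insert_exists",
    "remote_origin": "pg_16402",
    "remote_commit_lsn": "0/01789AB8",
}
retry = dict(first)                                  # same transaction re-applied
later = dict(first, remote_commit_lsn="0/01789B50")  # hypothetical later commit
results = [insert_if_new(seen_keys, clt, r) for r in (first, retry, later)]
```

Here the retry is skipped while the later commit is logged as a new
conflict.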

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-01T09:41:56Z

On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Since there is a concern that multiple rows for
> > > multiple_unique_conflicts can cause data-bloat, it made me rethink
> > > that this is actually more prone to causing data-bloat if it is not
> > > resolved on time, as it seems a far more frequent scenario. So shall
> > > we keep inserting the record or insert it once and avoid inserting it
> > > again based on lsn?  Thoughts?
> >
> > I agree, this is the real problem related to bloat so maybe we can see
> > if the same tuple exists we can avoid inserting it again, although I
> > haven't put thought on how to we distinguish between the new conflict
> > on the same row vs the same conflict being inserted multiple times due
> > to worker restart.
> >
>
> If there is consensus on this approach, IMO, it appears safe to rely
> on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for
> the given 'conflict_type' before we insert a new record.
>

What happens if, as part of multiple_unique_conflicts, only some of
the rows conflict in the next apply round (say the user has removed a
few conflicting rows in the meantime)? I think the ideal way for users
to avoid such multiple occurrences is to configure the subscription
with disable_on_error. I think we should log conflicts again on retry,
and it is better to keep this consistent with what we print in LOG,
because in the future we may want to give users an option for where to
log the conflicts (in conflict_history_table, LOG, or both).
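
To illustrate the concern, here is a small Python sketch (illustration
only; the helper names are made up).  Two apply attempts of the same
remote transaction share the same origin and commit LSN, but the second
attempt conflicts with fewer local rows.  Skipping inserts based on
those keys hides the second attempt entirely, whereas logging on every
retry records it.

```python
def key_of(row):
    return (row["conflict_type"], row["remote_origin"], row["remote_commit_lsn"])


def log_all(attempts):
    """Record every conflict row seen on every apply attempt."""
    return [row for attempt in attempts for row in attempt]


def log_deduplicated(attempts):
    """Skip rows whose key was already logged by an earlier attempt."""
    seen, logged = set(), []
    for attempt in attempts:
        logged.extend(row for row in attempt if key_of(row) not in seen)
        seen |= {key_of(row) for row in attempt}
    return logged


def row(local):
    return {
        "conflict_type": "multiple_unique_conflicts",
        "remote_origin": "pg_16392",
        "remote_commit_lsn": "0/017FDDA0",
        "local_tuple": local,
    }


first_attempt = [row((2, 2, 2)), row((3, 3, 3)), row((4, 4, 4))]
second_attempt = [row((2, 2, 2))]  # user removed two conflicting rows meanwhile
attempts = [first_attempt, second_attempt]

everything = log_all(attempts)
deduplicated = log_deduplicated(attempts)
```

With de-duplication the table still shows three conflicting rows after
the retry, although only one remains.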

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T10:32:17Z

On Mon, Dec 1, 2025 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > Since there is a concern that multiple rows for
> > > > multiple_unique_conflicts can cause data-bloat, it made me rethink
> > > > that this is actually more prone to causing data-bloat if it is not
> > > > resolved on time, as it seems a far more frequent scenario. So shall
> > > > we keep inserting the record or insert it once and avoid inserting it
> > > > again based on lsn?  Thoughts?
> > >
> > > I agree, this is the real problem related to bloat so maybe we can see
> > > if the same tuple exists we can avoid inserting it again, although I
> > > haven't put thought on how to we distinguish between the new conflict
> > > on the same row vs the same conflict being inserted multiple times due
> > > to worker restart.
> > >
> >
> > If there is consensus on this approach, IMO, it appears safe to rely
> > on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for
> > the given 'conflict_type' before we insert a new record.
> >
>
> What happens if as part of multiple_unique_conflict, in the next apply
> round only some of the rows conflict (say in the meantime user has
> removed a few conflicting rows)? I think the ideal way for users to
> avoid such multiple occurrences is to configure subscription with
> disable_on_error. I think we should LOG errors again on retry and it
> is better to keep it consistent with what we print in LOG because we
> may want to give an option to users in future where to LOG (in
> conflict_history_table, LOG, or both) the conflicts.
>

Yeah, that makes sense, because if the user tried to fix the conflict
and it still didn't get fixed, then from the next time onward the user
would have no way to know that the conflict reoccurred.  It also makes
sense to maintain consistency with LOGs.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T10:52:17Z

On Fri, Nov 28, 2025 at 6:06 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Some review comments for v8-0001.

Thanks Peter, yes, these all make sense and I will fix them in the next
version along with the other comments by Vignesh, Shveta, and Amit,
except for one comment:

> 9.
> +-- ok - conflict_log_table should not be published with ALL TABLE
> +CREATE PUBLICATION pub FOR TABLES IN SCHEMA clt;
> +SELECT * FROM pg_publication_tables WHERE pubname = 'pub';
> + pubname | schemaname | tablename | attnames | rowfilter
> +---------+------------+-----------+----------+-----------
> +(0 rows)
>
> Perhaps you should repeat this same test but using FOR ALL TABLES,
> instead of only FOR TABLES IN SCHEMA

I will have to see how we can safely do this in testing without having
any side effects on concurrent tests.  Generally we run
publication.sql and subscription.sql concurrently in the regression
tests, so if we do FOR ALL TABLES they can affect each other.  One
option is to not run these two tests concurrently; I think we can do
that, as there is no real concurrency we are testing by running them
concurrently.  Any thoughts on this?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T05:07:41Z

On Fri, Nov 28, 2025 at 2:32 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> >
> > I have fixed all these comments and also the comments of 0002, now I
> > feel we can actually merge 0001 and 0002, so I have merged both of
> > them.
> >
> > Now pending work status
> > 1) fixed review comments of 0003
> > 2) Run pgindent -- planning to do it after we complete the first level
> > of review
> > 3) Subscription TAP test for logging the actual conflicts
> >
>
> Thanks  for the patch. A few observations:
>
> 1)
> It seems, as per LOG, 'key' and 'replica-identity' are different when
> it comes to insert_exists, update_exists and
> multiple_unique_conflicts, while I believe in CLT, key is
> replica-identity i.e. there are no 2 separate terms. Please see below:
>
> a)
> Update_Exists:
> 2025-11-28 14:08:56.179 IST [60383] ERROR:  conflict detected on
> relation "public.tab1": conflict=update_exists
> 2025-11-28 14:08:56.179 IST [60383] DETAIL:  Key already exists in
> unique index "tab1_pkey", modified locally in transaction 790 at
> 2025-11-28 14:07:17.578887+05:30.
> Key (i)=(40); existing local row (40, 10); remote row (40, 200);
> replica identity (i)=(20).
>
> postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple
> from clt where conflict_type='update_exists';
>  conflict_type | key_tuple |   local_tuple   |   remote_tuple
> ---------------+-----------+-----------------+------------------
>  update_exists | {"i":20}  | {"i":40,"j":10} | {"i":40,"j":200}
>
> b)
> insert_Exists:
> ERROR:  conflict detected on relation "public.tab1": conflict=insert_exists
> DETAIL:  Key already exists in unique index "tab1_pkey", modified
> locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30.
> Key (i)=(30); existing local row (30, 10); remote row (30, 10).
>
> postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt;
>  conflict_type  | key_tuple |   local_tuple   |  remote_tuple
> ----------------+-----------+-----------------+-----------------
>  insert_exists  |               | {"i":30,"j":10} | {"i":30,"j":10}
>
> case a) has key_tuple same as replica-identity of LOG
> case b) does not have replica-identity and thus key_tuple is NULL.
>
> Does that mean we need to maintain both key_tuple and RI separately in
> CLT? Thoughts?
>

Yeah, it could be useful to display RI values separately. What should
be the column name? A few options could be: remote_val_for_ri, or
remote_value_ri, or something else. I think it may also be useful to
display conflicting_index, but OTOH, it would be difficult to decide in
the first version what other information could be required, so it is
better to stick with what is being displayed in LOG.
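
As a rough picture of what separate columns could look like, here is a
Python sketch (illustration only).  It assumes key_tuple would carry
the conflicting key shown as "Key (...)" in LOG and uses
remote_val_for_ri, one of the candidate names above, for the replica
identity; neither choice is settled.

```python
import json


def make_clt_row(conflict_type, key, local_row, remote_row, replica_identity=None):
    """Build a conflict-log row with key and replica identity kept apart."""

    def as_json(value):
        return None if value is None else json.dumps(value)

    return {
        "conflict_type": conflict_type,
        "key_tuple": as_json(key),
        "remote_val_for_ri": as_json(replica_identity),  # candidate name only
        "local_tuple": as_json(local_row),
        "remote_tuple": as_json(remote_row),
    }


# update_exists: LOG shows both a conflicting key and a replica identity.
update_row = make_clt_row(
    "update_exists",
    key={"i": 40},
    local_row={"i": 40, "j": 10},
    remote_row={"i": 40, "j": 200},
    replica_identity={"i": 20},
)

# insert_exists: LOG shows a conflicting key but no replica identity.
insert_row = make_clt_row(
    "insert_exists",
    key={"i": 30},
    local_row={"i": 30, "j": 10},
    remote_row={"i": 30, "j": 10},
)
```

With this layout the insert_exists row keeps its key while the replica
identity column stays NULL.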

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T05:15:42Z

On Wed, Nov 19, 2025 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
>
> > 3)
> > Do we need to have a timestamp column as well to say when conflict was
> > recorded? Or local_commit_ts, remote_commit_ts are sufficient?
> > Thoughts
>
> You mean we can record the timestamp now while inserting, not sure if
> it will add some more meaningful information than remote_commit_ts,
> but let's see what others think.
>

local_commit_ts and remote_commit_ts sound sufficient, as one can
identify the truth of the information from those two. The key/schema
values displayed in this table could change later, but the information
about a particular row is based on the time shown by those two
columns.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T06:08:23Z

On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> The specific scenario we are discussing is when a single row from the
> publisher attempts to apply an operation that causes a conflict across
> multiple unique keys, with each of those unique key violations
> conflicting with a different local row on the subscriber, is very
> rare.  IMHO this low-frequency scenario does not justify
> overcomplicating the design with an array field or a multi-level
> table.
>

I did some analysis and searched the internet to answer your two
questions below.

> Consider the infrequency of the root causes:
> - How often does a table have more than 3 to 4 unique keys?

It is extremely common; in fact, it is considered the industry "best
practice" for modern database design.

One can find this pattern in almost every enterprise system (e.g.
banking apps, CRMs). It relies on distinguishing between Technical
Identity (for the database) and Business Identity (for the real
world).

1. The Design Pattern: Surrogate vs. Natural Keys
Primary Key (Surrogate Key): Usually a meaningless number (e.g.,
10452) or a UUID. It is used strictly for the database to join tables
efficiently. It never changes.
Unique Key (Natural Key): A real-world value (e.g., john@email.com or
SSN-123). This is how humans or external systems identify the row. It
can change (e.g., someone updates their email).

2. Common Real-World Use Cases
A. User Management (The most classic example)
Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table.
Unique Key 1: email (Varchar). Prevents two people from registering
with the same email.
Unique Key 2: username (Varchar). Ensures unique display names.
Why? If a user changes their email address, you only update one field
in one table. If you used email as the Primary Key, you would have to
update millions of rows in the ORDERS table that reference that email.

B. Inventory / E-Commerce
Primary Key: product_id (Integer). Used internally by the code.
Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC).
Why? Companies often re-organize their SKU formats. If the SKU was the
Primary Key, a format change would require a massive database
migration.

C. Government / HR Systems
Primary Key: employee_id (Integer).
Unique Key: National_ID (SSN, Aadhaar, Passport Number).
Why? Privacy and security. You do not want to expose a National ID in
every URL or API call (e.g., api/employee/552 is safer than
api/employee/SSN-123).

> - How frequently would each of these keys conflict with a unique row
> on the subscriber side?
>

It can occur with medium-to-high probability in the following cases. (a)
In bi-directional replication systems: for example, if two users
create the same "User Profile" on two different servers at the same
time, the row will conflict on every unique field (ID, Email, SSN)
simultaneously. (b) The chances of bloat are high when retrying to fix
the error, as mentioned by Shveta. Say the Ops team fixes errors by
just "trying again" without checking the full row: they will hit the ID
error, fix it, then immediately hit the Email error. (c) The chances
are medium during initial data load: if a user is loading data from a
legacy system with "dirty" data, rows often violate multiple rules
(e.g., a duplicate user with both a reused ID and a reused Email).

> If resolving this occasional, synthetic conflict requires inserting
> two or three rows instead of a single one, this is an acceptable
> trade-off considering how rare it can occur.
>

As per the above analysis and the retry point Shveta raised, I don't
think we can ignore the possibility of data bloat, especially for this
multiple_unique_key conflict. We can consider logging multiple local
conflicting rows as a JSON array.
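
To make the multiple-unique-key case concrete, here is a minimal Python
sketch (not PostgreSQL code). It checks one incoming row against a local
table that has several single-column unique keys and gathers every
conflicting local row into a single log entry. The function name, the
columns a/b/c, the sample rows, and the field names of the log entry are
all hypothetical.

```python
import json


def find_unique_conflicts(incoming, local_rows, unique_keys):
    """Return the local rows that collide with `incoming` on any unique key.

    Each unique key is a tuple of column names. A local row is reported
    once even if it collides on more than one key.
    """
    conflicts = []
    for row in local_rows:
        for key in unique_keys:
            if all(row[col] == incoming[col] for col in key):
                conflicts.append(row)
                break  # this local row is already recorded
    return conflicts


# Hypothetical subscriber table with three unique keys: (a), (b), (c).
local_rows = [
    {"a": 2, "b": 2, "c": 2},
    {"a": 3, "b": 3, "c": 3},
    {"a": 4, "b": 4, "c": 4},
    {"a": 5, "b": 5, "c": 5},
]
unique_keys = [("a",), ("b",), ("c",)]
incoming = {"a": 2, "b": 3, "c": 4}

conflicts = find_unique_conflicts(incoming, local_rows, unique_keys)

# One log entry carrying all conflicting local rows as a JSON array,
# instead of one log row per conflicting local row.
log_entry = {
    "conflict_type": "multiple_unique_key",
    "remote_tuple": incoming,
    "local_tuples": conflicts,
}
print(json.dumps(log_entry))
```

With one array-valued entry, a retry that hits the same remote row adds
one log row rather than one per conflicting local row.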

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T06:36:37Z

On Tue, Dec 2, 2025 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > The specific scenario we are discussing is when a single row from the
> > publisher attempts to apply an operation that causes a conflict across
> > multiple unique keys, with each of those unique key violations
> > conflicting with a different local row on the subscriber, is very
> > rare.  IMHO this low-frequency scenario does not justify
> > overcomplicating the design with an array field or a multi-level
> > table.
> >
>
> I did some analysis and search on the internet to answer your
> following two questions.
>
> > Consider the infrequency of the root causes:
> > - How often does a table have more than 3 to 4 unique keys?
>
> It is extremely common—in fact, it is considered the industry "best
> practice" for modern database design.
>
> One can find this pattern in almost every enterprise system (e.g.
> banking apps, CRMs). It relies on distinguishing between Technical
> Identity (for the database) and Business Identity (for the real
> world).
>
> 1. The Design Pattern: Surrogate vs. Natural Keys
> Primary Key (Surrogate Key): Usually a meaningless number (e.g.,
> 10452) or a UUID. It is used strictly for the database to join tables
> efficiently. It never changes.
> Unique Key (Natural Key): A real-world value (e.g., john@email.com or
> SSN-123). This is how humans or external systems identify the row. It
> can change (e.g., someone updates their email).
>
> 2. Common Real-World Use Cases
> A. User Management (The most classic example)
> Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table.
> Unique Key 1: email (Varchar). Prevents two people from registering
> with the same email.
> Unique Key 2: username (Varchar). Ensures unique display names.
> Why? If a user changes their email address, you only update one field
> in one table. If you used email as the Primary Key, you would have to
> update millions of rows in the ORDERS table that reference that email.
>
> B. Inventory / E-Commerce
> Primary Key: product_id (Integer). Used internally by the code.
> Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC).
> Why? Companies often re-organize their SKU formats. If the SKU was the
> Primary Key, a format change would require a massive database
> migration.
>
> C. Government / HR Systems
> Primary Key: employee_id (Integer).
> Unique Key: National_ID (SSN, Aadhaar, Passport Number).
> Why? Privacy and security. You do not want to expose a National ID in
> every URL or API call (e.g., api/employee/552 is safer than
> api/employee/SSN-123).
>
> > - How frequently would each of these keys conflict with a unique row
> > on the subscriber side?
> >
>
> It can occur with medium-to-high probability in following cases. (a)
> In Bi-Directional replication systems; for example, If two users
> create the same "User Profile" on two different servers at the same
> time, the row will conflict on every unique field (ID, Email, SSN)
> simultaneously. (b) The chances of bloat are high, on retrying to fix
> the error as mentioned by Shveta. Say, if Ops team fixes errors by
> just "trying again" without checking the full row, you will hit the ID
> error, fix it, then immediately hit the Email error. (c) The chances
> are medium during initial data-load; If a user is loading data from a
> legacy system with "dirty" data, rows often violate multiple rules
> (e.g., a duplicate user with both a reused ID and a reused Email).
>
> > If resolving this occasional, synthetic conflict requires inserting
> > two or three rows instead of a single one, this is an acceptable
> > trade-off considering how rare it can occur.
> >
>
> As per above analysis and the re-try point Shveta raises, I don't
> think we can ignore the possibility of data-bloat especially for this
> multiple_unique_key conflict. We can consider logging multiple local
> conflicting rows as JSON Array.

Okay, I will try to store multiple local rows as a JSON array in the next version.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T07:08:01Z

On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > The specific scenario we are discussing is when a single row from the
> > > publisher attempts to apply an operation that causes a conflict across
> > > multiple unique keys, with each of those unique key violations
> > > conflicting with a different local row on the subscriber, is very
> > > rare.  IMHO this low-frequency scenario does not justify
> > > overcomplicating the design with an array field or a multi-level
> > > table.
> > >
> >
> > I did some analysis and search on the internet to answer your
> > following two questions.
> >
> > > Consider the infrequency of the root causes:
> > > - How often does a table have more than 3 to 4 unique keys?
> >
> > It is extremely common—in fact, it is considered the industry "best
> > practice" for modern database design.
> >
> > One can find this pattern in almost every enterprise system (e.g.
> > banking apps, CRMs). It relies on distinguishing between Technical
> > Identity (for the database) and Business Identity (for the real
> > world).
> >
> > 1. The Design Pattern: Surrogate vs. Natural Keys
> > Primary Key (Surrogate Key): Usually a meaningless number (e.g.,
> > 10452) or a UUID. It is used strictly for the database to join tables
> > efficiently. It never changes.
> > Unique Key (Natural Key): A real-world value (e.g., john@email.com or
> > SSN-123). This is how humans or external systems identify the row. It
> > can change (e.g., someone updates their email).
> >
> > 2. Common Real-World Use Cases
> > A. User Management (The most classic example)
> > Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table.
> > Unique Key 1: email (Varchar). Prevents two people from registering
> > with the same email.
> > Unique Key 2: username (Varchar). Ensures unique display names.
> > Why? If a user changes their email address, you only update one field
> > in one table. If you used email as the Primary Key, you would have to
> > update millions of rows in the ORDERS table that reference that email.
> >
> > B. Inventory / E-Commerce
> > Primary Key: product_id (Integer). Used internally by the code.
> > Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC).
> > Why? Companies often re-organize their SKU formats. If the SKU was the
> > Primary Key, a format change would require a massive database
> > migration.
> >
> > C. Government / HR Systems
> > Primary Key: employee_id (Integer).
> > Unique Key: National_ID (SSN, Aadhaar, Passport Number).
> > Why? Privacy and security. You do not want to expose a National ID in
> > every URL or API call (e.g., api/employee/552 is safer than
> > api/employee/SSN-123).
> >
> > > - How frequently would each of these keys conflict with a unique row
> > > on the subscriber side?
> > >
> >
> > It can occur with medium-to-high probability in following cases. (a)
> > In Bi-Directional replication systems; for example, If two users
> > create the same "User Profile" on two different servers at the same
> > time, the row will conflict on every unique field (ID, Email, SSN)
> > simultaneously. (b) The chances of bloat are high, on retrying to fix
> > the error as mentioned by Shveta. Say, if Ops team fixes errors by
> > just "trying again" without checking the full row, you will hit the ID
> > error, fix it, then immediately hit the Email error. (c) The chances
> > are medium during initial data-load; If a user is loading data from a
> > legacy system with "dirty" data, rows often violate multiple rules
> > (e.g., a duplicate user with both a reused ID and a reused Email).
> >
> > > If resolving this occasional, synthetic conflict requires inserting
> > > two or three rows instead of a single one, this is an acceptable
> > > trade-off considering how rare it can occur.
> > >
> >
> > As per above analysis and the re-try point Shveta raises, I don't
> > think we can ignore the possibility of data-bloat especially for this
> > multiple_unique_key conflict. We can consider logging multiple local
> > conflicting rows as JSON Array.
>
> Okay, I will try to make multiple local rows as JSON Array in the next version.
>
Just to clarify so that we are on the same page: along with the local
tuple, the other local fields such as local_xid, local_commit_ts, and
local_origin will also be converted into arrays.  Hope that makes
sense?

So we will change the table like this. I am not sure whether it makes
sense to keep all the local array fields together in the table, or to
keep each one next to its respective remote field, as we do now with
remote_xid and local_xid.

      Column       |            Type            | Collation | Nullable | Default
-------------------+----------------------------+-----------+----------+---------
 relid             | oid                        |           |          |
 schemaname        | text                       |           |          |
 relname           | text                       |           |          |
 conflict_type     | text                       |           |          |
 local_xid         | xid[]                      |           |          |
 remote_xid        | xid                        |           |          |
 remote_commit_lsn | pg_lsn                     |           |          |
 local_commit_ts   | timestamp with time zone[] |           |          |
 remote_commit_ts  | timestamp with time zone   |           |          |
 local_origin      | text[]                     |           |          |
 remote_origin     | text                       |           |          |
 key_tuple         | json                       |           |          |
 local_tuple       | json[]                     |           |          |
 remote_tuple      | json                       |           |          |

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T09:17:42Z

On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> >
> > Okay, I will try to make multiple local rows as JSON Array in the next version.
> >
> Just to clarify so that we are on the same page, along with the local
> tuple the other local fields like local_xid, local_commit_ts,
> local_origin will also be converted into the array.  Hope that makes
> sense?
>

Yes, what about key_tuple or RI?

> So we will change the table like this, not sure if this makes sense to
> keep all local array fields nearby in the table, or let it be near the
> respective remote field, like we are doing now remote_xid and local
> xid together etc.
>

It is better to keep the array fields together at the end. I think that
would be easier to read via the CLI. Also, it may take more space due to
padding/alignment if we store fixed-width and variable-width columns
interleaved, and similarly access will be slower in the interleaved
case.

Having said that, can we consider an alternative way to store all
local_conflict_info together as a JSONB column (that can be used to
store an array of objects). For example, the multiple conflicting
tuple information can be stored as:

[
{ "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
"node_A", "tuple": { "id": 1, "email": "a@b.com" } },
{ "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
"node_B", "tuple": { "id": 2, "phone": "555-0199" } }
]

To access JSON array columns, I think one needs to use the unnest
function, whereas JSONB could be accessed with something like:
SELECT * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]';
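
For readers unfamiliar with jsonb containment, the following Python
sketch approximates what the @> filter above would match on the sample
array. It is a simplified emulation written for illustration, not
PostgreSQL's implementation: objects must contain every key/value pair
of the probe, and every element of a probe array must be contained in
some element of the stored array. The special case where an array
contains a bare scalar is deliberately not handled, and the function
name is hypothetical.

```python
def jsonb_contains(left, right):
    """Approximate `left @> right` for values decoded from JSON."""
    if isinstance(right, dict):
        # Every key of the probe must exist in the stored object,
        # and its value must be contained recursively.
        return isinstance(left, dict) and all(
            key in left and jsonb_contains(left[key], value)
            for key, value in right.items()
        )
    if isinstance(right, list):
        # Every probe element must match some stored element;
        # order and duplicates do not matter.
        return isinstance(left, list) and all(
            any(jsonb_contains(item, probe) for item in left)
            for probe in right
        )
    # Scalars must simply be equal.
    return left == right


local_conflicts = [
    {"xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": "node_A",
     "tuple": {"id": 1, "email": "a@b.com"}},
    {"xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": "node_B",
     "tuple": {"id": 2, "phone": "555-0199"}},
]

print(jsonb_contains(local_conflicts, [{"xid": "1001"}]))
print(jsonb_contains(local_conflicts, [{"xid": "9999"}]))
```

The first probe matches because one array element has "xid": "1001"; the
second does not match any element.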

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T11:15:24Z

On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > >
> > > Okay, I will try to make multiple local rows as JSON Array in the next version.
> > >
> > Just to clarify so that we are on the same page, along with the local
> > tuple the other local fields like local_xid, local_commit_ts,
> > local_origin will also be converted into the array.  Hope that makes
> > sense?
> >
>
> Yes, what about key_tuple or RI?
>
> > So we will change the table like this, not sure if this makes sense to
> > keep all local array fields nearby in the table, or let it be near the
> > respective remote field, like we are doing now remote_xid and local
> > xid together etc.
> >
>
> It is better to keep the array fields together at the end. I think it
> would be better to read via CLI. Also, it may take more space due to
> padding/alignment if we store fixed-width and variable-width columns
> interleaved and similarly the access will also be slower for
> interleaved cases.
>
> Having said that, can we consider an alternative way to store all
> local_conflict_info together as a JSONB column (that can be used to
> store an array of objects). For example, the multiple conflicting
> tuple information can be stored as:
>
> [
> { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
> "node_A", "tuple": { "id": 1, "email": "a@b.com" } },
> { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
> "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
> ]
>
> To access JSON array columns, I think one needs to use the unnest
> function, whereas JSONB could be accessed with something like: "SELECT
> * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]".

Yeah we can do that as well, maybe that's a better idea compared to
creating separate array fields for each local element.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T15:10:15Z

On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > >
> > > > Okay, I will try to make multiple local rows as JSON Array in the next version.
> > > >
> > > Just to clarify so that we are on the same page, along with the local
> > > tuple the other local fields like local_xid, local_commit_ts,
> > > local_origin will also be converted into the array.  Hope that makes
> > > sense?
> > >
> >
> > Yes, what about key_tuple or RI?
> >
> > > So we will change the table like this, not sure if this makes sense to
> > > keep all local array fields nearby in the table, or let it be near the
> > > respective remote field, like we are doing now remote_xid and local
> > > xid together etc.
> > >
> >
> > It is better to keep the array fields together at the end. I think it
> > would be better to read via CLI. Also, it may take more space due to
> > padding/alignment if we store fixed-width and variable-width columns
> > interleaved and similarly the access will also be slower for
> > interleaved cases.
> >
> > Having said that, can we consider an alternative way to store all
> > local_conflict_info together as a JSONB column (that can be used to
> > store an array of objects). For example, the multiple conflicting
> > tuple information can be stored as:
> >
> > [
> > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
> > "node_A", "tuple": { "id": 1, "email": "a@b.com" } },
> > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
> > "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
> > ]
> >
> > To access JSON array columns, I think one needs to use the unnest
> > function, whereas JSONB could be accessed with something like: "SELECT
> > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]".
>
> Yeah we can do that as well, maybe that's a better idea compared to
> creating separate array fields for each local element.

I tried a POC of this approach and tested it with one of the test
cases given by Shveta; the conflict log table entry now looks like
this.  The local_conflicts field is an array of JSON, and each entry
of the array is built from (xid, commit_ts, origin, json tuple).  I
will send the updated patch by tomorrow after doing some more cleanup
and testing.

relid             | 16391
schemaname        | public
relname           | conf_tab
conflict_type     | multiple_unique_conflicts
remote_xid        | 761
remote_commit_lsn | 0/01761400
remote_commit_ts  | 2025-12-02 15:02:07.045935+00
remote_origin     | pg_16406
key_tuple         |
remote_tuple      | {"a":2,"b":3,"c":4}
local_conflicts   |
{"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
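
The local_conflicts value above is the text form of a json[] array, so
each element is double-quoted and its inner quotes are backslash-escaped.
As a rough illustration of how that text can be read back, here is a
minimal Python sketch that splits a one-dimensional array literal and
decodes each element as JSON. The helper names are hypothetical, nested
arrays are not handled, and client drivers typically decode arrays
themselves, so this is only meant to show the escaping.

```python
import json


def _finish(chars, quoted):
    """Turn collected characters into an element value."""
    value = "".join(chars)
    # An unquoted NULL is the SQL null, not the string "NULL".
    return None if (value == "NULL" and not quoted) else value


def parse_pg_text_array(text):
    """Split a one-dimensional PostgreSQL array literal into strings."""
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("not a PostgreSQL array literal")
    body = text[1:-1]
    elements, current = [], []
    in_quotes = was_quoted = False
    i = 0
    while i < len(body):
        ch = body[i]
        if in_quotes:
            if ch == "\\":
                i += 1                  # keep the escaped character as-is
                current.append(body[i])
            elif ch == '"':
                in_quotes = False       # closing quote of this element
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = was_quoted = True
        elif ch == ",":
            elements.append(_finish(current, was_quoted))
            current, was_quoted = [], False
        else:
            current.append(ch)
        i += 1
    if current or was_quoted:
        elements.append(_finish(current, was_quoted))
    return elements


# The value from the entry above, joined into one string.
raw = (
    r'{"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",'
    r'\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}",'
    r'"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",'
    r'\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}",'
    r'"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",'
    r'\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}'
)

entries = [json.loads(item) for item in parse_pg_text_array(raw)]
for entry in entries:
    print(entry["xid"], entry["commit_ts"], entry["tuple"])
```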


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-03T04:19:10Z

On Tue, Dec 2, 2025 at 8:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Okay, I will try to make multiple local rows as JSON Array in the next version.
> > > > >
> > > > Just to clarify so that we are on the same page, along with the local
> > > > tuple the other local fields like local_xid, local_commit_ts,
> > > > local_origin will also be converted into the array.  Hope that makes
> > > > sense?
> > > >
> > >
> > > Yes, what about key_tuple or RI?
> > >
> > > > So we will change the table like this, not sure if this makes sense to
> > > > keep all local array fields nearby in the table, or let it be near the
> > > > respective remote field, like we are doing now remote_xid and local
> > > > xid together etc.
> > > >
> > >
> > > It is better to keep the array fields together at the end. I think it
> > > would be better to read via CLI. Also, it may take more space due to
> > > padding/alignment if we store fixed-width and variable-width columns
> > > interleaved and similarly the access will also be slower for
> > > interleaved cases.
> > >
> > > Having said that, can we consider an alternative way to store all
> > > local_conflict_info together as a JSONB column (that can be used to
> > > store an array of objects). For example, the multiple conflicting
> > > tuple information can be stored as:
> > >
> > > [
> > > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
> > > "node_A", "tuple": { "id": 1, "email": "a@b.com" } },
> > > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
> > > "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
> > > ]
> > >
> > > To access JSON array columns, I think one needs to use the unnest
> > > function, whereas JSONB could be accessed with something like: "SELECT
> > > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]".
> >
> > Yeah we can do that as well, maybe that's a better idea compared to
> > creating separate array fields for each local element.
>
> So I tried the POC idea with this approach and tested with one of the
> test cases given by Shveta, and now the conflict log table entry looks
> like this.  So we can see the local conflicts field which is an array
> of JSON and each entry of the array is formed using (xid, commit_ts,
> origin, json tuple).  I will send the updated patch by tomorrow after
> doing some more cleanup and testing.
>
> relid             | 16391
> schemaname        | public
> relname           | conf_tab
> conflict_type     | multiple_unique_conflicts
> remote_xid        | 761
> remote_commit_lsn | 0/01761400
> remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> remote_origin     | pg_16406
> key_tuple         |
> remote_tuple      | {"a":2,"b":3,"c":4}
> local_conflicts   |
> {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
>

Thanks, it looks good. For the benefit of others, could you include a
brief note, perhaps in the commit message for now, describing how to
access or read this array column? We can remove it later.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-03T11:26:49Z

On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > relid             | 16391
> > schemaname        | public
> > relname           | conf_tab
> > conflict_type     | multiple_unique_conflicts
> > remote_xid        | 761
> > remote_commit_lsn | 0/01761400
> > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > remote_origin     | pg_16406
> > key_tuple         |
> > remote_tuple      | {"a":2,"b":3,"c":4}
> > local_conflicts   |
> > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> >
>
> Thanks, it looks good. For the benefit of others, could you include a
> brief note, perhaps in the commit message for now, describing how to
> access or read this array column? We can remove it later.

Thanks. Okay, for now I have added a note in the commit message on how
to fetch the data from the JSON array field.  In the next version I
will add a test that stores a conflict in the conflict log history
table and fetches it back.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-04T02:00:47Z

On Wed, Dec 3, 2025 at 3:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > relid             | 16391
> > > schemaname        | public
> > > relname           | conf_tab
> > > conflict_type     | multiple_unique_conflicts
> > > remote_xid        | 761
> > > remote_commit_lsn | 0/01761400
> > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > remote_origin     | pg_16406
> > > key_tuple         |
> > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > local_conflicts   |
> > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > >
> >
> > Thanks, it looks good. For the benefit of others, could you include a
> > brief note, perhaps in the commit message for now, describing how to
> > access or read this array column? We can remove it later.
>
> Thanks, okay, temporarily I have added in a commit message how we can
> fetch the data from the JSON array field.  In next version I will add
> a test to get the conflict stored in conflict log history table and
> fetch from it.
>

I've reviewed the v9 patch and here are some comments:

The patch utilizes SPI for creating and dropping the conflict history
table, but I'm really not sure if it's okay because it's actually
affected by some GUC parameters such as default_tablespace and
default_toast_compression etc. Also, probably some hooks and event
triggers could be fired during the creation and removal. Is it
intentional behavior? I'm concerned that it would make investigation
harder if an issue happened in the user environment.

---
+   /* build and execute the CREATE TABLE query. */
+   appendStringInfo(&querybuf,
+                    "CREATE TABLE %s.%s ("
+                    "relid Oid,"
+                    "schemaname TEXT,"
+                    "relname TEXT,"
+                    "conflict_type TEXT,"
+                    "remote_xid xid,"
+                    "remote_commit_lsn pg_lsn,"
+                    "remote_commit_ts TIMESTAMPTZ,"
+                    "remote_origin TEXT,"
+                    "key_tuple     JSON,"
+                    "remote_tuple  JSON,"
+                    "local_conflicts JSON[])",
+                    quote_identifier(get_namespace_name(namespaceId)),
+                    quote_identifier(conflictrel));

If we want to use SPI for history table creation, we should use
qualified names in all the places including data types.
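
As an illustration of this point, here is a small Python sketch of the
statement text with every data type schema-qualified, so that a
same-named type earlier in search_path cannot be picked up instead. The
column list mirrors the hunk above; the helper names are hypothetical,
and unlike the backend's quote_identifier() this helper always quotes.
It only builds the string and is not the patch's code.

```python
def quote_ident(name):
    """Always double-quote an identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


# Column names from the quoted hunk, with pg_catalog-qualified types.
COLUMNS = [
    ("relid", "pg_catalog.oid"),
    ("schemaname", "pg_catalog.text"),
    ("relname", "pg_catalog.text"),
    ("conflict_type", "pg_catalog.text"),
    ("remote_xid", "pg_catalog.xid"),
    ("remote_commit_lsn", "pg_catalog.pg_lsn"),
    ("remote_commit_ts", "pg_catalog.timestamptz"),
    ("remote_origin", "pg_catalog.text"),
    ("key_tuple", "pg_catalog.json"),
    ("remote_tuple", "pg_catalog.json"),
    ("local_conflicts", "pg_catalog.json[]"),
]


def build_create_table(schema, table):
    """Render CREATE TABLE with quoted names and qualified types."""
    cols = ", ".join(f"{quote_ident(name)} {typ}" for name, typ in COLUMNS)
    return f"CREATE TABLE {quote_ident(schema)}.{quote_ident(table)} ({cols})"


sql = build_create_table("public", "conflict_log")
print(sql)
```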

---
The patch doesn't create a dependency between the subscription and
the conflict history table. So users can drop the entire schema
(with the CASCADE option) in which the history table was created. And
once the schema is dropped along with the history table, ALTER SUBSCRIPTION
... SET (conflict_history_table = '') does not seem to work (I got a
SEGV).

---
We can create the history table in pg_temp namespace but it should not
be allowed.

---
I think the conflict history table should not be transferred to the
new cluster by pg_upgrade, since the table definition could be
different across major versions.

I got the following log when the publisher disables track_commit_timestamp:

local_conflicts   |
{"{\"xid\":\"790\",\"commit_ts\":\"1999-12-31T16:00:00-08:00\",\"origin\":\"\",\"tuple\":{\"c\":1}}"}

I think we can omit commit_ts when it's not available.
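
The value 1999-12-31T16:00:00-08:00 is what a zero timestamp looks like
when rendered in a UTC-8 time zone, since PostgreSQL timestamps count
microseconds from 2000-01-01 00:00:00 UTC. The Python sketch below shows
that arithmetic and one possible way to leave the field out. Treating 0
as "not available" and the function names are assumptions made for
illustration, not the patch's behavior.

```python
from datetime import datetime, timedelta, timezone

# PostgreSQL timestamps are microseconds since 2000-01-01 00:00:00 UTC.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def pg_timestamptz(microseconds):
    """Convert a raw microsecond count into an aware datetime."""
    return PG_EPOCH + timedelta(microseconds=microseconds)


def local_conflict_entry(xid, commit_ts_us, origin, tup):
    """Build one local_conflicts element, skipping an unset commit_ts."""
    entry = {"xid": str(xid)}
    if commit_ts_us != 0:  # assumption: 0 means no commit timestamp
        entry["commit_ts"] = pg_timestamptz(commit_ts_us).isoformat()
    entry["origin"] = origin
    entry["tuple"] = tup
    return entry


utc_minus_8 = timezone(timedelta(hours=-8))
print(pg_timestamptz(0).astimezone(utc_minus_8).isoformat())
print(local_conflict_entry(790, 0, "", {"c": 1}))
```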

---
I think we should keep the history table name case-sensitive:

postgres(1:351685)=# create subscription sub connection
'dbname=postgres port=5551' publication pub with (conflict_log_table =
'LOGTABLE');
CREATE SUBSCRIPTION
postgres(1:351685)=# \d
          List of relations
 Schema |   Name   | Type  |  Owner
--------+----------+-------+----------
 public | test     | table | masahiko
 public | logtable | table | masahiko
(2 rows)
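
The listing above is consistent with the option value being parsed like
an unquoted SQL identifier, which PostgreSQL folds to lower case. A tiny
Python sketch of that folding rule, for illustration only: the function
name is hypothetical, and str.lower() is only an approximation of the
server's folding for non-ASCII characters.

```python
def fold_identifier(name):
    """Mimic SQL identifier handling: quoted keeps case, unquoted is lowered."""
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        # Strip the surrounding quotes and undo doubled quotes.
        return name[1:-1].replace('""', '"')
    return name.lower()


print(fold_identifier("LOGTABLE"))
print(fold_identifier('"LOGTABLE"'))
```

Keeping the name case-sensitive would mean storing the option value as
given rather than passing it through this kind of folding.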

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-04T05:18:42Z

On Thu, Dec 4, 2025 at 7:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 3:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > relid             | 16391
> > > > schemaname        | public
> > > > relname           | conf_tab
> > > > conflict_type     | multiple_unique_conflicts
> > > > remote_xid        | 761
> > > > remote_commit_lsn | 0/01761400
> > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > remote_origin     | pg_16406
> > > > key_tuple         |
> > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > local_conflicts   |
> > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > >
> > >
> > > Thanks, it looks good. For the benefit of others, could you include a
> > > brief note, perhaps in the commit message for now, describing how to
> > > access or read this array column? We can remove it later.
> >
> > Thanks, okay, temporarily I have added in a commit message how we can
> > fetch the data from the JSON array field.  In next version I will add
> > a test to get the conflict stored in conflict log history table and
> > fetch from it.
> >
>
> I've reviewed the v9 patch and here are some comments:

Thanks for reviewing this and your valuable comments.

> The patch utilizes SPI for creating and dropping the conflict history
> table, but I'm really not sure if it's okay because it's actually
> affected by some GUC parameters such as default_tablespace and
> default_toast_compression etc. Also, probably some hooks and event
> triggers could be fired during the creation and removal. Is it
> intentional behavior? I'm concerned that it would make investigation
> harder if an issue happened in the user environment.

Hmm, interesting point. We can control the values of the default
parameters while creating the table using SPI, but I don't see any
reason not to use heap_create_with_catalog() directly, so maybe that's
a better choice than SPI because then we don't need to worry about
event triggers, utility hooks, etc.  That said, I don't see any
specific issue with SPI, unless the user intentionally wants to
create trouble while creating this table.  What do others think about
it?

> ---
> +   /* build and execute the CREATE TABLE query. */
> +   appendStringInfo(&querybuf,
> +                    "CREATE TABLE %s.%s ("
> +                    "relid Oid,"
> +                    "schemaname TEXT,"
> +                    "relname TEXT,"
> +                    "conflict_type TEXT,"
> +                    "remote_xid xid,"
> +                    "remote_commit_lsn pg_lsn,"
> +                    "remote_commit_ts TIMESTAMPTZ,"
> +                    "remote_origin TEXT,"
> +                    "key_tuple     JSON,"
> +                    "remote_tuple  JSON,"
> +                    "local_conflicts JSON[])",
> +                    quote_identifier(get_namespace_name(namespaceId)),
> +                    quote_identifier(conflictrel));
>
> If we want to use SPI for history table creation, we should use
> qualified names in all the places including data types.

That's true, so that we can avoid interference of any user created types.

> ---
> The patch doesn't create the dependency between the subscription and
> the conflict history table. So users can entirely drop the schema
> (with CASCADE option) where the history table is created.

I think in the initial discussion we reasoned that, since the table is
created with the subscription owner's privileges, only that user can
drop it, and if the user intentionally drops the table, conflicts will
simply not be recorded in it, which is acceptable. But now I think it
would be a good idea to maintain a dependency on the subscription so
that users cannot drop the table without dropping the subscription.

> And once
> dropping the schema along with the history table, ALTER SUBSCRIPTION
> ... SET (conflict_history_table = '') seems not to work (I got a
> SEGV).

I will check this, thanks

> ---
> We can create the history table in pg_temp namespace but it should not
> be allowed.

Right, will check this and also add the test for the same.

> ---
> I think the conflict history table should not be transferred to the
> new cluster when pg_upgrade since the table definition could be
> different across major versions.

Let me think more on this with respect to behaviour of other factors
like subscriptions etc.

> I got the following log when the publisher disables track_commit_timestamp:
>
> local_conflicts   |
> {"{\"xid\":\"790\",\"commit_ts\":\"1999-12-31T16:00:00-08:00\",\"origin\":\"\",\"tuple\":{\"c\":1}}"}
>
> I think we can omit commit_ts when it's omitted.

+1
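
For illustration, here is a minimal C sketch of what omitting the
field could look like. The helper demo_format_local_conflict and its
parameters are hypothetical and are not taken from the patch, and the
sketch assumes the inputs need no JSON escaping. The
1999-12-31T16:00:00-08:00 value in the report above corresponds to a
zero timestamp (the PostgreSQL epoch, 2000-01-01 00:00:00 UTC) shown
in a UTC-8 time zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): format one local-conflict
 * entry as JSON text.  When have_commit_ts is false, the "commit_ts"
 * key is left out entirely instead of printing a zero timestamp.
 * Assumes the inputs contain no characters that need JSON escaping.
 * Returns the snprintf() result, i.e. the length of the full text.
 */
static int
demo_format_local_conflict(char *buf, size_t buflen, const char *xid,
                           bool have_commit_ts, const char *commit_ts,
                           const char *origin, const char *tuple_json)
{
    assert(buf != NULL && buflen > 0);

    if (have_commit_ts)
        return snprintf(buf, buflen,
                        "{\"xid\":\"%s\",\"commit_ts\":\"%s\","
                        "\"origin\":\"%s\",\"tuple\":%s}",
                        xid, commit_ts, origin, tuple_json);

    /* No commit timestamp available: skip the key altogether. */
    return snprintf(buf, buflen,
                    "{\"xid\":\"%s\",\"origin\":\"%s\",\"tuple\":%s}",
                    xid, origin, tuple_json);
}
```

A consumer of the column would then test for the presence of the
commit_ts key rather than compare against a sentinel value.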

> ---
> I think we should keep the history table name case-sensitive:

Yeah we can do that, it looks good to me, what do others think about it?


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-04T06:51:20Z

On Wed, Dec 3, 2025 at 4:57 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> > Thanks, it looks good. For the benefit of others, could you include a
> > brief note, perhaps in the commit message for now, describing how to
> > access or read this array column? We can remove it later.
>
> Thanks, okay, temporarily I have added in a commit message how we can
> fetch the data from the JSON array field.  In next version I will add
> a test to get the conflict stored in conflict log history table and
> fetch from it.
>

Thanks, I have not looked at the patch in detail yet, but a few things:

1)
Assert is hit here:
 LOG:  logical replication apply worker for subscription "sub1" has started
TRAP: failed Assert("slot != NULL"), File: "conflict.c", Line: 669, PID: 137604

Steps: create table tab1 (i int primary key, j int);
Pub: insert into tab1 values(10,10); insert into tab1 values(20,10);
Sub:  delete from tab1 where i=10;
Pub:  delete from tab1 where i=10;

2)
I see that key_tuple still points to RI and there is no RI field
added. It seems that discussion at [1] is missed in this patch.

[1]: https://www.postgresql.org/message-id/CAA4eK1L3umixUUik7Ef1eU%3Dx-JMb8iXD7rWWExBMP4dmOGTS9A%40mail.gmail.com

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-04T07:01:56Z

Hi. Some review comments for v9-0001.

======
Commit message.

1.
Note: A single remote tuple may conflict with multiple local conflict
when conflict type
is CT_MULTIPLE_UNIQUE_CONFLICTS, so for handling this case we create a
single row in
conflict log table with respect to each remote conflict row even if it
conflicts with
multiple local rows and we store the multiple conflict tuples as a
single JSON array
element in format as
[ { "xid": "1001", "commit_ts": "...", "origin": "...", "tuple": {...} }, ... ]
We can extract the elements from local tuple as given in below example

~

Something seems broken/confused with this description:

1a.
"A single remote tuple may conflict with multiple local conflict"
Should that say "... with multiple local tuples" ?

~

1b.
There is a mixture of terminology here, "row" vs "tuple", which
doesn't seem correct.

~

1c.
"We can extract the elements from local tuple"
Should that say "... elements of the local tuples from the CLT row ..."

======
src/backend/replication/logical/conflict.c

2.
+
+#define N_LOCAL_CONFLICT_INFO_ATTRS 4

I felt it would be better to put this where it is used. e.g. IMO put
it within the build_conflict_tupledesc().

~~~

InsertConflictLogTuple:

3.
+ /* A valid tuple must be prepared and store in MyLogicalRepWorker. */

Typo still here: /store in/stored in/

~~~

4.
+static TupleDesc
+build_conflict_tupledesc(void)
+{
+ TupleDesc tupdesc;
+
+ tupdesc = CreateTemplateTupleDesc(N_LOCAL_CONFLICT_INFO_ATTRS);
+
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "xid",
+ XIDOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "commit_ts",
+ TIMESTAMPTZOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "origin",
+ TEXTOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "tuple",
+ JSONOID, -1, 0);

If you had some incrementing attno instead of hard-wiring the
(1,2,3,4) then you'd be able to add a sanity check like Assert(attno +
1 ==  N_LOCAL_CONFLICT_INFO_ATTRS); that can safeguard against future
mistakes in case something changes without updating the constant.
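
A self-contained sketch of this idea follows. DemoTupleDesc and
demo_init_entry are hypothetical stand-ins for TupleDesc and
TupleDescInitEntry() so that the example compiles outside the
PostgreSQL tree. This variant pre-increments attno, so the final check
is attno == N_LOCAL_CONFLICT_INFO_ATTRS; the exact form of the
assertion depends on where the counter starts.

```c
#include <assert.h>
#include <stddef.h>

#define N_LOCAL_CONFLICT_INFO_ATTRS 4

/* Hypothetical stand-in for a tuple descriptor; not PostgreSQL's TupleDesc. */
typedef struct DemoTupleDesc
{
    int         natts;
    const char *attnames[N_LOCAL_CONFLICT_INFO_ATTRS];
} DemoTupleDesc;

/* Hypothetical stand-in for TupleDescInitEntry(); attno is 1-based. */
static void
demo_init_entry(DemoTupleDesc *desc, int attno, const char *name)
{
    assert(attno >= 1 && attno <= N_LOCAL_CONFLICT_INFO_ATTRS);
    desc->attnames[attno - 1] = name;
    desc->natts = attno;
}

static DemoTupleDesc
demo_build_conflict_tupledesc(void)
{
    DemoTupleDesc desc = {0};
    int         attno = 0;

    /* Each entry takes the next attribute number; nothing is hard-wired. */
    demo_init_entry(&desc, ++attno, "xid");
    demo_init_entry(&desc, ++attno, "commit_ts");
    demo_init_entry(&desc, ++attno, "origin");
    demo_init_entry(&desc, ++attno, "tuple");

    /* Trips if an entry is added or removed without updating the constant. */
    assert(attno == N_LOCAL_CONFLICT_INFO_ATTRS);

    return desc;
}
```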

~~~

build_local_conflicts_json_array:

5.
+ /* Process local conflict tuple list and prepare a array of JSON. */
+ foreach(lc, conflicttuples)
  {
- tableslot = table_slot_create(localrel, &estate->es_tupleTable);
- tableslot = ExecCopySlot(tableslot, slot);
+ ConflictTupleInfo *conflicttuple = (ConflictTupleInfo *) lfirst(lc);

5a.
typo in comment: /a array/an array/

~

5b.
SUGGESTION
foreach_ptr(ConflictTupleInfo, conflicttuple, conflicttuples)
{

~~~

6.
+ i = 0;
+ foreach(lc, json_datums)
+ {
+ json_datum_array[i] = (Datum) lfirst(lc);
+ json_null_array[i] = false;
+ i++;
+ }

6a.
The loop seemed to be unnecessarily complicated since you already know
the size. Isn't it the same as below?

SUGGESTION
for (int i = 0; i < num_conflicts; i++)
{
  json_datum_array[i] = (Datum) list_nth(json_datums, i);
  json_null_array[i] = false;
}

6b.
Also, there is probably no need to do json_null_array[i] = false; at
every iteration here, because you could have just used palloc0 for the
whole array in the first place.
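
Points 6a and 6b together can be sketched as below. This is plain C
rather than backend code: DemoDatum stands in for Datum, a plain input
array stands in for the json_datums list, and calloc() plays the role
of palloc0(). All of these names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PostgreSQL's Datum. */
typedef uintptr_t DemoDatum;

typedef struct DemoArrays
{
    DemoDatum  *values;
    bool       *nulls;
    int         n;
} DemoArrays;

/*
 * Copy num_conflicts items into a value array and build the parallel
 * null-flag array.  Because the flags are zero-allocated, every entry
 * already reads as false and the loop does not need to touch them.
 */
static DemoArrays
demo_build_arrays(const DemoDatum *items, int num_conflicts)
{
    DemoArrays  out;

    assert(items != NULL && num_conflicts > 0);

    out.n = num_conflicts;
    out.values = malloc(sizeof(DemoDatum) * (size_t) num_conflicts);
    out.nulls = calloc((size_t) num_conflicts, sizeof(bool));
    if (out.values == NULL || out.nulls == NULL)
        abort();                /* palloc would raise an error instead */

    for (int i = 0; i < num_conflicts; i++)
        out.values[i] = items[i];

    return out;
}

static void
demo_free_arrays(DemoArrays *arrays)
{
    free(arrays->values);
    free(arrays->nulls);
    arrays->values = NULL;
    arrays->nulls = NULL;
    arrays->n = 0;
}
```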

======
src/test/regress/expected/subscription.out

7.
+-- check if the table exists and has the correct schema (15 columns)
+SELECT count(*) FROM pg_attribute WHERE attrelid =
'public.regress_conflict_log1'::regclass AND attnum > 0;
+ count
+-------
+    11
+(1 row)
+

That comment is wrong; there aren't 15 columns anymore.

~~~

8.
(mentioned in a previous review)

I felt that \dRs should display the CLT's schema name in the "Conflict
log table" field -- at least when it's not "public". Otherwise, it
won't be easy for the user to know it.

I did not see a test case for this.

~~~

9.
(mentioned in a previous review)

You could have another test case to explicitly call the function
pg_relation_is_publishable(clt) to verify it returns false for a CLT
table.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-04T10:50:22Z

On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 7:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> > The patch utilizes SPI for creating and dropping the conflict history
> > table, but I'm really not sure if it's okay because it's actually
> > affected by some GUC parameters such as default_tablespace and
> > default_toast_compression etc. Also, probably some hooks and event
> > triggers could be fired during the creation and removal. Is it
> > intentional behavior? I'm concerned that it would make investigation
> > harder if an issue happened in the user environment.
>
> Hmm, interesting point, well we can control the value of default
> parameters while creating the table using SPI, but I don't see any
> reason to not use heap_create_with_catalog() directly, so maybe that's
> a better choice than using SPI because then we don't need to bother
> about any event triggers/utility hooks etc.  Although I don't see any
> specific issue with that, unless the user intentionally wants to
> create trouble while creating this table.  What do others think about
> it?
>
> > ---
> > +   /* build and execute the CREATE TABLE query. */
> > +   appendStringInfo(&querybuf,
> > +                    "CREATE TABLE %s.%s ("
> > +                    "relid Oid,"
> > +                    "schemaname TEXT,"
> > +                    "relname TEXT,"
> > +                    "conflict_type TEXT,"
> > +                    "remote_xid xid,"
> > +                    "remote_commit_lsn pg_lsn,"
> > +                    "remote_commit_ts TIMESTAMPTZ,"
> > +                    "remote_origin TEXT,"
> > +                    "key_tuple     JSON,"
> > +                    "remote_tuple  JSON,"
> > +                    "local_conflicts JSON[])",
> > +                    quote_identifier(get_namespace_name(namespaceId)),
> > +                    quote_identifier(conflictrel));
> >
> > If we want to use SPI for history table creation, we should use
> > qualified names in all the places including data types.
>
> That's true, so that we can avoid interference of any user created types.
>
> > ---
> > The patch doesn't create the dependency between the subscription and
> > the conflict history table. So users can entirely drop the schema
> > (with CASCADE option) where the history table is created.
>
> I think as part of the initial discussion we thought since it is
> created under the subscription owner privileges so only that user can
> drop that table and if the user intentionally drops the table the
> conflict will not be recorded in the table and that's acceptable. But
> now I think it would be a good idea to maintain the dependency with
> subscription so that users can not drop it without dropping the
> subscription.
>

Yeah, it seems reasonable to maintain its dependency on the
subscription in this model. BTW, it would be easier to record the
dependency if we use heap_create_with_catalog(), as we do in
create_toast_table(). The other places where we use the SPI interface
to execute statements either need to execute multiple SQL statements
or execute statements other than CREATE TABLE. So, for this patch's
purpose, I feel heap_create_with_catalog() is a better fit.

I was also wondering whether it would be a good idea to create one
global conflict table and let all subscriptions use it. However, that
has disadvantages: whenever a user drops a subscription, we would need
to DELETE all conflict rows for that subscription, which creates a
need for vacuum. We would also need to ensure, via some RLS policy,
that conflicts from one subscription owner are not visible to other
subscription owners. So a catalog table per subscription (i.e. the
current way) appears better.

Also, shall we give the option to the user where she wants to see
conflict/resolution information? One idea to achieve the same is to
provide subscription options like (a) conflict_resolution_format, the
values could be log and table for now, in future, one could extend it
to other options like xml, json, etc. (b) conflict_log_table: in this
user can specify the conflict table name, this can be optional such
that if user omits this and conflict_resolution_format is table, then
we will use internally generated table name like
pg_conflicts_<subscription_id>.

>  And once
> > dropping the schema along with the history table, ALTER SUBSCRIPTION
> > ... SET (conflict_history_table = '') seems not to work (I got a
> > SEGV).
>
> I will check this, thanks
>
> > ---
> > We can create the history table in pg_temp namespace but it should not
> > be allowed.
>
> Right, will check this and also add the test for the same.
>
> > ---
> > I think the conflict history table should not be transferred to the
> > new cluster when pg_upgrade since the table definition could be
> > different across major versions.
>
> Let me think more on this with respect to behaviour of other factors
> like subscriptions etc.
>

Can we deal with different schema of tables across versions via
pg_dump/restore during upgrade?

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-04T14:35:31Z

On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > relid             | 16391
> > > schemaname        | public
> > > relname           | conf_tab
> > > conflict_type     | multiple_unique_conflicts
> > > remote_xid        | 761
> > > remote_commit_lsn | 0/01761400
> > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > remote_origin     | pg_16406
> > > key_tuple         |
> > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > local_conflicts   |
> > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > >
> >
> > Thanks, it looks good. For the benefit of others, could you include a
> > brief note, perhaps in the commit message for now, describing how to
> > access or read this array column? We can remove it later.
>
> Thanks, okay, temporarily I have added in a commit message how we can
> fetch the data from the JSON array field.  In next version I will add
> a test to get the conflict stored in conflict log history table and
> fetch from it.

I noticed that the table structure can get changed by the time the
conflict record is prepared. In ReportApplyConflict(), the code
currently prepares the conflict log tuple before deciding whether the
insertion will be immediate or deferred:
+       /* Insert conflict details to conflict log table. */
+       if (conflictlogrel)
+       {
+               /*
+                * Prepare the conflict log tuple. If the error level is below ERROR,
+                * insert it immediately. Otherwise, defer the insertion to a new
+                * transaction after the current one aborts, ensuring the insertion of
+                * the log tuple is not rolled back.
+                */
+               prepare_conflict_log_tuple(estate,
+                                          relinfo->ri_RelationDesc,
+                                          conflictlogrel,
+                                          type,
+                                          searchslot,
+                                          conflicttuples,
+                                          remoteslot);
+               if (elevel < ERROR)
+                       InsertConflictLogTuple(conflictlogrel);
+
+               table_close(conflictlogrel, RowExclusiveLock);
+       }

If the conflict history table definition is changed just before
prepare_conflict_log_tuple, the tuple creation will crash:
Program received signal SIGSEGV, Segmentation fault.
0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
../../../../src/include/varatt.h:419
419 return VARATT_IS_4B_U(PTR) &&
(gdb) bt
#0  0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
../../../../src/include/varatt.h:419
#1  0x00005a342e01e5ed in heap_compute_data_size
(tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20,
isnull=0x7ffd7af3ad15) at heaptuple.c:239
#2  0x00005a342e0200dd in heap_form_tuple
(tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20,
isnull=0x7ffd7af3ad15) at heaptuple.c:1158
#3  0x00005a342e55e8c2 in prepare_conflict_log_tuple
(estate=0x5a3467944530, rel=0x7ab405e594e8,
conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS,
searchslot=0x0,
    conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936
#4  0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530,
relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS,
searchslot=0x0, remoteslot=0x5a346792e498,
    conflicttuples=0x5a3467942da0) at conflict.c:168
#5  0x00005a342e348c35 in CheckAndReportConflict
(resultRelInfo=0x5a346792e778, estate=0x5a3467944530,
type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0,
    remoteslot=0x5a346792e498) at execReplication.c:793

This can be reproduced by the following steps:
CREATE PUBLICATION pub;
CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict');
ALTER TABLE conflict RENAME TO conflict1;
CREATE TABLE conflict(c1 varchar, c2 varchar);
-- Cause a conflict, this will crash while trying to prepare the
conflicting tuple

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-05T03:54:18Z

On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > relid             | 16391
> > > schemaname        | public
> > > relname           | conf_tab
> > > conflict_type     | multiple_unique_conflicts
> > > remote_xid        | 761
> > > remote_commit_lsn | 0/01761400
> > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > remote_origin     | pg_16406
> > > key_tuple         |
> > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > local_conflicts   |
> > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > >
> >
> > Thanks, it looks good. For the benefit of others, could you include a
> > brief note, perhaps in the commit message for now, describing how to
> > access or read this array column? We can remove it later.
>
> Thanks, okay, temporarily I have added in a commit message how we can
> fetch the data from the JSON array field.  In next version I will add
> a test to get the conflict stored in conflict log history table and
> fetch from it.

Few comments:
1) Currently pg_dump does not dump the conflict_log_table option; I
feel it should be included in the dump.

2) Is there a way to unset the conflict log table after we create the
subscription with the conflict_log_table option?

3) Is there any reason why this table should not be allowed to be added
to a publication:
+       /* Can't be conflict log table */
+       if (IsConflictLogTable(RelationGetRelid(targetrel)))
+               ereport(ERROR,
+                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                                errmsg("cannot add relation \"%s.%s\" to publication",
+                                       get_namespace_name(RelationGetNamespace(targetrel)),
+                                       RelationGetRelationName(targetrel)),
+                                errdetail("This operation is not supported for conflict log tables.")));

Is the reason that the same table could be a conflict log table on the
subscriber, so this prevents corruption there?

4) I did not find any documentation for this feature. Can we include
documentation in create_subscription.sgml, alter_subscription.sgml and
logical_replication.sgml?

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-05T05:09:47Z

On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > relid             | 16391
> > > > schemaname        | public
> > > > relname           | conf_tab
> > > > conflict_type     | multiple_unique_conflicts
> > > > remote_xid        | 761
> > > > remote_commit_lsn | 0/01761400
> > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > remote_origin     | pg_16406
> > > > key_tuple         |
> > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > local_conflicts   |
> > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > >
> > >
> > > Thanks, it looks good. For the benefit of others, could you include a
> > > brief note, perhaps in the commit message for now, describing how to
> > > access or read this array column? We can remove it later.
> >
> > Thanks, okay, temporarily I have added in a commit message how we can
> > fetch the data from the JSON array field.  In next version I will add
> > a test to get the conflict stored in conflict log history table and
> > fetch from it.
>
> I noticed that the table structure can get changed by the time the
> conflict record is prepared. In ReportApplyConflict(), the code
> currently prepares the conflict log tuple before deciding whether the
> insertion will be immediate or deferred:
> +       /* Insert conflict details to conflict log table. */
> +       if (conflictlogrel)
> +       {
> +               /*
> +                * Prepare the conflict log tuple. If the error level is below ERROR,
> +                * insert it immediately. Otherwise, defer the insertion to a new
> +                * transaction after the current one aborts, ensuring the insertion of
> +                * the log tuple is not rolled back.
> +                */
> +               prepare_conflict_log_tuple(estate,
> +                                          relinfo->ri_RelationDesc,
> +                                          conflictlogrel,
> +                                          type,
> +                                          searchslot,
> +                                          conflicttuples,
> +                                          remoteslot);
> +               if (elevel < ERROR)
> +                       InsertConflictLogTuple(conflictlogrel);
> +
> +               table_close(conflictlogrel, RowExclusiveLock);
> +       }
>
> If the conflict history table definition is changed just before
> prepare_conflict_log_tuple, the tuple creation will crash:
> Program received signal SIGSEGV, Segmentation fault.
> 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> ../../../../src/include/varatt.h:419
> 419 return VARATT_IS_4B_U(PTR) &&
> (gdb) bt
> #0  0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> ../../../../src/include/varatt.h:419
> #1  0x00005a342e01e5ed in heap_compute_data_size
> (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> isnull=0x7ffd7af3ad15) at heaptuple.c:239
> #2  0x00005a342e0200dd in heap_form_tuple
> (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> isnull=0x7ffd7af3ad15) at heaptuple.c:1158
> #3  0x00005a342e55e8c2 in prepare_conflict_log_tuple
> (estate=0x5a3467944530, rel=0x7ab405e594e8,
> conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS,
> searchslot=0x0,
>     conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936
> #4  0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530,
> relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS,
> searchslot=0x0, remoteslot=0x5a346792e498,
>     conflicttuples=0x5a3467942da0) at conflict.c:168
> #5  0x00005a342e348c35 in CheckAndReportConflict
> (resultRelInfo=0x5a346792e778, estate=0x5a3467944530,
> type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0,
>     remoteslot=0x5a346792e498) at execReplication.c:793
>
> This can be reproduced by the following steps:
> CREATE PUBLICATION pub;
> CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict');
> ALTER TABLE conflict RENAME TO conflict1;
> CREATE TABLE conflict(c1 varchar, c2 varchar);
> -- Cause a conflict, this will crash while trying to prepare the
> conflicting tuple

Yeah, while it is allowed to drop or alter the conflict log table, it
should not segfault. IMHO an error is acceptable as per the initial
discussion, so I will look into this and tighten up the logic so that
it throws an error whenever it cannot insert into the conflict log
table.
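
One possible shape for such a check, sketched in standalone C: compare
the table's actual columns against the expected definition before
forming the tuple, and report an error on mismatch instead of
crashing. The DemoColumn type and demo_schema_matches function are
hypothetical; the expected column list is taken from the CREATE TABLE
statement quoted earlier in this thread, with types written as
lowercase strings purely for the comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one column: name plus type name. */
typedef struct DemoColumn
{
    const char *name;
    const char *type;
} DemoColumn;

/* Expected conflict log table columns, per the CREATE TABLE in this thread. */
static const DemoColumn demo_expected_cols[] = {
    {"relid", "oid"},
    {"schemaname", "text"},
    {"relname", "text"},
    {"conflict_type", "text"},
    {"remote_xid", "xid"},
    {"remote_commit_lsn", "pg_lsn"},
    {"remote_commit_ts", "timestamptz"},
    {"remote_origin", "text"},
    {"key_tuple", "json"},
    {"remote_tuple", "json"},
    {"local_conflicts", "json[]"},
};

/*
 * Return true only if the actual columns match the expected ones in
 * count, order, name and type.  A caller would raise an error on false
 * rather than go on to build a tuple against the wrong descriptor.
 */
static bool
demo_schema_matches(const DemoColumn *actual, size_t nactual)
{
    size_t      nexpected = sizeof(demo_expected_cols) / sizeof(demo_expected_cols[0]);

    assert(actual != NULL || nactual == 0);

    if (nactual != nexpected)
        return false;

    for (size_t i = 0; i < nexpected; i++)
    {
        if (strcmp(actual[i].name, demo_expected_cols[i].name) != 0 ||
            strcmp(actual[i].type, demo_expected_cols[i].type) != 0)
            return false;
    }

    return true;
}
```

With the reproduction steps above, the recreated table conflict(c1
varchar, c2 varchar) has two columns instead of eleven, so the check
fails at the count comparison.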

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-05T05:16:44Z

On Fri, Dec 5, 2025 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > relid             | 16391
> > > > schemaname        | public
> > > > relname           | conf_tab
> > > > conflict_type     | multiple_unique_conflicts
> > > > remote_xid        | 761
> > > > remote_commit_lsn | 0/01761400
> > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > remote_origin     | pg_16406
> > > > key_tuple         |
> > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > local_conflicts   |
> > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > >
> > >
> > > Thanks, it looks good. For the benefit of others, could you include a
> > > brief note, perhaps in the commit message for now, describing how to
> > > access or read this array column? We can remove it later.
> >
> > Thanks, okay, temporarily I have added in a commit message how we can
> > fetch the data from the JSON array field.  In next version I will add
> > a test to get the conflict stored in conflict log history table and
> > fetch from it.
>
> Few comments:
> 1) Currently pg_dump is not dumping conflict_log_table option, I felt
> it should be included while dumping.

Yeah, we should.

> 2) Is there a way to unset the conflict log table after we create the
> subscription with conflict_log_table option

IMHO we can use ALTER SUBSCRIPTION...WITH(conflict_log_table='') to
unset it? What do others think about it?

> 3) Any reason why this table should not be allowed to add to a publication:
> +       /* Can't be conflict log table */
> +       if (IsConflictLogTable(RelationGetRelid(targetrel)))
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                                errmsg("cannot add relation \"%s.%s\" to publication",
> +                                       get_namespace_name(RelationGetNamespace(targetrel)),
> +                                       RelationGetRelationName(targetrel)),
> +                                errdetail("This operation is not supported for conflict log tables.")));
>
> Is the reason like the same table can be a conflict table in the
> subscriber and prevent corruption in the subscriber

The main reason is that these tables are created internally to hold
conflict information, which is node-specific internal detail, so there
is no reason for anyone to replicate them. We therefore excluded them
from the ALL TABLES option and then, based on a suggestion from Shveta,
also blocked them from being added to a publication explicitly.  So
there is no strong reason to disallow adding them to a publication
forcefully; OTOH there is no reason why someone would want to do that,
considering these are internally managed tables.

> 4) I did not find any documentation for this feature, can we include
> documentation in create_subscription.sgml, alter_subscription.sgml and
> logical_replication.sgml

Yeah, in the initial version I posted a doc patch, but since we are
still changing the first patch and some behavior might change, I will
postpone it to a later stage, after we have consensus on most of the
behaviour.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-05T09:29:54Z

On Fri, Dec 5, 2025 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 5, 2025 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> > 2) Is there a way to unset the conflict log table after we create the
> > subscription with conflict_log_table option
>
> IMHO we can use ALTER SUBSCRIPTION...WITH(conflict_log_table='') so
> unset? What do others think about it?
>

We already have a syntax: ALTER SUBSCRIPTION name SET (
subscription_parameter [= value] [, ... ] ) which can be used to
set/unset this new subscription option.

> > 3) Any reason why this table should not be allowed to add to a publication:
> > +       /* Can't be conflict log table */
> > +       if (IsConflictLogTable(RelationGetRelid(targetrel)))
> > +               ereport(ERROR,
> > +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +                                errmsg("cannot add relation \"%s.%s\" to publication",
> > +                                       get_namespace_name(RelationGetNamespace(targetrel)),
> > +                                       RelationGetRelationName(targetrel)),
> > +                                errdetail("This operation is not supported for conflict log tables.")));
> >
> > Is the reason like the same table can be a conflict table in the
> > subscriber and prevent corruption in the subscriber
>
> The main reason was that, since these tables are internally created
> for maintaining the conflict information which is very much internal
> node specific details, so there is no reason someone want to replicate
> those tables, so we blocked it with ALL TABLES option and then based
> on suggestion from Shveta we blocked it from getting added to
> publication as well.  So there is no strong reason to disallow from
> forcefully getting added to publication OTOH there is no reason why
> someone wants to do that considering those are internally managed
> tables.
>

I also don't see any reason to allow such internal tables to be
replicated. So, it is okay to prohibit them for now. If we see any use
case, we can allow it.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-05T09:55:07Z

On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Also, shall we give the option to the user where she wants to see
> conflict/resolution information? One idea to achieve the same is to
> provide subscription options like (a) conflict_resolution_format, the
> values could be log and table for now, in future, one could extend it
> to other options like xml, json, etc. (b) conflict_log_table: in this
> user can specify the conflict table name, this can be optional such
> that if user omits this and conflict_resolution_format is table, then
> we will use internally generated table name like
> pg_conflicts_<subscription_id>.
>

In this idea, we can keep the name of the second option as
conflict_log_name instead of conflict_log_table. This can help us LOG
the conflicts in a totally separate conflict file instead of in server
log. Say, the user provides conflict_resolution_format as 'log' and
conflict_log_name as 'conflict_report' then we can report conflicts in
this separate file by appending subid to distinguish it. And, if the
user gives only the first option conflict_resolution_format as 'log'
then we can keep reporting the information in server log files.
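
A minimal sketch of how the two options could map to a destination. All
names here (ConflictDest, choose_conflict_dest) are hypothetical and not
taken from the patch, and the "_" separator used when appending the subid
is an assumption, since the exact file naming is not decided above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical destinations; illustrative only, not from the patch. */
typedef enum
{
    CONFLICT_DEST_SERVER_LOG,   /* format 'log', no name given */
    CONFLICT_DEST_FILE,         /* format 'log' plus a name */
    CONFLICT_DEST_TABLE,        /* format 'table' */
    CONFLICT_DEST_INVALID       /* unrecognized format value */
} ConflictDest;

/*
 * Pick the destination from the format and name options and, for the
 * file and table cases, write the target name into buf.  The
 * "<name>_<subid>" pattern is an assumed spelling of "appending subid";
 * "pg_conflicts_<subid>" is the default table name suggested earlier.
 */
static ConflictDest
choose_conflict_dest(const char *format, const char *name,
                     unsigned int subid, char *buf, size_t buflen)
{
    assert(format != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (strcmp(format, "log") == 0)
    {
        /* No name: keep reporting in the regular server log. */
        if (name == NULL || name[0] == '\0')
            return CONFLICT_DEST_SERVER_LOG;
        snprintf(buf, buflen, "%s_%u", name, subid);
        return CONFLICT_DEST_FILE;
    }

    if (strcmp(format, "table") == 0)
    {
        /* No name: fall back to an internally generated table name. */
        if (name == NULL || name[0] == '\0')
            snprintf(buf, buflen, "pg_conflicts_%u", subid);
        else
            snprintf(buf, buflen, "%s", name);
        return CONFLICT_DEST_TABLE;
    }

    return CONFLICT_DEST_INVALID;
}
```

With format 'log' and name 'conflict_report', this yields a separate
per-subscription file name; with only format 'log', it falls back to the
server log.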

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-05T10:13:38Z

On Fri, Dec 5, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Also, shall we give the option to the user where she wants to see
> > conflict/resolution information? One idea to achieve the same is to
> > provide subscription options like (a) conflict_resolution_format, the
> > values could be log and table for now, in future, one could extend it
> > to other options like xml, json, etc. (b) conflict_log_table: in this
> > user can specify the conflict table name, this can be optional such
> > that if user omits this and conflict_resolution_format is table, then
> > we will use internally generated table name like
> > pg_conflicts_<subscription_id>.
> >
>
> In this idea, we can keep the name of the second option as
> conflict_log_name instead of conflict_log_table. This can help us LOG
> the conflicts in a totally separate conflict file instead of in server
> log. Say, the user provides conflict_resolution_format as 'log' and
> conflict_log_name as 'conflict_report' then we can report conflicts in
> this separate file by appending subid to distinguish it. And, if the
> user gives only the first option conflict_resolution_format as 'log'
> then we can keep reporting the information in server log files.
>

+1 on the idea.
Instead of using conflict_resolution_format, I feel it should be
conflict_log_format as we are referring to LOGs and not resolutions.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-06T04:50:14Z

On Fri, Dec 5, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Also, shall we give the option to the user where she wants to see
> > conflict/resolution information? One idea to achieve the same is to
> > provide subscription options like (a) conflict_resolution_format, the
> > values could be log and table for now, in future, one could extend it
> > to other options like xml, json, etc. (b) conflict_log_table: in this
> > user can specify the conflict table name, this can be optional such
> > that if user omits this and conflict_resolution_format is table, then
> > we will use internally generated table name like
> > pg_conflicts_<subscription_id>.
> >
>
> In this idea, we can keep the name of the second option as
> conflict_log_name instead of conflict_log_table. This can help us LOG
> the conflicts in a totally separate conflict file instead of in server
> log. Say, the user provides conflict_resolution_format as 'log' and
> conflict_log_name as 'conflict_report' then we can report conflicts in
> this separate file by appending subid to distinguish it. And, if the
> user gives only the first option conflict_resolution_format as 'log'
> then we can keep reporting the information in server log files.

Yeah that looks good, so considering the extensibility I think we can
keep the option name as 'conflict_log_name' from the first version
itself even if we don't provide all the options in the first version.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-06T15:06:30Z

On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > relid             | 16391
> > > > > schemaname        | public
> > > > > relname           | conf_tab
> > > > > conflict_type     | multiple_unique_conflicts
> > > > > remote_xid        | 761
> > > > > remote_commit_lsn | 0/01761400
> > > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > > remote_origin     | pg_16406
> > > > > key_tuple         |
> > > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > > local_conflicts   |
> > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > > >
> > > >
> > > > Thanks, it looks good. For the benefit of others, could you include a
> > > > brief note, perhaps in the commit message for now, describing how to
> > > > access or read this array column? We can remove it later.
> > >
> > > Thanks, okay, temporarily I have added in a commit message how we can
> > > fetch the data from the JSON array field.  In next version I will add
> > > a test to get the conflict stored in conflict log history table and
> > > fetch from it.
> >
> > I noticed that the table structure can get changed by the time the
> > conflict record is prepared. In ReportApplyConflict(), the code
> > currently prepares the conflict log tuple before deciding whether the
> > insertion will be immediate or deferred:
> > +       /* Insert conflict details to conflict log table. */
> > +       if (conflictlogrel)
> > +       {
> > +               /*
> > +                * Prepare the conflict log tuple. If the error level is below ERROR,
> > +                * insert it immediately. Otherwise, defer the insertion to a new
> > +                * transaction after the current one aborts, ensuring the insertion of
> > +                * the log tuple is not rolled back.
> > +                */
> > +               prepare_conflict_log_tuple(estate,
> > +                                          relinfo->ri_RelationDesc,
> > +                                          conflictlogrel,
> > +                                          type,
> > +                                          searchslot,
> > +                                          conflicttuples,
> > +                                          remoteslot);
> > +               if (elevel < ERROR)
> > +                       InsertConflictLogTuple(conflictlogrel);
> > +
> > +               table_close(conflictlogrel, RowExclusiveLock);
> > +       }
> >
> > If the conflict history table definition is changed just before
> > prepare_conflict_log_tuple, the tuple creation will crash:
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > ../../../../src/include/varatt.h:419
> > 419 return VARATT_IS_4B_U(PTR) &&
> > (gdb) bt
> > #0  0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > ../../../../src/include/varatt.h:419
> > #1  0x00005a342e01e5ed in heap_compute_data_size
> > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > isnull=0x7ffd7af3ad15) at heaptuple.c:239
> > #2  0x00005a342e0200dd in heap_form_tuple
> > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > isnull=0x7ffd7af3ad15) at heaptuple.c:1158
> > #3  0x00005a342e55e8c2 in prepare_conflict_log_tuple
> > (estate=0x5a3467944530, rel=0x7ab405e594e8,
> > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS,
> > searchslot=0x0,
> >     conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936
> > #4  0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530,
> > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS,
> > searchslot=0x0, remoteslot=0x5a346792e498,
> >     conflicttuples=0x5a3467942da0) at conflict.c:168
> > #5  0x00005a342e348c35 in CheckAndReportConflict
> > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530,
> > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0,
> >     remoteslot=0x5a346792e498) at execReplication.c:793
> >
> > This can be reproduced by the following steps:
> > CREATE PUBLICATION pub;
> > CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict');
> > ALTER TABLE conflict RENAME TO conflict1;
> > CREATE TABLE conflict(c1 varchar, c2 varchar);
> > -- Cause a conflict, this will crash while trying to prepare the
> > conflicting tuple
>
> Yeah while it is allowed to drop or alter the conflict log table, it
> should not seg fault, IMHO error is acceptable as per the initial
> discussion, so I will look into this and tighten up the logic so that
> it will throw an error whenever it can not insert into the conflict
> log table.

I was thinking about what we need to do if the table definition is
changed.  One option is that whenever we prepare the tuple, after
acquiring the lock, we validate the table definition, and if it doesn't
match the standard conflict log table schema we ERROR out.  IMHO that
should not be an issue, as we only do this during conflict logging.
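
A rough sketch of such a validation step, written against a simplified
stand-in for the tuple descriptor rather than the real TupleDesc.
ColumnDef, expected_cols and conflict_log_schema_matches are hypothetical
names; the column names are taken from the sample output quoted above, but
the list is only a subset and the column types are assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified column description; a stand-in, not PostgreSQL's TupleDesc. */
typedef struct
{
    const char *name;
    const char *type;
} ColumnDef;

/* Illustrative subset of the expected schema (types are assumed). */
static const ColumnDef expected_cols[] = {
    {"relid", "oid"},
    {"schemaname", "text"},
    {"relname", "text"},
    {"conflict_type", "text"},
};

#define N_EXPECTED (sizeof(expected_cols) / sizeof(expected_cols[0]))

/*
 * Return true only if the table has exactly the expected columns, in the
 * expected order and with the expected types.  The caller would run this
 * after locking the table and before forming the tuple, and report a
 * problem instead of building a tuple against a mismatched descriptor.
 */
static bool
conflict_log_schema_matches(const ColumnDef *cols, size_t ncols)
{
    assert(cols != NULL || ncols == 0);

    if (ncols != N_EXPECTED)
        return false;

    for (size_t i = 0; i < ncols; i++)
    {
        if (strcmp(cols[i].name, expected_cols[i].name) != 0 ||
            strcmp(cols[i].type, expected_cols[i].type) != 0)
            return false;
    }
    return true;
}
```

A table recreated as (c1 varchar, c2 varchar), as in the reproduction
steps, fails the column-count check, so the tuple is never formed.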

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-08T03:42:40Z

On Sat, 6 Dec 2025 at 20:36, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > > >
> > > > > > relid             | 16391
> > > > > > schemaname        | public
> > > > > > relname           | conf_tab
> > > > > > conflict_type     | multiple_unique_conflicts
> > > > > > remote_xid        | 761
> > > > > > remote_commit_lsn | 0/01761400
> > > > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > > > remote_origin     | pg_16406
> > > > > > key_tuple         |
> > > > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > > > local_conflicts   |
> > > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > > > >
> > > > >
> > > > > Thanks, it looks good. For the benefit of others, could you include a
> > > > > brief note, perhaps in the commit message for now, describing how to
> > > > > access or read this array column? We can remove it later.
> > > >
> > > > Thanks, okay, temporarily I have added in a commit message how we can
> > > > fetch the data from the JSON array field.  In next version I will add
> > > > a test to get the conflict stored in conflict log history table and
> > > > fetch from it.
> > >
> > > I noticed that the table structure can get changed by the time the
> > > conflict record is prepared. In ReportApplyConflict(), the code
> > > currently prepares the conflict log tuple before deciding whether the
> > > insertion will be immediate or deferred:
> > > +       /* Insert conflict details to conflict log table. */
> > > +       if (conflictlogrel)
> > > +       {
> > > +               /*
> > > +                * Prepare the conflict log tuple. If the error level is below ERROR,
> > > +                * insert it immediately. Otherwise, defer the insertion to a new
> > > +                * transaction after the current one aborts, ensuring the insertion of
> > > +                * the log tuple is not rolled back.
> > > +                */
> > > +               prepare_conflict_log_tuple(estate,
> > > +                                          relinfo->ri_RelationDesc,
> > > +                                          conflictlogrel,
> > > +                                          type,
> > > +                                          searchslot,
> > > +                                          conflicttuples,
> > > +                                          remoteslot);
> > > +               if (elevel < ERROR)
> > > +                       InsertConflictLogTuple(conflictlogrel);
> > > +
> > > +               table_close(conflictlogrel, RowExclusiveLock);
> > > +       }
> > >
> > > If the conflict history table definition is changed just before
> > > prepare_conflict_log_tuple, the tuple creation will crash:
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > > ../../../../src/include/varatt.h:419
> > > 419 return VARATT_IS_4B_U(PTR) &&
> > > (gdb) bt
> > > #0  0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > > ../../../../src/include/varatt.h:419
> > > #1  0x00005a342e01e5ed in heap_compute_data_size
> > > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > > isnull=0x7ffd7af3ad15) at heaptuple.c:239
> > > #2  0x00005a342e0200dd in heap_form_tuple
> > > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > > isnull=0x7ffd7af3ad15) at heaptuple.c:1158
> > > #3  0x00005a342e55e8c2 in prepare_conflict_log_tuple
> > > (estate=0x5a3467944530, rel=0x7ab405e594e8,
> > > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS,
> > > searchslot=0x0,
> > >     conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936
> > > #4  0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530,
> > > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS,
> > > searchslot=0x0, remoteslot=0x5a346792e498,
> > >     conflicttuples=0x5a3467942da0) at conflict.c:168
> > > #5  0x00005a342e348c35 in CheckAndReportConflict
> > > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530,
> > > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0,
> > >     remoteslot=0x5a346792e498) at execReplication.c:793
> > >
> > > This can be reproduced by the following steps:
> > > CREATE PUBLICATION pub;
> > > CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict');
> > > ALTER TABLE conflict RENAME TO conflict1;
> > > CREATE TABLE conflict(c1 varchar, c2 varchar);
> > > -- Cause a conflict, this will crash while trying to prepare the
> > > conflicting tuple
> >
> > Yeah while it is allowed to drop or alter the conflict log table, it
> > should not seg fault, IMHO error is acceptable as per the initial
> > discussion, so I will look into this and tighten up the logic so that
> > it will throw an error whenever it can not insert into the conflict
> > log table.
>
> I was thinking about the solution that we need to do if table
> definition is changed, one option is whenever we try to prepare the
> tuple after acquiring the lock we can validate the table definition if
> this doesn't qualify the standard conflict log table schema we can
> ERROR out.  IMHO that should not be an issue as we are only doing this
> in conflict logging.

Should we emit a warning instead of error, to stay consistent with the
other exception case where a warning is raised when the conflict log
table does not exist?
+       /* Conflict log table is dropped or not accessible. */
+       if (conflictlogrel == NULL)
+               ereport(WARNING,
+                               (errcode(ERRCODE_UNDEFINED_TABLE),
+                                errmsg("conflict log table \"%s.%s\" does not exist",
+                                               get_namespace_name(nspid), conflictlogtable)));

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T04:06:43Z

On Mon, Dec 8, 2025 at 9:12 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, 6 Dec 2025 at 20:36, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > > > >
> > > > > > > relid             | 16391
> > > > > > > schemaname        | public
> > > > > > > relname           | conf_tab
> > > > > > > conflict_type     | multiple_unique_conflicts
> > > > > > > remote_xid        | 761
> > > > > > > remote_commit_lsn | 0/01761400
> > > > > > > remote_commit_ts  | 2025-12-02 15:02:07.045935+00
> > > > > > > remote_origin     | pg_16406
> > > > > > > key_tuple         |
> > > > > > > remote_tuple      | {"a":2,"b":3,"c":4}
> > > > > > > local_conflicts   |
> > > > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"
> > > > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
> > > > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}
> > > > > > >
> > > > > >
> > > > > > Thanks, it looks good. For the benefit of others, could you include a
> > > > > > brief note, perhaps in the commit message for now, describing how to
> > > > > > access or read this array column? We can remove it later.
> > > > >
> > > > > Thanks, okay, temporarily I have added in a commit message how we can
> > > > > fetch the data from the JSON array field.  In next version I will add
> > > > > a test to get the conflict stored in conflict log history table and
> > > > > fetch from it.
> > > >
> > > > I noticed that the table structure can get changed by the time the
> > > > conflict record is prepared. In ReportApplyConflict(), the code
> > > > currently prepares the conflict log tuple before deciding whether the
> > > > insertion will be immediate or deferred:
> > > > +       /* Insert conflict details to conflict log table. */
> > > > +       if (conflictlogrel)
> > > > +       {
> > > > +               /*
> > > > +                * Prepare the conflict log tuple. If the error level is below ERROR,
> > > > +                * insert it immediately. Otherwise, defer the insertion to a new
> > > > +                * transaction after the current one aborts, ensuring the insertion of
> > > > +                * the log tuple is not rolled back.
> > > > +                */
> > > > +               prepare_conflict_log_tuple(estate,
> > > > +                                          relinfo->ri_RelationDesc,
> > > > +                                          conflictlogrel,
> > > > +                                          type,
> > > > +                                          searchslot,
> > > > +                                          conflicttuples,
> > > > +                                          remoteslot);
> > > > +               if (elevel < ERROR)
> > > > +                       InsertConflictLogTuple(conflictlogrel);
> > > > +
> > > > +               table_close(conflictlogrel, RowExclusiveLock);
> > > > +       }
> > > >
> > > > If the conflict history table definition is changed just before
> > > > prepare_conflict_log_tuple, the tuple creation will crash:
> > > > Program received signal SIGSEGV, Segmentation fault.
> > > > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > > > ../../../../src/include/varatt.h:419
> > > > 419 return VARATT_IS_4B_U(PTR) &&
> > > > (gdb) bt
> > > > #0  0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at
> > > > ../../../../src/include/varatt.h:419
> > > > #1  0x00005a342e01e5ed in heap_compute_data_size
> > > > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > > > isnull=0x7ffd7af3ad15) at heaptuple.c:239
> > > > #2  0x00005a342e0200dd in heap_form_tuple
> > > > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20,
> > > > isnull=0x7ffd7af3ad15) at heaptuple.c:1158
> > > > #3  0x00005a342e55e8c2 in prepare_conflict_log_tuple
> > > > (estate=0x5a3467944530, rel=0x7ab405e594e8,
> > > > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS,
> > > > searchslot=0x0,
> > > >     conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936
> > > > #4  0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530,
> > > > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS,
> > > > searchslot=0x0, remoteslot=0x5a346792e498,
> > > >     conflicttuples=0x5a3467942da0) at conflict.c:168
> > > > #5  0x00005a342e348c35 in CheckAndReportConflict
> > > > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530,
> > > > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0,
> > > >     remoteslot=0x5a346792e498) at execReplication.c:793
> > > >
> > > > This can be reproduced by the following steps:
> > > > CREATE PUBLICATION pub;
> > > > CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict');
> > > > ALTER TABLE conflict RENAME TO conflict1;
> > > > CREATE TABLE conflict(c1 varchar, c2 varchar);
> > > > -- Cause a conflict, this will crash while trying to prepare the
> > > > conflicting tuple
> > >
> > > Yeah while it is allowed to drop or alter the conflict log table, it
> > > should not seg fault, IMHO error is acceptable as per the initial
> > > discussion, so I will look into this and tighten up the logic so that
> > > it will throw an error whenever it can not insert into the conflict
> > > log table.
> >
> > I was thinking about the solution that we need to do if table
> > definition is changed, one option is whenever we try to prepare the
> > tuple after acquiring the lock we can validate the table definition if
> > this doesn't qualify the standard conflict log table schema we can
> > ERROR out.  IMHO that should not be an issue as we are only doing this
> > in conflict logging.
>
> Should we emit a warning instead of error, to stay consistent with the
> other exception case where a warning is raised when the conflict log
> table does not exist?
> +       /* Conflict log table is dropped or not accessible. */
> +       if (conflictlogrel == NULL)
> +               ereport(WARNING,
> +                               (errcode(ERRCODE_UNDEFINED_TABLE),
> +                                errmsg("conflict log table \"%s.%s\"
> does not exist",
> +
> get_namespace_name(nspid), conflictlogtable)));

Yes this should be WARNING.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T04:55:19Z

On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > ---
> > > I think the conflict history table should not be transferred to the
> > > new cluster when pg_upgrade since the table definition could be
> > > different across major versions.
> >
> > Let me think more on this with respect to behaviour of other factors
> > like subscriptions etc.
> >
>
> Can we deal with different schema of tables across versions via
> pg_dump/restore during upgrade?
>

While handling the conflict_log_table option in pg_dump, I realized
that the restore tries to create the conflict log table in two
different places: 1) as part of the regular table dump, and 2) as part
of CREATE SUBSCRIPTION when the conflict_log_table option is set.

So one option is to avoid dumping the conflict log tables as part of
the regular table dump, if we think we do not need the conflict log
table data, and let the table get created by the CREATE SUBSCRIPTION
command.  OTOH, if we think we want to keep the conflict log table
data, we let it get dumped as part of the regular tables, and CREATE
SUBSCRIPTION just sets the option without creating the table.  We might
need special handling for this case, though: if we allow existing
tables to be set as conflict log tables, then it may allow other user
tables to be set as well, so we need to think about how to handle this
if we go with this option.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-08T09:08:32Z

On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > ---
> > > > I think the conflict history table should not be transferred to the
> > > > new cluster when pg_upgrade since the table definition could be
> > > > different across major versions.
> > >
> > > Let me think more on this with respect to behaviour of other factors
> > > like subscriptions etc.
> > >
> >
> > Can we deal with different schema of tables across versions via
> > pg_dump/restore during upgrade?
> >
>
> While handling the case of conflict_log_table option during pg_dump, I
> realized that the restore is trying to create conflict log table 2
> different places 1) As part of the regular table dump 2) As part of
> the CREATE SUBSCRIPTION when conflict_log_table option is set.
>
> So one option is we can avoid dumping the conflict log tables as part
> of the regular table dump if we think that we do not need to conflict
> log table data and let it get created as part of the create
> subscription command, OTOH if we think we want to keep the conflict
> log table data,
>

We want to retain conflict_history after upgrade. This is required for
various reasons: (a) after the upgrade, the DBA will still need to
resolve the pending unresolved conflicts, (b) regulations often
require keeping audit trails for a longer period of time. If a
conflict regarding a financial transaction occurred at time X (which
is within the retention period the regulations require), that record
must survive the upgrade, (c) if something breaks after the upgrade
(e.g., missing rows, constraint violations), conflict history helps
trace root causes. It shows whether issues existed before the upgrade
or were introduced during migration, (d) as users can query the
conflict_history tables, they should be treated similarly to user
tables.

BTW, we are also planning to migrate commit_ts data in thread [1]
which would be helpful for conflict_resolutions after upgrade.

> let it get dumped as part of the regular tables and in
> CREATE SUBSCRIPTION we will just set the option but do not create the
> table,
>

Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
doesn't try to create the table again.

> although we might need to do special handling of this case
> because if we allow the existing tables to be set as conflict log
> tables then it may allow other user tables to be set, so need to think
> how to handle this if we need to go with this option.
>

Yeah, probably, but it should be allowed only internally, not to users.
I think we can split this upgrade handling into a top-up patch, at least
for the purpose of review.

[1] - https://www.postgresql.org/message-id/182311743703924%40mail.yandex.ru

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T09:30:47Z

On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > > ---
> > > > > I think the conflict history table should not be transferred to the
> > > > > new cluster when pg_upgrade since the table definition could be
> > > > > different across major versions.
> > > >
> > > > Let me think more on this with respect to behaviour of other factors
> > > > like subscriptions etc.
> > > >
> > >
> > > Can we deal with different schema of tables across versions via
> > > pg_dump/restore during upgrade?
> > >
> >
> > While handling the case of conflict_log_table option during pg_dump, I
> > realized that the restore is trying to create conflict log table 2
> > different places 1) As part of the regular table dump 2) As part of
> > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> >
> > So one option is we can avoid dumping the conflict log tables as part
> > of the regular table dump if we think that we do not need to conflict
> > log table data and let it get created as part of the create
> > subscription command, OTOH if we think we want to keep the conflict
> > log table data,
> >
>
> We want to retain conflict_history after upgrade. This is required for
> various reasons (a) after upgrade DBA user will still require to
> resolved the pending unresolved conflicts, (b) Regulations often
> require keeping audit trails for a longer period of time. If a
> conflict occurred at time X (which is less than the regulations
> requirement) regarding a financial transaction, that record must
> survive the upgrade, (c)
> If something breaks after the upgrade (e.g., missing rows, constraint
> violations), conflict history helps trace root causes. It shows
> whether issues existed before the upgrade or were introduced during
> migration, (d) as users can query the conflict_history tables, it
> should be treated similar to user tables.
>
> BTW, we are also planning to migrate commit_ts data in thread [1]
> which would be helpful for conflict_resolutions after upgrade.
>
> > let it get dumped as part of the regular tables and in
> > CREATE SUBSCRIPTION we will just set the option but do not create the
> > table,
> >
>
> Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> doesn't try to create the table again.
>
> > although we might need to do special handling of this case
> > because if we allow the existing tables to be set as conflict log
> > tables then it may allow other user tables to be set, so need to think
> > how to handle this if we need to go with this option.
> >
>
> Yeah, probably but it should be allowed internally only not to users.

Yeah, I wanted to do that, but the problem is with dump and restore. If
you just dump into a SQL file and then execute that file, the CREATE
SUBSCRIPTION with the conflict_log_table option will fail because the
table already exists, having been restored as part of the dump. I know
that under binary upgrade we have the binary_upgrade flag, so we can do
special handling there, but how do we distinguish SQL executed as part
of a restore from normal SQL executed by a user?
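
To make the failure concrete, here is a minimal Python sketch of replaying
a plain SQL dump. It is a toy model, not PostgreSQL code: ToyCluster,
create_table, create_subscription and replay_dump are hypothetical names,
and the only behavior modeled is that creating a relation that already
exists raises an error.

```python
class DuplicateTable(Exception):
    """Raised when a relation with the same name already exists."""


class ToyCluster:
    """Toy stand-in for a database: tracks table names and subscriptions."""

    def __init__(self):
        self.tables = set()
        self.subscriptions = {}

    def create_table(self, name):
        if name in self.tables:
            raise DuplicateTable(f'relation "{name}" already exists')
        self.tables.add(name)

    def create_subscription(self, subname, conflict_log_table=None):
        # Modeled on the behavior described in this thread: setting the
        # option both creates the table and records it for the subscription.
        if conflict_log_table is not None:
            self.create_table(conflict_log_table)
        self.subscriptions[subname] = conflict_log_table


def replay_dump(cluster):
    """Replay the two statements of a hypothetical plain SQL dump."""
    # 1) The conflict log table is restored by the regular table dump.
    cluster.create_table("clt")
    # 2) CREATE SUBSCRIPTION then tries to create the same table again.
    cluster.create_subscription("sub1", conflict_log_table="clt")
```

In this model replay_dump() raises DuplicateTable at the second step, so
the subscription is never recorded.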

> I think we can split this upgrade handling as a top-up patch at least
> for the purpose of review.

Makes sense.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-08T09:51:40Z

On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > > ---
> > > > > > I think the conflict history table should not be transferred to the
> > > > > > new cluster when pg_upgrade since the table definition could be
> > > > > > different across major versions.
> > > > >
> > > > > Let me think more on this with respect to behaviour of other factors
> > > > > like subscriptions etc.
> > > > >
> > > >
> > > > Can we deal with different schema of tables across versions via
> > > > pg_dump/restore during upgrade?
> > > >
> > >
> > > While handling the case of conflict_log_table option during pg_dump, I
> > > realized that the restore is trying to create conflict log table 2
> > > different places 1) As part of the regular table dump 2) As part of
> > > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> > >
> > > So one option is we can avoid dumping the conflict log tables as part
> > > of the regular table dump if we think that we do not need to conflict
> > > log table data and let it get created as part of the create
> > > subscription command, OTOH if we think we want to keep the conflict
> > > log table data,
> > >
> >
> > We want to retain conflict_history after upgrade. This is required for
> > various reasons (a) after upgrade DBA user will still require to
> > resolved the pending unresolved conflicts, (b) Regulations often
> > require keeping audit trails for a longer period of time. If a
> > conflict occurred at time X (which is less than the regulations
> > requirement) regarding a financial transaction, that record must
> > survive the upgrade, (c)
> > If something breaks after the upgrade (e.g., missing rows, constraint
> > violations), conflict history helps trace root causes. It shows
> > whether issues existed before the upgrade or were introduced during
> > migration, (d) as users can query the conflict_history tables, it
> > should be treated similar to user tables.
> >
> > BTW, we are also planning to migrate commit_ts data in thread [1]
> > which would be helpful for conflict_resolutions after upgrade.
> >
> >  let it get dumped as part of the regular tables and in
> > > CREATE SUBSCRIPTION we will just set the option but do not create the
> > > table,
> > >
> >
> > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> > doesn't try to create the table again.
> >
> > > although we might need to do special handling of this case
> > > because if we allow the existing tables to be set as conflict log
> > > tables then it may allow other user tables to be set, so need to think
> > > how to handle this if we need to go with this option.
> > >
> >
> > Yeah, probably but it should be allowed internally only not to users.
>
> Yeah I wanted to do that, but problem is with dump and restore, I mean
> if you just dump into a sql file and execute the sql file at that time
> the CREATE SUBSCRIPTION with conflict_log_table option will fail as
> the table already exists because it was restored as part of the dump.
> I know under binary upgrade we have binary_upgrade flag so can do
> special handling not sure how to distinguish the sql executing as part
> of the restore or normal sql execution by user?
>

See dumpSubscription(). We always use (connect = false) while dumping a
subscription, so, similarly, we should always dump the new option with
its default value, which is not to create the history table. Won't that
be sufficient?
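
A rough Python sketch of one reading of this suggestion follows. It is
not the pg_dump code: dump_subscription_sql is a hypothetical helper, the
statement is simplified, and no identifier or literal quoting is done.
The only points it illustrates are that connect = false is always
emitted and that the conflict-log option is left at its default by not
being emitted at all.

```python
def dump_subscription_sql(name, conninfo, publications):
    """Build a simplified CREATE SUBSCRIPTION statement for a dump file.

    Hypothetical helper for illustration; real pg_dump quotes identifiers
    and literals and emits more options than shown here.
    """
    pubs = ", ".join(publications)
    # connect = false keeps the restore from contacting the publisher.
    # The conflict-log option is deliberately not emitted, so it keeps
    # its default and the restore does not try to create the table.
    options = ["connect = false"]
    return (
        f"CREATE SUBSCRIPTION {name} CONNECTION '{conninfo}' "
        f"PUBLICATION {pubs} WITH ({', '.join(options)});"
    )
```

This leaves open how the restored subscription would be linked to the
table that the regular table dump restores.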

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T11:45:37Z

On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > >
> > > > > > > ---
> > > > > > > I think the conflict history table should not be transferred to the
> > > > > > > new cluster when pg_upgrade since the table definition could be
> > > > > > > different across major versions.
> > > > > >
> > > > > > Let me think more on this with respect to behaviour of other factors
> > > > > > like subscriptions etc.
> > > > > >
> > > > >
> > > > > Can we deal with different schema of tables across versions via
> > > > > pg_dump/restore during upgrade?
> > > > >
> > > >
> > > > While handling the case of conflict_log_table option during pg_dump, I
> > > > realized that the restore is trying to create conflict log table 2
> > > > different places 1) As part of the regular table dump 2) As part of
> > > > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> > > >
> > > > So one option is we can avoid dumping the conflict log tables as part
> > > > of the regular table dump if we think that we do not need to conflict
> > > > log table data and let it get created as part of the create
> > > > subscription command, OTOH if we think we want to keep the conflict
> > > > log table data,
> > > >
> > >
> > > We want to retain conflict_history after upgrade. This is required for
> > > various reasons (a) after upgrade DBA user will still require to
> > > resolved the pending unresolved conflicts, (b) Regulations often
> > > require keeping audit trails for a longer period of time. If a
> > > conflict occurred at time X (which is less than the regulations
> > > requirement) regarding a financial transaction, that record must
> > > survive the upgrade, (c)
> > > If something breaks after the upgrade (e.g., missing rows, constraint
> > > violations), conflict history helps trace root causes. It shows
> > > whether issues existed before the upgrade or were introduced during
> > > migration, (d) as users can query the conflict_history tables, it
> > > should be treated similar to user tables.
> > >
> > > BTW, we are also planning to migrate commit_ts data in thread [1]
> > > which would be helpful for conflict_resolutions after upgrade.
> > >
> > >  let it get dumped as part of the regular tables and in
> > > > CREATE SUBSCRIPTION we will just set the option but do not create the
> > > > table,
> > > >
> > >
> > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> > > doesn't try to create the table again.
> > >
> > > > although we might need to do special handling of this case
> > > > because if we allow the existing tables to be set as conflict log
> > > > tables then it may allow other user tables to be set, so need to think
> > > > how to handle this if we need to go with this option.
> > > >
> > >
> > > Yeah, probably but it should be allowed internally only not to users.
> >
> > Yeah I wanted to do that, but problem is with dump and restore, I mean
> > if you just dump into a sql file and execute the sql file at that time
> > the CREATE SUBSCRIPTION with conflict_log_table option will fail as
> > the table already exists because it was restored as part of the dump.
> > I know under binary upgrade we have binary_upgrade flag so can do
> > special handling not sure how to distinguish the sql executing as part
> > of the restore or normal sql execution by user?
> >
>
> See dumpSubscription(). We always use (connect = false) while dumping
> subscription, so, similarly, we should always dump the new option with
> default value which not to create the history table. Won't that be
> sufficient?

Thinking out loud: what we need is to create the subscription and set
the conflict log table in the subscription's catalog entry in
pg_subscription, without creating the conflict log table itself. So it
seems we need to invent something new that sets the conflict log table
in the catalog but does not create the table.  Currently we have a
single option: if conflict_log_table='table_name' is set, we create the
table and also set the table name in the catalog. So we need to think of
something along the lines of separating the two, or something more
innovative.
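
The separation could look roughly like the Python sketch below. This is
only a toy model: the create_table flag is a hypothetical name invented
for illustration, not a proposed option, and the check that an existing
table must be present when create_table is false is an assumption.

```python
def create_subscription(catalog, tables, subname,
                        conflict_log_table=None, create_table=True):
    """Record the conflict log table for a subscription.

    catalog maps subscription name -> conflict log table name (a toy
    stand-in for pg_subscription); tables is the set of existing tables.
    create_table is a hypothetical flag that separates "record the name
    in the catalog" from "create the table".
    """
    if conflict_log_table is not None:
        if create_table:
            if conflict_log_table in tables:
                raise ValueError(
                    f'relation "{conflict_log_table}" already exists')
            tables.add(conflict_log_table)
        elif conflict_log_table not in tables:
            # Assumption: linking without creating needs an existing table.
            raise ValueError(
                f'relation "{conflict_log_table}" does not exist')
    catalog[subname] = conflict_log_table
```

With create_table=False, a restore could link a subscription to a table
that was already restored by the regular table dump.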

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-09T04:42:11Z

On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > >
> > > > > > > > ---
> > > > > > > > I think the conflict history table should not be transferred to the
> > > > > > > > new cluster when pg_upgrade since the table definition could be
> > > > > > > > different across major versions.
> > > > > > >
> > > > > > > Let me think more on this with respect to behaviour of other factors
> > > > > > > like subscriptions etc.
> > > > > > >
> > > > > >
> > > > > > Can we deal with different schema of tables across versions via
> > > > > > pg_dump/restore during upgrade?
> > > > > >
> > > > >
> > > > > While handling the case of conflict_log_table option during pg_dump, I
> > > > > realized that the restore is trying to create conflict log table 2
> > > > > different places 1) As part of the regular table dump 2) As part of
> > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> > > > >
> > > > > So one option is we can avoid dumping the conflict log tables as part
> > > > > of the regular table dump if we think that we do not need to conflict
> > > > > log table data and let it get created as part of the create
> > > > > subscription command, OTOH if we think we want to keep the conflict
> > > > > log table data,
> > > > >
> > > >
> > > > We want to retain conflict_history after upgrade. This is required for
> > > > various reasons (a) after upgrade DBA user will still require to
> > > > resolved the pending unresolved conflicts, (b) Regulations often
> > > > require keeping audit trails for a longer period of time. If a
> > > > conflict occurred at time X (which is less than the regulations
> > > > requirement) regarding a financial transaction, that record must
> > > > survive the upgrade, (c)
> > > > If something breaks after the upgrade (e.g., missing rows, constraint
> > > > violations), conflict history helps trace root causes. It shows
> > > > whether issues existed before the upgrade or were introduced during
> > > > migration, (d) as users can query the conflict_history tables, it
> > > > should be treated similar to user tables.
> > > >
> > > > BTW, we are also planning to migrate commit_ts data in thread [1]
> > > > which would be helpful for conflict_resolutions after upgrade.
> > > >
> > > >  let it get dumped as part of the regular tables and in
> > > > > CREATE SUBSCRIPTION we will just set the option but do not create the
> > > > > table,
> > > > >
> > > >
> > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> > > > doesn't try to create the table again.
> > > >
> > > > > although we might need to do special handling of this case
> > > > > because if we allow the existing tables to be set as conflict log
> > > > > tables then it may allow other user tables to be set, so need to think
> > > > > how to handle this if we need to go with this option.
> > > > >
> > > >
> > > > Yeah, probably but it should be allowed internally only not to users.
> > >
> > > Yeah I wanted to do that, but problem is with dump and restore, I mean
> > > if you just dump into a sql file and execute the sql file at that time
> > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as
> > > the table already exists because it was restored as part of the dump.
> > > I know under binary upgrade we have binary_upgrade flag so can do
> > > special handling not sure how to distinguish the sql executing as part
> > > of the restore or normal sql execution by user?
> > >
> >
> > See dumpSubscription(). We always use (connect = false) while dumping
> > subscription, so, similarly, we should always dump the new option with
> > default value which not to create the history table. Won't that be
> > sufficient?
>
> Thinking out loud, so basically what we need is we need to create
> subscription and set the conflict log table in catalog entry of the
> subscription in pg_subscription but do not want to create the conflict
> log table, so seems like we need to invent something new which set the
> conflict log table in catalog but do not create the table.  Currently
> we have a single option that if conflict_log_table='table_name' is set
> then we will create the table as well as set the table name in the
> catalog, so need to think of something on the line of separating this,
> or something more innovative.
>

This needs more thought and discussion, so it is better to separate
out this part at this stage and let's try to review the core patch
first. BTW, a few days back I suggested having two options (instead of
a single option conflict_log_table) to allow extending to more ways to
LOG the conflict data.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-09T06:36:34Z

On Tue, Dec 9, 2025 at 10:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > I think the conflict history table should not be transferred to the
> > > > > > > > > new cluster when pg_upgrade since the table definition could be
> > > > > > > > > different across major versions.
> > > > > > > >
> > > > > > > > Let me think more on this with respect to behaviour of other factors
> > > > > > > > like subscriptions etc.
> > > > > > > >
> > > > > > >
> > > > > > > Can we deal with different schema of tables across versions via
> > > > > > > pg_dump/restore during upgrade?
> > > > > > >
> > > > > >
> > > > > > While handling the case of conflict_log_table option during pg_dump, I
> > > > > > realized that the restore is trying to create conflict log table 2
> > > > > > different places 1) As part of the regular table dump 2) As part of
> > > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> > > > > >
> > > > > > So one option is we can avoid dumping the conflict log tables as part
> > > > > > of the regular table dump if we think that we do not need to conflict
> > > > > > log table data and let it get created as part of the create
> > > > > > subscription command, OTOH if we think we want to keep the conflict
> > > > > > log table data,
> > > > > >
> > > > >
> > > > > We want to retain conflict_history after upgrade. This is required for
> > > > > various reasons (a) after upgrade DBA user will still require to
> > > > > resolved the pending unresolved conflicts, (b) Regulations often
> > > > > require keeping audit trails for a longer period of time. If a
> > > > > conflict occurred at time X (which is less than the regulations
> > > > > requirement) regarding a financial transaction, that record must
> > > > > survive the upgrade, (c)
> > > > > If something breaks after the upgrade (e.g., missing rows, constraint
> > > > > violations), conflict history helps trace root causes. It shows
> > > > > whether issues existed before the upgrade or were introduced during
> > > > > migration, (d) as users can query the conflict_history tables, it
> > > > > should be treated similar to user tables.
> > > > >
> > > > > BTW, we are also planning to migrate commit_ts data in thread [1]
> > > > > which would be helpful for conflict_resolutions after upgrade.
> > > > >
> > > > >  let it get dumped as part of the regular tables and in
> > > > > > CREATE SUBSCRIPTION we will just set the option but do not create the
> > > > > > table,
> > > > > >
> > > > >
> > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> > > > > doesn't try to create the table again.
> > > > >
> > > > > > although we might need to do special handling of this case
> > > > > > because if we allow the existing tables to be set as conflict log
> > > > > > tables then it may allow other user tables to be set, so need to think
> > > > > > how to handle this if we need to go with this option.
> > > > > >
> > > > >
> > > > > Yeah, probably but it should be allowed internally only not to users.
> > > >
> > > > Yeah I wanted to do that, but problem is with dump and restore, I mean
> > > > if you just dump into a sql file and execute the sql file at that time
> > > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as
> > > > the table already exists because it was restored as part of the dump.
> > > > I know under binary upgrade we have binary_upgrade flag so can do
> > > > special handling not sure how to distinguish the sql executing as part
> > > > of the restore or normal sql execution by user?
> > > >
> > >
> > > See dumpSubscription(). We always use (connect = false) while dumping
> > > subscription, so, similarly, we should always dump the new option with
> > > default value which not to create the history table. Won't that be
> > > sufficient?
> >
> > Thinking out loud, so basically what we need is we need to create
> > subscription and set the conflict log table in catalog entry of the
> > subscription in pg_subscription but do not want to create the conflict
> > log table, so seems like we need to invent something new which set the
> > conflict log table in catalog but do not create the table.  Currently
> > we have a single option that if conflict_log_table='table_name' is set
> > then we will create the table as well as set the table name in the
> > catalog, so need to think of something on the line of separating this,
> > or something more innovative.
> >
>
> This needs more thought and discussion, so it is better to separate
> out this part at this stage and let's try to review the core patch
> first.

+1

> BTW, I told a few days back to have two options (instead of a
> single option conflict_log_table) to allow extension of more ways to
> LOG the conflict data.

Yeah, I will put that in an add-on patch as well, once I fix all the
option issues of the core patch.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-09T15:11:21Z

On Tue, Dec 9, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 9, 2025 at 10:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > I think the conflict history table should not be transferred to the
> > > > > > > > > > new cluster when pg_upgrade since the table definition could be
> > > > > > > > > > different across major versions.
> > > > > > > > >
> > > > > > > > > Let me think more on this with respect to behaviour of other factors
> > > > > > > > > like subscriptions etc.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Can we deal with different schema of tables across versions via
> > > > > > > > pg_dump/restore during upgrade?
> > > > > > > >
> > > > > > >
> > > > > > > While handling the case of conflict_log_table option during pg_dump, I
> > > > > > > realized that the restore is trying to create conflict log table 2
> > > > > > > different places 1) As part of the regular table dump 2) As part of
> > > > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set.
> > > > > > >
> > > > > > > So one option is we can avoid dumping the conflict log tables as part
> > > > > > > of the regular table dump if we think that we do not need to conflict
> > > > > > > log table data and let it get created as part of the create
> > > > > > > subscription command, OTOH if we think we want to keep the conflict
> > > > > > > log table data,
> > > > > > >
> > > > > >
> > > > > > We want to retain conflict_history after upgrade. This is required for
> > > > > > various reasons (a) after upgrade DBA user will still require to
> > > > > > resolved the pending unresolved conflicts, (b) Regulations often
> > > > > > require keeping audit trails for a longer period of time. If a
> > > > > > conflict occurred at time X (which is less than the regulations
> > > > > > requirement) regarding a financial transaction, that record must
> > > > > > survive the upgrade, (c)
> > > > > > If something breaks after the upgrade (e.g., missing rows, constraint
> > > > > > violations), conflict history helps trace root causes. It shows
> > > > > > whether issues existed before the upgrade or were introduced during
> > > > > > migration, (d) as users can query the conflict_history tables, it
> > > > > > should be treated similar to user tables.
> > > > > >
> > > > > > BTW, we are also planning to migrate commit_ts data in thread [1]
> > > > > > which would be helpful for conflict_resolutions after upgrade.
> > > > > >
> > > > > >  let it get dumped as part of the regular tables and in
> > > > > > > CREATE SUBSCRIPTION we will just set the option but do not create the
> > > > > > > table,
> > > > > > >
> > > > > >
> > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it
> > > > > > doesn't try to create the table again.
> > > > > >
> > > > > > > although we might need to do special handling of this case
> > > > > > > because if we allow the existing tables to be set as conflict log
> > > > > > > tables then it may allow other user tables to be set, so need to think
> > > > > > > how to handle this if we need to go with this option.
> > > > > > >
> > > > > >
> > > > > > Yeah, probably but it should be allowed internally only not to users.
> > > > >
> > > > > Yeah I wanted to do that, but problem is with dump and restore, I mean
> > > > > if you just dump into a sql file and execute the sql file at that time
> > > > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as
> > > > > the table already exists because it was restored as part of the dump.
> > > > > I know under binary upgrade we have binary_upgrade flag so can do
> > > > > special handling not sure how to distinguish the sql executing as part
> > > > > of the restore or normal sql execution by user?
> > > > >
> > > >
> > > > See dumpSubscription(). We always use (connect = false) while dumping
> > > > subscription, so, similarly, we should always dump the new option with
> > > > default value which not to create the history table. Won't that be
> > > > sufficient?
> > >
> > > Thinking out loud, so basically what we need is we need to create
> > > subscription and set the conflict log table in catalog entry of the
> > > subscription in pg_subscription but do not want to create the conflict
> > > log table, so seems like we need to invent something new which set the
> > > conflict log table in catalog but do not create the table.  Currently
> > > we have a single option that if conflict_log_table='table_name' is set
> > > then we will create the table as well as set the table name in the
> > > catalog, so need to think of something on the line of separating this,
> > > or something more innovative.
> > >
> >
> > This needs more thought and discussion, so it is better to separate
> > out this part at this stage and let's try to review the core patch
> > first.
>
> +1
>
> BTW, I told a few days back to have two options (instead of a
> > single option conflict_log_table) to allow extension of more ways to
> > LOG the conflict data.
>
> Yeah, I will put that as well in an add on patch, once I fix all the
> option issues of the core patch.
>
Here is the updated version of the patch.

What has changed:
1. The table is created using heap_create_with_catalog() instead of SPI,
as suggested by Sawada-San and Amit Kapila.
2. The table schema is validated after acquiring the lock and before
preparing/inserting conflict tuples, for the defects raised by Vignesh.
3. Bug fixes for the issues raised by Shveta (segfault).
4. Comments from Peter (except exposing the namespace in \dRs+, which is
still pending).

What's not done/pending:
1. Adding for key_tuple/RI, as pointed out by Shveta; will do in the
next version.
2. Adding a dependency of the subscription on the table so that we are
not allowed to drop the table. I think when we put a dependency on
shared objects, those cannot be dropped even with the cascade option,
but I am still exploring this.
3. dump/restore and upgrade: I have a partially working patch, but I
still need to figure out how to skip table creation while creating the
subscription. While discussing off-list with Hannu, he suggested we
could do something with dump dependency ordering, e.g. dump CREATE
SUBSCRIPTION first and then dump the CLT data without actually dumping
the CLT definition. With that, the table will be created while creating
the subscription and then the data will be restored with the COPY
command. I will explore this further.
4. Test case for conflict insertion.
5. Documentation patch.
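
The dump-ordering idea in pending item 3 can be sketched as a small
dependency sort in Python. This is an illustration under assumptions, not
pg_dump's implementation: the item labels, the deps mapping and
restore_order are all hypothetical, and the point is only that the table
data is made to depend on the subscription while the table definition is
skipped.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def restore_order(deps, skip):
    """Order dump items so dependencies come first, omitting skipped items."""
    graph = {
        item: {d for d in needs if d not in skip}
        for item, needs in deps.items()
        if item not in skip
    }
    return list(TopologicalSorter(graph).static_order())


# Hypothetical dump items: each maps to the items it must be restored after.
deps = {
    "TABLE clt": set(),
    "SUBSCRIPTION sub1": set(),
    # The data depends on the subscription, which creates the table.
    "TABLE DATA clt": {"SUBSCRIPTION sub1"},
}

# The table definition is not dumped; only its data is.
order = restore_order(deps, skip={"TABLE clt"})
```

Here order places "SUBSCRIPTION sub1" before "TABLE DATA clt", so the
COPY of the data would run after the subscription has created the table.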


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-11T11:34:19Z

On Tue, Dec 9, 2025 at 8:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> Here is the updated version of patch
> What has changed
> 1. Table is created using create_heap_with_catalog() instead of SPI as
> suggested by Sawada-San and Amit Kapila.
> 2. Validated the table schema after acquiring the lock before
> preparing/inserting conflict tuples for defects raised by Vignesh.
> 3. Bug fixes raised by Shweta (segfault)
> 3. Comments from Peter (except exposing namespace in \dRs+, it's still pending.
>

Thanks for the patch.
I tested all conflict types on this version; the basic scenarios seem
to work well. Apart from the pending key-RI issue, the other issues
seem to be addressed. I will start with code review now.

A few observations:

1)
\dRs+ shows 'Conflict log table' without the namespace; this could be
confusing if a table with the same name exists in multiple schemas.

2)
When we do the below:
alter subscription sub1 SET (conflict_log_table=clt2);

the previous conflict log table is dropped. Is this behavior
intentional and discussed/concluded earlier? It’s possible that a user
may want to create a new conflict log table for future events while
still retaining the old one for analysis. If the subscription itself
is dropped, then dropping the CLT makes sense, but I’m not sure this
behavior is intended for ALTER SUBSCRIPTION.  I do understand that
once we unlink the CLT from the subscription, even a later DROP
SUBSCRIPTION cannot drop it, but the user can always drop it when it is
not needed.

If we plan to keep the existing behavior, it should be clearly documented
in a CAUTION section, and the command should explicitly log the table
drop.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-11T11:40:19Z

On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 9, 2025 at 8:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > >
> > Here is the updated version of patch
> > What has changed
> > 1. Table is created using create_heap_with_catalog() instead of SPI as
> > suggested by Sawada-San and Amit Kapila.
> > 2. Validated the table schema after acquiring the lock before
> > preparing/inserting conflict tuples for defects raised by Vignesh.
> > 3. Bug fixes raised by Shweta (segfault)
> > 3. Comments from Peter (except exposing namespace in \dRs+, it's still pending.
> >
>
> Thanks for the patch.
> I tested all conflict-types on this version, they (basic scenarios)
> seem to work well. Except only that key-RI pending issue, other issues
> seem to be addressed. I will start with code-review now.
>
> Few observations:
>
> 1)
> \dRs+  shows 'Conflict log table' without namespace, this could be
> confusing if the same table exists in multiple schemas.

Yeah, this comment is not yet fixed; I will fix it in the next version.

> 2)
> When we do below:
> alter subscription sub1 SET (conflict_log_table=clt2);
>
> the previous conflict log table is dropped. Is this behavior
> intentional and discussed/concluded earlier? It’s possible that a user
> may want to create a new conflict log table for future events while
> still retaining the old one for analysis. If the subscription itself
> is dropped, then dropping the CLT makes sense, but I’m not sure this
> behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> once we unlink CLT from subscription, later even DROP subscription
> cannot drop it, but user can always drop it when not needed.
>
> If we plan to keep existing behavior, it should be clearly documented
> in a CAUTION section, and the command should explicitly log the table
> drop.

Yeah, we discussed this behavior, and the conclusion was that we would
document it and that it is the user's responsibility to take the
necessary backup of the conflict log table data if they are setting a
new log table or NONE for the subscription.
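
As a toy illustration of that behavior (not code from the patch), the
Python sketch below models switching the conflict log table.
alter_conflict_log_table is a hypothetical name, tables maps a table name
to its rows, and new_table=None stands for NONE.

```python
def alter_conflict_log_table(tables, catalog, subname, new_table):
    """Switch a subscription's conflict log table in a toy model.

    Follows the behavior described in this thread: the previously linked
    table is dropped when a different table (or NONE) is set.
    """
    old_table = catalog.get(subname)
    if old_table is not None and old_table != new_table:
        # The previous conflict log table is dropped along with its rows.
        tables.pop(old_table, None)
    if new_table is not None:
        tables.setdefault(new_table, [])
    catalog[subname] = new_table


tables = {"clt1": ["conflict row 1", "conflict row 2"]}
catalog = {"sub1": "clt1"}

# Back up the old rows first; nothing else preserves them in this model.
backup = list(tables["clt1"])
alter_conflict_log_table(tables, catalog, "sub1", "clt2")
```

After the call, clt1 is gone from tables and only the backup still holds
the earlier conflict rows.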

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-11T12:26:59Z

On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> > 2)
> > When we do below:
> > alter subscription sub1 SET (conflict_log_table=clt2);
> >
> > the previous conflict log table is dropped. Is this behavior
> > intentional and discussed/concluded earlier? It’s possible that a user
> > may want to create a new conflict log table for future events while
> > still retaining the old one for analysis. If the subscription itself
> > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> > once we unlink CLT from subscription, later even DROP subscription
> > cannot drop it, but user can always drop it when not needed.
> >
> > If we plan to keep existing behavior, it should be clearly documented
> > in a CAUTION section, and the command should explicitly log the table
> > drop.
>
> Yeah we discussed this behavior and the conclusion was we would
> document this behavior and its user's responsibility to take necessary
> backup of the conflict log table data if they are setting a new log
> table or NONE for the subscription.
>

+1. If we don't do this, it will be difficult for postgres or users to
keep track of the previous conflict history tables.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-11T14:19:29Z

On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > 2)
> > > When we do below:
> > > alter subscription sub1 SET (conflict_log_table=clt2);
> > >
> > > the previous conflict log table is dropped. Is this behavior
> > > intentional and discussed/concluded earlier? It’s possible that a user
> > > may want to create a new conflict log table for future events while
> > > still retaining the old one for analysis. If the subscription itself
> > > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> > > once we unlink CLT from subscription, later even DROP subscription
> > > cannot drop it, but user can always drop it when not needed.
> > >
> > > If we plan to keep existing behavior, it should be clearly documented
> > > in a CAUTION section, and the command should explicitly log the table
> > > drop.
> >
> > Yeah we discussed this behavior and the conclusion was we would
> > document this behavior and its user's responsibility to take necessary
> > backup of the conflict log table data if they are setting a new log
> > table or NONE for the subscription.
> >
>
> +1. If we don't do this then it will be difficult to track for
> postgres or users the previous conflict history tables.

Right, it makes sense.

The attached patch fixes most of the open comments:
1) \dRs+ now shows the schema-qualified name.
2) Both key_tuple and the replica_identity tuple are now added to the
conflict log tuple wherever applicable.
3) Refactored the code so that the conflict log table schema is defined
only once in the header file, and both create_conflict_log_table
and ValidateConflictLogTable use it.

I was considering the interdependence between the subscription and the
conflict log table (CLT). IMHO, it would be logical to establish the
subscription as dependent on the CLT. This way, if someone attempts to
drop the CLT, the system would recognize the dependency of the
subscription and prevent the drop unless the subscription is removed
first or the CASCADE option is used.

However, while investigating this, I encountered an error [1] stating
that global objects are not supported in this context. This indicates
that global objects cannot be made dependent on local objects.
Although making an object dependent on global/shared objects is
possible for certain types of shared objects [2], this is not our main
objective.

We do not need to make the CLT dependent on the subscription because
the table can be dropped when the subscription is dropped anyway and
we are already doing it as part of drop subscription as well as alter
subscription when CLT is set to NONE or a different table. Therefore,
extending the functionality of shared dependency is unnecessary for
this purpose.

Thoughts?

[1]
doDeletion()
{
....
/*
* These global object types are not supported here.
*/
case AuthIdRelationId:
case DatabaseRelationId:
case TableSpaceRelationId:
case SubscriptionRelationId:
case ParameterAclRelationId:
elog(ERROR, "global objects cannot be deleted by doDeletion");
break;
}

[2]
typedef enum SharedDependencyType
{
SHARED_DEPENDENCY_OWNER = 'o',
SHARED_DEPENDENCY_ACL = 'a',
SHARED_DEPENDENCY_INITACL = 'i',
SHARED_DEPENDENCY_POLICY = 'r',
SHARED_DEPENDENCY_TABLESPACE = 't',
SHARED_DEPENDENCY_INVALID = 0,
} SharedDependencyType;

Pending Items are:
1. Handling dump/upgrade
2. Test case for conflict insertion
3. Documentation patch

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-12T03:49:01Z

On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > 2)
> > > > When we do below:
> > > > alter subscription sub1 SET (conflict_log_table=clt2);
> > > >
> > > > the previous conflict log table is dropped. Is this behavior
> > > > intentional and discussed/concluded earlier? It’s possible that a user
> > > > may want to create a new conflict log table for future events while
> > > > still retaining the old one for analysis. If the subscription itself
> > > > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > > > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> > > > once we unlink CLT from subscription, later even DROP subscription
> > > > cannot drop it, but user can always drop it when not needed.
> > > >
> > > > If we plan to keep existing behavior, it should be clearly documented
> > > > in a CAUTION section, and the command should explicitly log the table
> > > > drop.
> > >
> > > Yeah we discussed this behavior and the conclusion was we would
> > > document this behavior and its user's responsibility to take necessary
> > > backup of the conflict log table data if they are setting a new log
> > > table or NONE for the subscription.
> > >
> >
> > +1. If we don't do this then it will be difficult to track for
> > postgres or users the previous conflict history tables.
>
> Right, it makes sense.

Okay, right.

>
> Attached patch fixed most of the open comments
> 1) \dRs+ now show the schema qualified name
> 2) Now key_tuple and replica_identify tuple both are add in conflict
> log tuple wherever applicable
> 3) Refactored the code so that we can define the conflict log table
> schema only once in the header file and both create_conflict_log_table
> and ValidateConflictLogTable use it.
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>
> We do not need to make the CLT dependent on the subscription because
> the table can be dropped when the subscription is dropped anyway and
> we are already doing it as part of drop subscription as well as alter
> subscription when CLT is set to NONE or a different table. Therefore,
> extending the functionality of shared dependency is unnecessary for
> this purpose.
>
> Thoughts?

I believe the recommendation to create a dependency was meant to
prevent the table from being accidentally dropped during a DROP SCHEMA
or DROP TABLE operation. That risk still remains, regardless of the
fact that dropping or altering a subscription will result in the table
removal. I will give this more thought and let you know if anything
comes to mind.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-12T04:11:58Z

On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > We do not need to make the CLT dependent on the subscription because
> > the table can be dropped when the subscription is dropped anyway and
> > we are already doing it as part of drop subscription as well as alter
> > subscription when CLT is set to NONE or a different table. Therefore,
> > extending the functionality of shared dependency is unnecessary for
> > this purpose.
> >
> > Thoughts?
>
> I believe the recommendation to create a dependency was meant to
> prevent the table from being accidentally dropped during a DROP SCHEMA
> or DROP TABLE operation. That risk still remains, regardless of the
> fact that dropping or altering a subscription will result in the table
> removal. I will give this more thought and let you know if anything
> comes to mind.

I mean we can register the dependency of the subscription on the table,
and that will prevent dropping the table via DROP TABLE/DROP SCHEMA, but
what I do not like is the internal error [1] in doDeletion() when
someone tries to run DROP TABLE CLT CASCADE.

I suggest an alternative approach for handling this: implement a check
within the ALTER/DROP table commands. If the table is a CLT (using
IsConflictLogTable() to verify), these operations should be
disallowed. This would enhance the robustness of CLT handling by
entirely preventing external drop/alter actions. What are your
thoughts on this solution? And let's also see what Amit and Sawada-san
think about this solution.
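
To make the idea concrete, here is a rough standalone sketch of such a
guard. It is only a toy model, not patch code: ToySub, toy_find_clt_owner()
and toy_table_ddl_allowed() are hypothetical names, and the lookup stands
in for whatever IsConflictLogTable() does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for illustration only; these are not PostgreSQL types. */
typedef unsigned int ToyOid;

typedef struct ToySub
{
    const char *subname;
    ToyOid      clt_relid;      /* 0 means "no conflict log table" */
} ToySub;

/* Return the subscription that uses relid as its CLT, or NULL if none. */
const ToySub *
toy_find_clt_owner(const ToySub *subs, size_t nsubs, ToyOid relid)
{
    for (size_t i = 0; i < nsubs; i++)
    {
        if (subs[i].clt_relid != 0 && subs[i].clt_relid == relid)
            return &subs[i];
    }
    return NULL;
}

/*
 * Guard for a hypothetical DROP/ALTER TABLE path.  Returns true when the
 * command may proceed; otherwise writes a message to errbuf and returns
 * false.
 */
bool
toy_table_ddl_allowed(const ToySub *subs, size_t nsubs, ToyOid relid,
                      char *errbuf, size_t errlen)
{
    const ToySub *owner = toy_find_clt_owner(subs, nsubs, relid);

    assert(errbuf != NULL && errlen > 0);

    if (owner == NULL)
        return true;

    snprintf(errbuf, errlen,
             "table is a conflict log table; subscription \"%s\" depends on it",
             owner->subname);
    return false;
}
```

In this model the check runs before any catalog change, so a rejected
command leaves everything untouched.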

[1]
doDeletion()
{
....
/*
* These global object types are not supported here.
*/
case AuthIdRelationId:
case DatabaseRelationId:
case TableSpaceRelationId:
case SubscriptionRelationId:
case ParameterAclRelationId:
elog(ERROR, "global objects cannot be deleted by doDeletion");
break;
}

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-12T04:32:18Z

On Fri, Dec 12, 2025 at 9:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > We do not need to make the CLT dependent on the subscription because
> > > the table can be dropped when the subscription is dropped anyway and
> > > we are already doing it as part of drop subscription as well as alter
> > > subscription when CLT is set to NONE or a different table. Therefore,
> > > extending the functionality of shared dependency is unnecessary for
> > > this purpose.
> > >
> > > Thoughts?
> >
> > I believe the recommendation to create a dependency was meant to
> > prevent the table from being accidentally dropped during a DROP SCHEMA
> > or DROP TABLE operation. That risk still remains, regardless of the
> > fact that dropping or altering a subscription will result in the table
> > removal. I will give this more thought and let you know if anything
> > comes to mind.
>
> I mean we can register the dependency of subscriber on table and that
> will prevent dropping the tables via DROP TABLE/DROP SCHEMA, but what
> I do not like is the internal error[1] in doDeletion() when someone
> will try to DROP TABLE CLT CASCADE;
>

Yes, I understand that part.

> I suggest an alternative approach for handling this: implement a check
> within the ALTER/DROP table commands. If the table is a CLT (using
> IsConflictLogTable() to verify), these operations should be
> disallowed. This would enhance the robustness of CLT handling by
> entirely preventing external drop/alter actions. What are your
> thoughts on this solution? And let's also see what Amit and Sawada-san
> think about this solution.

I had similar thoughts, but was unsure how this should behave when a
user runs DROP SCHEMA … CASCADE. We can’t simply block the entire
operation with an error just because the schema contains a CLT, but we
also shouldn’t allow it to proceed without notifying the user that the
schema includes a CLT.

>
> [1]
> doDeletion()
> {
> ....
> /*
> * These global object types are not supported here.
> */
> case AuthIdRelationId:
> case DatabaseRelationId:
> case TableSpaceRelationId:
> case SubscriptionRelationId:
> case ParameterAclRelationId:
> elog(ERROR, "global objects cannot be deleted by doDeletion");
> break;
> }
>
> --
> Regards,
> Dilip Kumar
> Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-12T04:58:52Z

On Fri, Dec 12, 2025 at 10:02 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Dec 12, 2025 at 9:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > We do not need to make the CLT dependent on the subscription because
> > > > the table can be dropped when the subscription is dropped anyway and
> > > > we are already doing it as part of drop subscription as well as alter
> > > > subscription when CLT is set to NONE or a different table. Therefore,
> > > > extending the functionality of shared dependency is unnecessary for
> > > > this purpose.
> > > >
> > > > Thoughts?
> > >
> > > I believe the recommendation to create a dependency was meant to
> > > prevent the table from being accidentally dropped during a DROP SCHEMA
> > > or DROP TABLE operation. That risk still remains, regardless of the
> > > fact that dropping or altering a subscription will result in the table
> > > removal. I will give this more thought and let you know if anything
> > > comes to mind.
> >
> > I mean we can register the dependency of subscriber on table and that
> > will prevent dropping the tables via DROP TABLE/DROP SCHEMA, but what
> > I do not like is the internal error[1] in doDeletion() when someone
> > will try to DROP TABLE CLT CASCADE;
> >
>
> Yes, I understand that part.
>
> > I suggest an alternative approach for handling this: implement a check
> > within the ALTER/DROP table commands. If the table is a CLT (using
> > IsConflictLogTable() to verify), these operations should be
> > disallowed. This would enhance the robustness of CLT handling by
> > entirely preventing external drop/alter actions. What are your
> > thoughts on this solution? And let's also see what Amit and Sawada-san
> > think about this solution.
>
> I had similar thoughts, but was unsure how this should behave when a
> user runs DROP SCHEMA … CASCADE. We can’t simply block the entire
> operation with an error just because the schema contains a CLT, but we
> also shouldn’t allow it to proceed without notifying the user that the
> schema includes a CLT.

I understand your concern about whether this restriction is
appropriate, particularly for DROP SCHEMA … CASCADE.
However, considering the logical dependency where the subscription
relies on the table (CLT), expecting DROP SCHEMA … CASCADE to drop the
CLT implies it should also drop the dependent subscription, which is
not permitted. Therefore, a more appropriate behavior would be to
issue an error message stating that the table is a conflict log table
and that subscriber "<subname>" depends on it. This message should
instruct the user to either drop the subscription or reset the
conflict log table before proceeding with the drop operation.
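
As a rough illustration of that behavior, the toy model below refuses the
whole schema drop if any member table is a CLT and reports the owning
subscription, so nothing is dropped partially. All names here (ToyRel,
toy_drop_schema_cascade) are hypothetical and exist only for this sketch;
the real dependency machinery works differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;

typedef struct ToyRel
{
    ToyOid  relid;
    ToyOid  nspid;          /* schema that contains the table */
    ToyOid  owner_subid;    /* subscription using it as CLT, 0 if none */
    bool    dropped;
} ToyRel;

/*
 * Drop every table in schema nspid.  If any member is a conflict log
 * table, nothing is dropped, *blocking_subid is set to the owning
 * subscription and -1 is returned.  Otherwise returns the number of
 * tables dropped.
 */
int
toy_drop_schema_cascade(ToyRel *rels, size_t nrels, ToyOid nspid,
                        ToyOid *blocking_subid)
{
    int ndropped = 0;

    assert(blocking_subid != NULL);
    *blocking_subid = 0;

    /* Pass 1: look for a blocker before touching anything. */
    for (size_t i = 0; i < nrels; i++)
    {
        if (!rels[i].dropped && rels[i].nspid == nspid &&
            rels[i].owner_subid != 0)
        {
            *blocking_subid = rels[i].owner_subid;
            return -1;
        }
    }

    /* Pass 2: no blocker found, so drop all members of the schema. */
    for (size_t i = 0; i < nrels; i++)
    {
        if (!rels[i].dropped && rels[i].nspid == nspid)
        {
            rels[i].dropped = true;
            ndropped++;
        }
    }
    return ndropped;
}
```

Once the user resets the conflict log table (owner_subid becomes 0 in this
model), the same call succeeds.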

OTOH, we can simply let the CLT be dropped or altered and document
this behavior, so that it is the user's responsibility not to
drop/alter the CLT; otherwise conflict logging will be skipped, as it
is now.  Thinking about it more, I feel it might be better to keep it
simple, as we have now, instead of overcomplicating it?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-12T05:57:18Z

On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > 2)
> > > > When we do below:
> > > > alter subscription sub1 SET (conflict_log_table=clt2);
> > > >
> > > > the previous conflict log table is dropped. Is this behavior
> > > > intentional and discussed/concluded earlier? It’s possible that a user
> > > > may want to create a new conflict log table for future events while
> > > > still retaining the old one for analysis. If the subscription itself
> > > > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > > > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> > > > once we unlink CLT from subscription, later even DROP subscription
> > > > cannot drop it, but user can always drop it when not needed.
> > > >
> > > > If we plan to keep existing behavior, it should be clearly documented
> > > > in a CAUTION section, and the command should explicitly log the table
> > > > drop.
> > >
> > > Yeah we discussed this behavior and the conclusion was we would
> > > document this behavior and its user's responsibility to take necessary
> > > backup of the conflict log table data if they are setting a new log
> > > table or NONE for the subscription.
> > >
> >
> > +1. If we don't do this then it will be difficult to track for
> > postgres or users the previous conflict history tables.
>
> Right, it makes sense.
>
> Attached patch fixed most of the open comments
> 1) \dRs+ now show the schema qualified name
> 2) Now key_tuple and replica_identify tuple both are add in conflict
> log tuple wherever applicable
> 3) Refactored the code so that we can define the conflict log table
> schema only once in the header file and both create_conflict_log_table
> and ValidateConflictLogTable use it.
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>
> We do not need to make the CLT dependent on the subscription because
> the table can be dropped when the subscription is dropped anyway and
> we are already doing it as part of drop subscription as well as alter
> subscription when CLT is set to NONE or a different table. Therefore,
> extending the functionality of shared dependency is unnecessary for
> this purpose.

I noticed an inconsistency in the checks that prevent adding a
conflict log table to a publication.  At creation time, we explicitly
reject attempts to publish a conflict log table:
/* Can't be conflict log table */
if (IsConflictLogTable(RelationGetRelid(targetrel)))
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
             errmsg("cannot add relation \"%s.%s\" to publication",
                    get_namespace_name(RelationGetNamespace(targetrel)),
                    RelationGetRelationName(targetrel)),
             errdetail("This operation is not supported for conflict log tables.")));

However, the restriction can be bypassed through a sequence of table
renames like below:
-- Set up logical replication
CREATE PUBLICATION pub_all;
CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_all  WITH
(conflict_log_table = 'conflict');

-- Rename the conflict log table
ALTER TABLE conflict RENAME TO conflict1;

-- Now this succeeds:
CREATE PUBLICATION pub1 FOR TABLE conflict1;

-- Rename it back
ALTER TABLE conflict1 RENAME TO conflict;

\dRp+ pub1
  Publication pub1
  ...
  Tables:
      public.conflict

Thus, although we prohibit publishing the conflict log table directly,
a publication can still end up referencing it through renaming. This
is inconsistent with the invariant the code attempts to enforce.

Should we extend the checks to handle renames so that a conflict log
table can never end up in a publication?
Alternatively, should the creation-time restriction be relaxed if this
case is acceptable?
If the invariant should be enforced, should we also prevent renaming a
conflict-log table into a published table's name?
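
One way such a bypass could arise, which I have not verified against the
patch, is if the check identifies the conflict log table through a stored
name rather than its OID. The standalone toy below only illustrates that
difference; ToyTable, ToySubRef and the toy_* functions are hypothetical
and are not the patch's IsConflictLogTable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int ToyOid;

typedef struct ToyTable
{
    ToyOid  relid;
    char    relname[64];
} ToyTable;

typedef struct ToySubRef
{
    char    clt_name[64];   /* table name recorded when the CLT was set */
    ToyOid  clt_relid;      /* table OID recorded at the same time */
} ToySubRef;

/* Name-keyed check: compares the recorded name with the current name. */
bool
toy_is_clt_by_name(const ToySubRef *sub, const ToyTable *tbl)
{
    return strcmp(sub->clt_name, tbl->relname) == 0;
}

/* OID-keyed check: unaffected by renames. */
bool
toy_is_clt_by_oid(const ToySubRef *sub, const ToyTable *tbl)
{
    return sub->clt_relid == tbl->relid;
}

/* Rename changes only the name; the OID stays the same. */
void
toy_rename_table(ToyTable *tbl, const char *newname)
{
    assert(strlen(newname) < sizeof(tbl->relname));
    snprintf(tbl->relname, sizeof(tbl->relname), "%s", newname);
}
```

In this model, after renaming "conflict" to "conflict1" the name-keyed
check no longer recognizes the table, while the OID-keyed check still
does.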

Thoughts?

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-12T09:33:47Z

On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
>

What we need here is an equivalent of DEPENDENCY_INTERNAL for database
objects. For example, consider the following case:
postgres=# create table t1(c1 int primary key);
CREATE TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (c1)
Publications:
    "pub1"
Not-null constraints:
    "t1_c1_not_null" NOT NULL "c1"
Access method: heap
postgres=# drop index t1_pkey;
ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table t1 requires it
HINT:  You can drop constraint t1_pkey on table t1 instead.

Here, the PK index is created as part of the CREATE TABLE operation, and
it is not allowed to be dropped independently.

> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>

As per my understanding from the above example, we need something like
that, but only between the shared object (subscription) and the
(internally created) table.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-12T10:03:29Z

On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I was considering the interdependence between the subscription and the
> > conflict log table (CLT). IMHO, it would be logical to establish the
> > subscription as dependent on the CLT. This way, if someone attempts to
> > drop the CLT, the system would recognize the dependency of the
> > subscription and prevent the drop unless the subscription is removed
> > first or the CASCADE option is used.
> >
> > However, while investigating this, I encountered an error [1] stating
> > that global objects are not supported in this context. This indicates
> > that global objects cannot be made dependent on local objects.
> >
>
> What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> objects. For example, consider following case:
> postgres=# create table t1(c1 int primary key);
> CREATE TABLE
> postgres=# \d+ t1
>                                            Table "public.t1"
>  Column |  Type   | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
>  c1     | integer |           | not null |         | plain   |
>     |              |
> Indexes:
>     "t1_pkey" PRIMARY KEY, btree (c1)
> Publications:
>     "pub1"
> Not-null constraints:
>     "t1_c1_not_null" NOT NULL "c1"
> Access method: heap
> postgres=# drop index t1_pkey;
> ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> t1 requires it
> HINT:  You can drop constraint t1_pkey on table t1 instead.
>
> Here, the PK index is created as part for CREATE TABLE operation and
> pk_index is not allowed to be dropped independently.
>
> > Although making an object dependent on global/shared objects is
> > possible for certain types of shared objects [2], this is not our main
> > objective.
> >
>
> As per my understanding from the above example, we need something like
> that only for shared object subscription and (internally created)
> table.
>

+1

~~

Few comments for v11:

1)
+#include "executor/spi.h"
+#include "replication/conflict.h"
+#include "utils/fmgroids.h"
+#include "utils/regproc.h"

subscriptioncmds.c compiles without the above inclusions.

2)
postgres=# create subscription sub3 connection '...' publication pub1
WITH(conflict_log_table='pg_temp.clt');
NOTICE:  created replication slot "sub3" on publisher
CREATE SUBSCRIPTION

Should we restrict clt creation in pg_temp?

3)
+ /* Fetch the eixsting conflict table table information. */

typos: eixsting->existing,
          table table -> table

4)
AlterSubscription():
+ values[Anum_pg_subscription_subconflictlognspid - 1] =
+ ObjectIdGetDatum(nspid);
+
+ if (relname != NULL)
+ values[Anum_pg_subscription_subconflictlogtable - 1] =
+ CStringGetTextDatum(relname);
+ else
+ nulls[Anum_pg_subscription_subconflictlogtable - 1] =
+ true;

Should we move the nspid setting inside 'if(relname != NULL)' block?

5)
Is there a way to reset/remove conflict_log_table? I did not see any
such handling in AlterSubscription? It gives error:

postgres=# alter subscription sub3 set (conflict_log_table='');
ERROR:  invalid name syntax

6)
+char *
+get_subscription_conflict_log_table(Oid subid, Oid *nspid)
+{
+ HeapTuple tup;
+ Datum datum;
+ bool isnull;
+ char    *relname = NULL;
+ Form_pg_subscription subform;
+
+ *nspid = InvalidOid;
+
+ tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+
+ if (!HeapTupleIsValid(tup))
+ return NULL;

Should we have elog(ERROR) here for cache lookup failure? Callers like
AlterSubscription, DropSubscription lock the sub entry, so it being
missing at this stage is not normal. I have not seen all the callers
though.

7)
+#include "access/htup.h"
+#include "access/skey.h"

+#include "access/table.h"
+#include "catalog/pg_attribute.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_type.h"

+#include "executor/spi.h"
+#include "utils/array.h"

conflict.c compiles without above inclusions.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T10:21:40Z

On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I was considering the interdependence between the subscription and the
> > conflict log table (CLT). IMHO, it would be logical to establish the
> > subscription as dependent on the CLT. This way, if someone attempts to
> > drop the CLT, the system would recognize the dependency of the
> > subscription and prevent the drop unless the subscription is removed
> > first or the CASCADE option is used.
> >
> > However, while investigating this, I encountered an error [1] stating
> > that global objects are not supported in this context. This indicates
> > that global objects cannot be made dependent on local objects.
> >
>
> What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> objects. For example, consider following case:
> postgres=# create table t1(c1 int primary key);
> CREATE TABLE
> postgres=# \d+ t1
>                                            Table "public.t1"
>  Column |  Type   | Collation | Nullable | Default | Storage |
> Compression | Stats target | Description
> --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
>  c1     | integer |           | not null |         | plain   |
>     |              |
> Indexes:
>     "t1_pkey" PRIMARY KEY, btree (c1)
> Publications:
>     "pub1"
> Not-null constraints:
>     "t1_c1_not_null" NOT NULL "c1"
> Access method: heap
> postgres=# drop index t1_pkey;
> ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> t1 requires it
> HINT:  You can drop constraint t1_pkey on table t1 instead.
>
> Here, the PK index is created as part for CREATE TABLE operation and
> pk_index is not allowed to be dropped independently.
>
> > Although making an object dependent on global/shared objects is
> > possible for certain types of shared objects [2], this is not our main
> > objective.
> >
>
> As per my understanding from the above example, we need something like
> that only for shared object subscription and (internally created)
> table.

Yeah, that seems to be exactly what we want, so I tried doing that by
recording a DEPENDENCY_INTERNAL dependency of the CLT on the
subscription [1], and it behaves as we want [2].  And while dropping the
subscription or altering its CLT, we can delete the internal dependency
so that the CLT gets dropped automatically [3].

I will send an updated patch after testing a few more scenarios and
fixing other pending issues.

[1]
+       ObjectAddressSet(myself, RelationRelationId, relid);
+       ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
+       recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);


[2]
postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
ERROR:  2BP01: cannot drop table myschema.conflict_log_history2
because subscription sub requires it
HINT:  You can drop subscription sub instead.
LOCATION:  findDependentObjects, dependency.c:788
postgres[670778]=#

[3]
ObjectAddressSet(object, SubscriptionRelationId, subid);
performDeletion(&object, DROP_CASCADE,
                           PERFORM_DELETION_INTERNAL |
                           PERFORM_DELETION_SKIP_ORIGINAL);
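
For anyone who wants to reason about the intended semantics outside the
backend, here is a small standalone toy model. It is a sketch only: the
ToyCatalog type and toy_* functions are hypothetical, not the real
pg_depend code. A direct drop of an internally owned object is refused
and names the owner, while dropping the owner's dependents removes it but
keeps the owner, similar in spirit to PERFORM_DELETION_SKIP_ORIGINAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OBJS 8
#define TOY_MAX_DEPS 8

typedef enum ToyDepType { TOY_DEP_NORMAL, TOY_DEP_INTERNAL } ToyDepType;

typedef struct ToyDepend
{
    int         dependent;      /* object that depends on ... */
    int         referenced;     /* ... this object */
    ToyDepType  type;
} ToyDepend;

typedef struct ToyCatalog
{
    bool        exists[TOY_MAX_OBJS];
    ToyDepend   deps[TOY_MAX_DEPS];
    size_t      ndeps;
} ToyCatalog;

void
toy_record_dependency(ToyCatalog *cat, int dependent, int referenced,
                      ToyDepType type)
{
    assert(cat->ndeps < TOY_MAX_DEPS);
    cat->deps[cat->ndeps++] = (ToyDepend){dependent, referenced, type};
}

/* Direct drop: refused if a live owner requires objid; owner in *owner. */
bool
toy_drop_direct(ToyCatalog *cat, int objid, int *owner)
{
    assert(objid >= 0 && objid < TOY_MAX_OBJS);
    for (size_t i = 0; i < cat->ndeps; i++)
    {
        const ToyDepend *d = &cat->deps[i];

        if (d->dependent == objid && d->type == TOY_DEP_INTERNAL &&
            cat->exists[d->referenced])
        {
            *owner = d->referenced;
            return false;
        }
    }
    cat->exists[objid] = false;
    return true;
}

/* Drop objects internally owned by ownerid, keeping the owner itself. */
int
toy_drop_owned(ToyCatalog *cat, int ownerid)
{
    int ndropped = 0;

    for (size_t i = 0; i < cat->ndeps; i++)
    {
        const ToyDepend *d = &cat->deps[i];

        if (d->referenced == ownerid && d->type == TOY_DEP_INTERNAL &&
            cat->exists[d->dependent])
        {
            cat->exists[d->dependent] = false;
            ndropped++;
        }
    }
    return ndropped;
}
```

Here object 1 would play the role of the subscription and object 2 the
CLT: toy_drop_direct() on 2 fails and reports 1 as the owner, and
toy_drop_owned() on 1 removes 2 while leaving 1 in place.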



--
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T15:46:30Z

On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > I was considering the interdependence between the subscription and the
> > > conflict log table (CLT). IMHO, it would be logical to establish the
> > > subscription as dependent on the CLT. This way, if someone attempts to
> > > drop the CLT, the system would recognize the dependency of the
> > > subscription and prevent the drop unless the subscription is removed
> > > first or the CASCADE option is used.
> > >
> > > However, while investigating this, I encountered an error [1] stating
> > > that global objects are not supported in this context. This indicates
> > > that global objects cannot be made dependent on local objects.
> > >
> >
> > What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> > objects. For example, consider following case:
> > postgres=# create table t1(c1 int primary key);
> > CREATE TABLE
> > postgres=# \d+ t1
> >                                            Table "public.t1"
> >  Column |  Type   | Collation | Nullable | Default | Storage |
> > Compression | Stats target | Description
> > --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
> >  c1     | integer |           | not null |         | plain   |
> >     |              |
> > Indexes:
> >     "t1_pkey" PRIMARY KEY, btree (c1)
> > Publications:
> >     "pub1"
> > Not-null constraints:
> >     "t1_c1_not_null" NOT NULL "c1"
> > Access method: heap
> > postgres=# drop index t1_pkey;
> > ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> > t1 requires it
> > HINT:  You can drop constraint t1_pkey on table t1 instead.
> >
> > Here, the PK index is created as part for CREATE TABLE operation and
> > pk_index is not allowed to be dropped independently.
> >
> > > Although making an object dependent on global/shared objects is
> > > possible for certain types of shared objects [2], this is not our main
> > > objective.
> > >
> >
> > As per my understanding from the above example, we need something like
> > that only for shared object subscription and (internally created)
> > table.
>
> Yeah that seems to be exactly what we want, so I tried doing that by
> recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and
> it is behaving as we want[2].  And while dropping the subscription or
> altering CLT we can delete internal dependency so that CLT get dropped
> automatically[3]
>
> I will send an updated patch after testing a few more scenarios and
> fixing other pending issues.
>
> [1]
> +       ObjectAddressSet(myself, RelationRelationId, relid);
> +       ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
> +       recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);
>
>
> [2]
> postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
> ERROR:  2BP01: cannot drop table myschema.conflict_log_history2
> because subscription sub requires it
> HINT:  You can drop subscription sub instead.
> LOCATION:  findDependentObjects, dependency.c:788
> postgres[670778]=#
>
> [3]
> ObjectAddressSet(object, SubscriptionRelationId, subid);
> performDeletion(&object, DROP_CASCADE
>                            PERFORM_DELETION_INTERNAL |
>                            PERFORM_DELETION_SKIP_ORIGINAL);
>
>

Here is the patch which implements the dependency and fixes other
comments from Shveta.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T15:50:24Z

On Fri, Dec 12, 2025 at 3:33 PM shveta malik <shveta.malik@gmail.com> wrote:
>
>
> Few comments for v11:
>
> 1)
> +#include "executor/spi.h"
> +#include "replication/conflict.h"
> +#include "utils/fmgroids.h"
> +#include "utils/regproc.h"
>
> subscriptioncmds.c compiles without the above inclusions.

I think we need utils/regproc.h for "stringToQualifiedNameList()"

> 2)
> postgres=# create subscription sub3 connection '...' publication pub1
> WITH(conflict_log_table='pg_temp.clt');
> NOTICE:  created replication slot "sub3" on publisher
> CREATE SUBSCRIPTION
>
> Should we restrict clt creation in pg_temp?

Done and added a test.

> 3)
> + /* Fetch the eixsting conflict table table information. */
>
> typos: eixsting->existing,
>           table table -> table

Fixed

> 4)
> AlterSubscription():
> + values[Anum_pg_subscription_subconflictlognspid - 1] =
> + ObjectIdGetDatum(nspid);
> +
> + if (relname != NULL)
> + values[Anum_pg_subscription_subconflictlogtable - 1] =
> + CStringGetTextDatum(relname);
> + else
> + nulls[Anum_pg_subscription_subconflictlogtable - 1] =
> + true;
>
> Should we move the nspid setting inside 'if(relname != NULL)' block?

Since subconflictlognspid is part of the fixed-size structure, we
always have to set it, so I prefer to keep it outside the block.

> 5)
> Is there a way to reset/remove conflict_log_table? I did not see any
> such handling in AlterSubscription? It gives error:
>
> postgres=# alter subscription sub3 set (conflict_log_table='');
> ERROR:  invalid name syntax

Fixed and added a test case

> 6)
> +char *
> +get_subscription_conflict_log_table(Oid subid, Oid *nspid)
> +{
> + HeapTuple tup;
> + Datum datum;
> + bool isnull;
> + char    *relname = NULL;
> + Form_pg_subscription subform;
> +
> + *nspid = InvalidOid;
> +
> + tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
> +
> + if (!HeapTupleIsValid(tup))
> + return NULL;
>
> Should we have elog(ERROR) here for cache lookup failure? Callers like
> AlterSubscription, DropSubscription lock the sub entry, so it being
> missing at this stage is not normal. I have not seen all the callers
> though.

Yeah we can do that.

> 7)
> +#include "access/htup.h"
> +#include "access/skey.h"
>
> +#include "access/table.h"
> +#include "catalog/pg_attribute.h"
> +#include "catalog/indexing.h"
> +#include "catalog/namespace.h"
> +#include "catalog/pg_namespace.h"
> +#include "catalog/pg_type.h"
>
> +#include "executor/spi.h"
> +#include "utils/array.h"
>
> conflict.c compiles without above inclusions.

Done


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-15T08:45:58Z

On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

Thanks for the patch. Few comments:

1)
+ if (isTempNamespace(namespaceId))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot create conflict log table \"%s\" in a temporary namespace",
+ conflictrel),
+ errhint("Use a permanent schema.")));

a)
Shall we use 'temporary schema' instead of 'temporary namespace'? See
other similar errors:

errmsg("cannot move objects into or out of temporary schemas")
errmsg("cannot create relations in temporary schemas of other
sessions"))
errmsg("cannot create temporary relation in non-temporary schema")

b)
Do we really need errhint here? It seems self-explanatory. If we
really want to specify HINT, shall we say:
"Specify a non-temporary schema for conflict log table."

2)
postgres=# alter subscription sub1 set (conflict_log_table='');
ERROR:  conflict log table name cannot be empty
HINT:  Provide a valid table name or omit the parameter.

My idea was to allow the above operation to enable users to reset the
conflict_log_table when the conflict log history is no longer needed.
Is there any other way to reset it, or is this intentionally not
supported?

3)
postgres=# alter subscription sub1 set (conflict_log_table=NULL);
ALTER SUBSCRIPTION
postgres=# alter subscription sub2 set (conflict_log_table=create);
ALTER SUBSCRIPTION
postgres=# \d
         List of relations
 Schema |  Name   | Type  | Owner
--------+---------+-------+--------
 public | create  | table | shveta
 public | null    | table | shveta


It takes reserved keywords and creates tables with those names. It
should be restricted.

4)
postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid
= d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname
= 'sub1';
 relname
---------
 clt

postgres=#  select count(*) from pg_shdepend  where refobjid = (select
oid from pg_subscription where subname='sub1');
 count
-------
     0

Since the dependency between the subscription and the CLT involves a
shared object, shouldn't the entry be in pg_shdepend? Or do we allow
such entries in pg_depend as well?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T09:25:18Z

On Mon, Dec 15, 2025 at 2:16 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> Thanks for the patch. Few comments:

>
> 2)
> postgres=# alter subscription sub1 set (conflict_log_table='');
> ERROR:  conflict log table name cannot be empty
> HINT:  Provide a valid table name or omit the parameter.
>
> My idea was to allow the above operation to enable users to reset the
> conflict_log_table when the conflict log history is no longer needed.
> Is there any other way to reset it, or is this intentionally not
> supported?

ALTER SUBSCRIPTION ... SET (conflict_log_table = NONE); this is the
same way other subscription parameters are reset.

> 3)
> postgres=# alter subscription sub1 set (conflict_log_table=NULL);
> ALTER SUBSCRIPTION
> postgres=# alter subscription sub2 set (conflict_log_table=create);
> ALTER SUBSCRIPTION
> postgres=# \d
>          List of relations
>  Schema |  Name   | Type  | Owner
> --------+---------+-------+--------
>  public | create  | table | shveta
>  public | null    | table | shveta
>
>
> It takes reserved keywords and creates tables with those names. It
> should be restricted.

I had assumed that table creation would be rejected for these names,
but since we switched from SPI to the internal interface, that is no
longer true. I need to see how we can handle this.

> 4)
> postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid
> = d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname
> = 'sub1';
>  relname
> ---------
>  clt
>
> postgres=#  select count(*) from pg_shdepend  where refobjid = (select
> oid from pg_subscription where subname='sub1');
>  count
> -------
>      0
>
> Since dependency between sub and clt is a dependency involving
> shared-object, shouldn't the entry be in pg_shdepend? Or do we allow
> such entries in pg_depend as well?

The primary reason for recording in pg_depend is that the
RemoveRelations() function already includes logic to check for and
report internal dependencies within pg_depend. Consequently, if we
were to record the dependency in pg_shdepend, we would likely need to
modify RemoveRelations() to incorporate handling for pg_shdepend
dependencies.

However, some might argue that when an object ID (objid) is local and
the referenced object ID (refobjid) is shared, such as when a table is
created under a ROLE, establishing a dependency with the owner, the
dependency is currently recorded in pg_shdepend. In this scenario, the
dependent object (the local table) can be dropped independently, while
the referenced object (the shared owner) cannot. However, when aiming
to record an internal dependency, the dependent object should not be
droppable without first dropping the referenced object. Therefore, I
believe the dependency record should be placed in pg_depend, as the
depender is a local object and will check for dependencies there.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-15T09:48:23Z

On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Here is the patch which implements the dependency and fixes other
> comments from Shveta.
>

+/*
+ * Check if the specified relation is used as a conflict log table by any
+ * subscription.
+ */
+bool
+IsConflictLogTable(Oid relid)
+{
+ Relation rel;
+ TableScanDesc scan;
+ HeapTuple tup;
+ bool is_clt = false;
+
+ rel = table_open(SubscriptionRelationId, AccessShareLock);
+ scan = table_beginscan_catalog(rel, 0, NULL);
+
+ while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))

This function is used in multiple places in the patch. None of them
is a performance-critical path, but the impact could still be
noticeable with a large number of subscriptions. Also, I am not sure
it is a good design to scan the entire system table to find whether
some other relation is publishable or not. I see the following kind of
usage for it:

+ /* Subscription conflict log tables are not published */
+ result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
+ !IsConflictLogTable(relid);

In this regard, I see a comment atop is_publishable_class which
suggests as follows:

The best
 * long-term solution may be to add a "relispublishable" bool to pg_class,
 * and depend on that instead of OID checks.
 */
static bool
is_publishable_class(Oid relid, Form_pg_class reltuple)

I feel that is a good idea for reasons mentioned atop
is_publishable_class and for the conflict table. What do you think?
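
To make the trade-off concrete, here is a toy comparison of the two
approaches.  This is a sketch under assumptions, not the patch:
ToySubscription, ToyRelation and both functions are hypothetical, and
the "relispublishable" field only mirrors the name suggested in the
comment quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for pg_subscription and pg_class rows. */
typedef struct ToySubscription
{
    unsigned int conflict_log_relid;   /* 0 when no conflict log table is set */
} ToySubscription;

typedef struct ToyRelation
{
    unsigned int relid;
    bool         relispublishable;     /* set once, when the relation is created */
} ToyRelation;

/* Cost grows with the number of subscriptions, like a scan of the catalog. */
static bool
toy_is_conflict_log_table_scan(const ToySubscription *subs, size_t nsubs,
                               unsigned int relid)
{
    for (size_t i = 0; i < nsubs; i++)
        if (subs[i].conflict_log_relid == relid)
            return true;
    return false;
}

/* Constant cost: the answer is read from the relation's own row. */
static bool
toy_is_publishable(const ToyRelation *rel)
{
    assert(rel != NULL);
    return rel->relispublishable;
}
```

With a per-relation flag, the publishability check no longer needs to
look at subscriptions at all; the flag would simply be cleared when the
conflict log table is created.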

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T10:31:47Z

On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Here is the patch which implements the dependency and fixes other
> > comments from Shveta.
> >
>
> +/*
> + * Check if the specified relation is used as a conflict log table by any
> + * subscription.
> + */
> +bool
> +IsConflictLogTable(Oid relid)
> +{
> + Relation rel;
> + TableScanDesc scan;
> + HeapTuple tup;
> + bool is_clt = false;
> +
> + rel = table_open(SubscriptionRelationId, AccessShareLock);
> + scan = table_beginscan_catalog(rel, 0, NULL);
> +
> + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
>
> This function has been used at multiple places in the patch, though
> not in any performance-critical paths, but still, it seems like the
> impact can be noticeable for a large number of subscriptions. Also, I
> am not sure it is a good design to scan the entire system table to
> find whether some other relation is publishable or not. I see below
> kinds of usages for it:
>
> + /* Subscription conflict log tables are not published */
> + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
> + !IsConflictLogTable(relid);
>
> In this regard, I see a comment atop is_publishable_class which
> suggests as follows:
>
> The best
>  * long-term solution may be to add a "relispublishable" bool to pg_class,
>  * and depend on that instead of OID checks.
>  */
> static bool
> is_publishable_class(Oid relid, Form_pg_class reltuple)
>
> I feel that is a good idea for reasons mentioned atop
> is_publishable_class and for the conflict table. What do you think?

On a quick thought, this seems like a good idea and may simplify a
couple of places.  It might also be good for future extension, as we
could mark publishability per relation instead of targeting broad
categories like IsCatalogRelationOid() or checking individual
relations by OID.  IMHO this can be done as an individual patch in a
separate thread, or as a base patch.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-15T11:15:53Z

On Mon, Dec 15, 2025 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > Here is the patch which implements the dependency and fixes other
> > > comments from Shveta.
> > >
> >
> > +/*
> > + * Check if the specified relation is used as a conflict log table by any
> > + * subscription.
> > + */
> > +bool
> > +IsConflictLogTable(Oid relid)
> > +{
> > + Relation rel;
> > + TableScanDesc scan;
> > + HeapTuple tup;
> > + bool is_clt = false;
> > +
> > + rel = table_open(SubscriptionRelationId, AccessShareLock);
> > + scan = table_beginscan_catalog(rel, 0, NULL);
> > +
> > + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
> >
> > This function has been used at multiple places in the patch, though
> > not in any performance-critical paths, but still, it seems like the
> > impact can be noticeable for a large number of subscriptions. Also, I
> > am not sure it is a good design to scan the entire system table to
> > find whether some other relation is publishable or not. I see below
> > kinds of usages for it:
> >
> > + /* Subscription conflict log tables are not published */
> > + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
> > + !IsConflictLogTable(relid);
> >
> > In this regard, I see a comment atop is_publishable_class which
> > suggests as follows:
> >
> > The best
> >  * long-term solution may be to add a "relispublishable" bool to pg_class,
> >  * and depend on that instead of OID checks.
> >  */
> > static bool
> > is_publishable_class(Oid relid, Form_pg_class reltuple)
> >
> > I feel that is a good idea for reasons mentioned atop
> > is_publishable_class and for the conflict table. What do you think?
>
> On quick thought, this seems like a good idea and may simplify a
> couple of places.  And might be good for future extension as we can
> mark publishable at individual relation instead of targeting broad
> categories like IsCatalogRelationOid() or checking individual items by
> its Oid.  IMHO this can be done as an individual patch in a separate
> thread, or as a base patch.
>

I prefer to do it in a separate thread, so that it can get some more
attention. But it should be done before the main conflict patch. I
think we can subdivide the main patch into (a) DDL handling,
everything except inserting data into conflict table, (b) inserting
data into conflict table, (c) upgrade handling. That way it will be
easier to review.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T11:41:06Z

On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > 3)
> > postgres=# alter subscription sub1 set (conflict_log_table=NULL);
> > ALTER SUBSCRIPTION
> > postgres=# alter subscription sub2 set (conflict_log_table=create);
> > ALTER SUBSCRIPTION
> > postgres=# \d
> >          List of relations
> >  Schema |  Name   | Type  | Owner
> > --------+---------+-------+--------
> >  public | create  | table | shveta
> >  public | null    | table | shveta
> >
> >
> > It takes reserved keywords and creates tables with those names. It
> > should be restricted.
>
> I somehow assume table creation will be restricted with these names,
> but since we switch from SPI to internal interface its not true
> anymore, need to see how we can handle this.

While thinking more about this, I looked at the other places where we
use 'heap_create_with_catalog()' and noticed that they always use an
internally generated name. So wouldn't it be nice to make the conflict
log table option a bool and use an internally generated name,
something like conflict_log_table_$subid$, always created in the
current active search path?  Thoughts?
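
A minimal sketch of what such name generation could look like, written
as standalone C rather than backend code.  The helper names and
TOY_NAMEDATALEN are hypothetical (64 is PostgreSQL's default
NAMEDATALEN); only the conflict_log_table_$subid$ pattern comes from
the idea above.  Handling of a clash with an existing user table of the
same name is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_NAMEDATALEN 64                  /* assumed identifier buffer size */
#define TOY_CLT_PREFIX  "conflict_log_table_"

/*
 * Build "conflict_log_table_<subid>" into buf.
 * Returns false if the name would not fit in buflen bytes.
 */
static bool
toy_conflict_log_table_name(unsigned int subid, char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, TOY_CLT_PREFIX "%u", subid);
    return n >= 0 && (size_t) n < buflen;
}

/* True if name starts with the generated-name prefix. */
static bool
toy_is_generated_clt_name(const char *name)
{
    assert(name != NULL);
    return strncmp(name, TOY_CLT_PREFIX, strlen(TOY_CLT_PREFIX)) == 0;
}
```

Because the name is derived from the subscription OID, user input such
as "create" or "null" never reaches table creation in this scheme.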

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-16T01:10:03Z

Some review comments for v12-0001.

======
General

1.
There is no documentation. Even if it seems a bit premature, IMO
writing/reviewing the documentation could help identify unanticipated
usability issues.

======
src/backend/commands/subscriptioncmds.c

2.
+
+ /* Setting conflict_log_table = NONE is treated as no table. */
+ if (strcmp(opts->conflictlogtable, "none") == 0)
+ opts->conflictlogtable = NULL;
+ }

2a.
This was unexpected when I came across this code. This feature needs
to be described in the commit message.

~

2b.
Case sensitive?
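
To illustrate the question: strcmp() against "none" only matches the
exact lower-case spelling, and whether the option value has already
been case-folded before it reaches this code is not visible from the
hunk.  The standalone sketch below shows a case-insensitive check;
toy_is_none() is a hypothetical helper, not something from the patch
(in the backend, pg_strcasecmp() would be the usual tool).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if value spells "none" in any mix of upper and lower case. */
static bool
toy_is_none(const char *value)
{
    static const char none[] = "none";
    size_t i;

    assert(value != NULL);
    for (i = 0; none[i] != '\0'; i++)
    {
        /* A shorter value hits its terminator here and fails the comparison. */
        if (tolower((unsigned char) value[i]) != none[i])
            return false;
    }
    /* Reject longer strings such as "nonesuch". */
    return value[i] == '\0';
}
```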

~~~

CreateSubscription:

3.
+ List   *names;
+
+ /* Explicitly check for empty string before any processing. */
+ if (opts.conflictlogtable[0] == '\0')
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("conflict log table name cannot be empty"),
+ errhint("Provide a valid table name or omit the parameter.")));
+
+ names = stringToQualifiedNameList(opts.conflictlogtable, NULL);

Should '' just be the equivalent of NONE instead of another error condition?

~~~

AlterSubscription:

4.
+ Oid     old_nspid = InvalidOid;
+ char   *old_relname = NULL;
+ char   *relname = NULL;
+ List   *names = NIL;

Var 'names' can be declared at a lower scope -- e.g. in the 'if' block.

~~~

DropSubscription:

5.
+ /*
+ * Conflict log tables are recorded as internal dependencies of the
+ * subscription.  We must drop the dependent objects before the
+ * subscription itself is removed.  By using
+ * PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the conflict log
+ * table is reaped while the  subscription remains for the final deletion
+ * step.
+ */

Double spaces? /the  subscription/the subscription/

~~~

create_conflict_log_table_tupdesc:

6.
+static TupleDesc
+create_conflict_log_table_tupdesc(void)
+{
+ TupleDesc tupdesc;
+ int i;
+
+ tupdesc = CreateTemplateTupleDesc(MAX_CONFLICT_ATTR_NUM);
+
+ for (i = 0; i < MAX_CONFLICT_ATTR_NUM; i++)

Declare 'i' as a for-loop var.

~~~

create_conflict_log_table:

7.
+/*
+ * Create conflict log table.
+ *
+ * The subscription owner becomes the owner of this table and has all
+ * privileges on it.
+ */
+static void
+create_conflict_log_table(Oid namespaceId, char *conflictrel, Oid subid)
+{

I felt that the 'subid' should be the first parameter, not the last.

~~~

8.
namespace > relation, so I felt it is more natural to check for the
temp namespace *before* checking for clashing table names.

======
src/backend/replication/logical/conflict.c

9.
+ if (ValidateConflictLogTable(conflictlogrel))
+ {
+ /*
+ * Prepare the conflict log tuple. If the error level is below
+ * ERROR, insert it immediately. Otherwise, defer the insertion to
+ * a new transaction after the current one aborts, ensuring the
+ * insertion of the log tuple is not rolled back.
+ */
+ prepare_conflict_log_tuple(estate,
+    relinfo->ri_RelationDesc,
+    conflictlogrel,
+    type,
+    searchslot,
+    conflicttuples,
+    remoteslot);
+ if (elevel < ERROR)
+ InsertConflictLogTuple(conflictlogrel);
+ }
+ else
+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("Conflict log table \"%s.%s\" structure changed, skipping insertion",
+ get_namespace_name(RelationGetNamespace(conflictlogrel)),
+ RelationGetRelationName(conflictlogrel)));

9a.
AFAICT in the only few places this function is called it emits exactly
the same warning, so it seems unnecessary duplication. Would it be
better to have that WARNING code inside the ValidateConflictLogTable
(eg always give the warning when returning false). But see also 9b.

~

9b.
I have some doubts about this validation function. It seems
inefficient to be validating the same CLT structure over and over
every time there is a new conflict. Not only is that going to be
slower, but the logfile is going to fill up with warnings. Maybe this
"validation" phase should be a one-time check only during the
CREATE/ALTER SUBSCRIPTION.

Maybe if validation fails it could give some NOTICE that the CLT
logging is broken and then reset the CLT to NONE?

~~~

ValidateConflictLogTable:

10.
+/*
+ * ValidateConflictLogTable - Validate conflict log table
+ *
+ * Validate whether the conflict log table is still suitable for considering as
+ * conflict log table.
+ */
+bool
+ValidateConflictLogTable(Relation rel)

This function comment seems unhelpful. 3 times it mentions equivalent
of "validate conflict log table" but nowhere does it say what that
even means.

 Maybe the later comment (below):

+ /*
+ * Check whether the table definition including its column names, data
+ * types, and column ordering meets the requirements for conflict log
+ * table.
+ */

Should be moved into the function comment part.

~~~

11.
+ Relation    pg_attribute;
+ HeapTuple   atup;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ Form_pg_attribute attForm;
+ int         attcnt = 0;
+ bool        tbl_ok = true;

'attForm' can be declared within the while loop.

~~~

12.
+ if (attcnt != MAX_CONFLICT_ATTR_NUM || !tbl_ok)
+ return false;

As per previous review comment, this could emit the WARNING log right
here. But see also #9b.

~~~

build_local_conflicts_json_array:

13.
+ Datum values[MAX_LOCAL_CONFLICT_INFO_ATTRS];
+ bool nulls[MAX_LOCAL_CONFLICT_INFO_ATTRS];
+ char    *origin_name = NULL;
+ HeapTuple tuple;
+ Datum json_datum;
+ int attno;
+
+ memset(values, 0, sizeof(Datum) * MAX_LOCAL_CONFLICT_INFO_ATTRS);
+ memset(nulls, 0, sizeof(bool) * MAX_LOCAL_CONFLICT_INFO_ATTRS);

You could also just use an initializer here and avoid the memsets.

e.g. = {0}
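
A small standalone illustration of the equivalence, assuming nothing
from the patch: ToyDatum, TOY_NATTS and the function names are
hypothetical.  Both functions end up with fully zeroed arrays; the
second one simply lets the compiler do it at the point of declaration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NATTS 4
typedef uintptr_t ToyDatum;     /* hypothetical stand-in for Datum */

/* True if every value is zero and every null flag is false. */
static bool
toy_is_cleared(const ToyDatum *values, const bool *nulls, size_t n)
{
    assert(values != NULL && nulls != NULL);
    for (size_t i = 0; i < n; i++)
        if (values[i] != 0 || nulls[i])
            return false;
    return true;
}

/* Declaration followed by explicit memset calls. */
static bool
toy_clear_with_memset(void)
{
    ToyDatum values[TOY_NATTS];
    bool     nulls[TOY_NATTS];

    memset(values, 0, sizeof(values));
    memset(nulls, 0, sizeof(nulls));
    return toy_is_cleared(values, nulls, TOY_NATTS);
}

/* "= {0}" zero-initializes every element, so no memset is needed. */
static bool
toy_clear_with_initializer(void)
{
    ToyDatum values[TOY_NATTS] = {0};
    bool     nulls[TOY_NATTS] = {0};

    return toy_is_cleared(values, nulls, TOY_NATTS);
}
```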

~~~

14.
+ memset(values, 0, sizeof(Datum) * MAX_LOCAL_CONFLICT_INFO_ATTRS);
+ memset(nulls, 0, sizeof(bool) * MAX_LOCAL_CONFLICT_INFO_ATTRS);

Another place where you could've avoided memset and just done = {0};

~~~

15.
+ json_datum_array = (Datum *) palloc(num_conflicts * sizeof(Datum));
+ json_null_array = (bool *) palloc0(num_conflicts * sizeof(bool));

- index_value = BuildIndexValueDescription(indexDesc, values, isnull);
+ i = 0;
+ foreach(lc, json_datums)
+ {
+ json_datum_array[i] = (Datum) lfirst(lc);
+ i++;
+ }

Should these be using new palloc_array instead of palloc?

======
src/include/replication/conflict.h

16.
+typedef struct ConflictLogColumnDef
+{
+ const char *attname;    /* Column name */
+ Oid         atttypid;   /* Data type OID */
+} ConflictLogColumnDef;

Add this to typedefs.list

~~~

17.
+/* The single source of truth for the conflict log table schema */
+static const ConflictLogColumnDef ConflictLogSchema[] =
+{
+ { .attname = "relid",            .atttypid = OIDOID },
+ { .attname = "schemaname",       .atttypid = TEXTOID },
+ { .attname = "relname",          .atttypid = TEXTOID },
+ { .attname = "conflict_type",    .atttypid = TEXTOID },
+ { .attname = "remote_xid",       .atttypid = XIDOID },
+ { .attname = "remote_commit_lsn",.atttypid = LSNOID },
+ { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "remote_origin",    .atttypid = TEXTOID },
+ { .attname = "replica_identity", .atttypid = JSONOID },
+ { .attname = "remote_tuple",     .atttypid = JSONOID },
+ { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
+};

I like this, but I felt it would be better if all the definitions for
"local_conflicts" were defined here too. Then everything is in one
place.
e.g. MAX_LOCAL_CONFLICT_INFO_ATTRS and most of the content of
build_conflict_tupledesc().

~~~

18.
+/* Define the count using the array size */
+#define MAX_CONFLICT_ATTR_NUM (sizeof(ConflictLogSchema) / sizeof(ConflictLogSchema[0]))

This comment just says the same thing as the code, so it doesn't seem useful.
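
For reference, the pattern under discussion (one schema array, with the
column count derived from it) can be sketched in standalone C as
follows.  ToyColumnDef, toy_schema, TOY_LENGTHOF and toy_find_column()
are hypothetical names, and the numeric type OIDs are used only as
sample values; in the backend, the existing lengthof() macro could
serve the same purpose as TOY_LENGTHOF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyColumnDef
{
    const char  *attname;   /* column name */
    unsigned int atttypid;  /* data type OID (sample values only) */
} ToyColumnDef;

/* Single schema array; adding a row here is the only change needed. */
static const ToyColumnDef toy_schema[] =
{
    { .attname = "relid",         .atttypid = 26 },
    { .attname = "schemaname",    .atttypid = 25 },
    { .attname = "conflict_type", .atttypid = 25 }
};

#define TOY_LENGTHOF(array) (sizeof(array) / sizeof((array)[0]))
#define TOY_NUM_ATTRS       TOY_LENGTHOF(toy_schema)

/* Return the zero-based position of a column, or -1 if it is not defined. */
static int
toy_find_column(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < TOY_NUM_ATTRS; i++)
        if (strcmp(toy_schema[i].attname, name) == 0)
            return (int) i;
    return -1;
}
```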

======
src/test/regress/expected/subscription.out

19.
+\dt+ clt.regress_conflict_log3
+                                              List of tables
+ Schema |         Name          | Type  |           Owner           |
Persistence |  Size   | Description
+--------+-----------------------+-------+---------------------------+-------------+---------+-------------
+ clt    | regress_conflict_log3 | table | regress_subscription_user |
permanent   | 0 bytes |
+(1 row)


Since the CLT is auto-created internally, and since there is a
"Description" attribute, I wonder whether you should also auto-generate
that description so that here it might say something useful like:
"Conflict Log File for subscription XYZ"

~~~

20.
+-- ok - create subscription with conflict_log_table = NONE
+CREATE SUBSCRIPTION regress_conflict_test1 CONNECTION
'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect =
false, conflict_log_table = NONE);
+SELECT subname, subconflictlogtable FROM pg_subscription WHERE
subname = 'regress_conflict_test2';
+        subname         |  subconflictlogtable
+------------------------+-----------------------
+ regress_conflict_test2 | regress_conflict_log3
+(1 row)
+

I didn't understand this test case; You are setting a NONE clt for
subscription 'regress_conflict_test1'. But then you are checking
subname 'regress_conflict_test2'.

Is that a typo?

~~~

21.
+ALTER SUBSCRIPTION regress_conflict_test1 DISABLE;
+ALTER SUBSCRIPTION regress_conflict_test1 SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_conflict_test1;
+-- Clean up remaining test subscription
+ALTER SUBSCRIPTION regress_conflict_test2 DISABLE;
+ALTER SUBSCRIPTION regress_conflict_test2 SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_conflict_test2;

Something seems misplaced. Why aren't all of the cleanups under the
'cleanup' comment?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-16T04:21:17Z

On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > > 3)
> > > postgres=# alter subscription sub1 set (conflict_log_table=NULL);
> > > ALTER SUBSCRIPTION
> > > postgres=# alter subscription sub2 set (conflict_log_table=create);
> > > ALTER SUBSCRIPTION
> > > postgres=# \d
> > >          List of relations
> > >  Schema |  Name   | Type  | Owner
> > > --------+---------+-------+--------
> > >  public | create  | table | shveta
> > >  public | null    | table | shveta
> > >
> > >
> > > It takes reserved keywords and creates tables with those names. It
> > > should be restricted.
> >
> > I somehow assume table creation will be restricted with these names,
> > but since we switch from SPI to internal interface its not true
> > anymore, need to see how we can handle this.
>
> While thinking more on this, I was seeing other places where we use
> 'heap_create_with_catalog()' so I noticed that we always use the
> internally generated name, so wouldn't it be nice to make the conflict
> log table as bool and use internally generated name something like
> conflict_log_table_$subid$ and we will always create that in current
> active searchpath?  Thought?
>

We could do this as a first step. See the proposal in email [1] where
we have discussed having two options instead of one. The first option
will be conflict_log_format and the values would be log and table. In
this case, the table would be an internally generated one.

[1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-16T04:24:03Z

On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > I was considering the interdependence between the subscription and the
> > > > conflict log table (CLT). IMHO, it would be logical to establish the
> > > > subscription as dependent on the CLT. This way, if someone attempts to
> > > > drop the CLT, the system would recognize the dependency of the
> > > > subscription and prevent the drop unless the subscription is removed
> > > > first or the CASCADE option is used.
> > > >
> > > > However, while investigating this, I encountered an error [1] stating
> > > > that global objects are not supported in this context. This indicates
> > > > that global objects cannot be made dependent on local objects.
> > > >
> > >
> > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> > > objects. For example, consider following case:
> > > postgres=# create table t1(c1 int primary key);
> > > CREATE TABLE
> > > postgres=# \d+ t1
> > >                                            Table "public.t1"
> > >  Column |  Type   | Collation | Nullable | Default | Storage |
> > > Compression | Stats target | Description
> > > --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
> > >  c1     | integer |           | not null |         | plain   |
> > >     |              |
> > > Indexes:
> > >     "t1_pkey" PRIMARY KEY, btree (c1)
> > > Publications:
> > >     "pub1"
> > > Not-null constraints:
> > >     "t1_c1_not_null" NOT NULL "c1"
> > > Access method: heap
> > > postgres=# drop index t1_pkey;
> > > ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> > > t1 requires it
> > > HINT:  You can drop constraint t1_pkey on table t1 instead.
> > >
> > > Here, the PK index is created as part for CREATE TABLE operation and
> > > pk_index is not allowed to be dropped independently.
> > >
> > > > Although making an object dependent on global/shared objects is
> > > > possible for certain types of shared objects [2], this is not our main
> > > > objective.
> > > >
> > >
> > > As per my understanding from the above example, we need something like
> > > that only for shared object subscription and (internally created)
> > > table.
> >
> > Yeah that seems to be exactly what we want, so I tried doing that by
> > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and
> > it is behaving as we want[2].  And while dropping the subscription or
> > altering CLT we can delete internal dependency so that CLT get dropped
> > automatically[3]
> >
> > I will send an updated patch after testing a few more scenarios and
> > fixing other pending issues.
> >
> > [1]
> > +       ObjectAddressSet(myself, RelationRelationId, relid);
> > +       ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
> > +       recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);
> >
> >
> > [2]
> > postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
> > ERROR:  2BP01: cannot drop table myschema.conflict_log_history2
> > because subscription sub requires it
> > HINT:  You can drop subscription sub instead.
> > LOCATION:  findDependentObjects, dependency.c:788
> > postgres[670778]=#
> >
> > [3]
> > ObjectAddressSet(object, SubscriptionRelationId, subid);
> > performDeletion(&object, DROP_CASCADE,
> >                            PERFORM_DELETION_INTERNAL |
> >                            PERFORM_DELETION_SKIP_ORIGINAL);
> >
> >
>
> Here is the patch which implements the dependency and fixes other
> comments from Shveta.

Thanks for the changes. The new dependency-based implementation
creates a dependency loop while dumping:
./pg_dump -d postgres -f dump1.txt -p 5433
pg_dump: warning: could not resolve dependency loop among these items:
pg_dump: detail: TABLE conflict  (ID 225 OID 16397)
pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396)
pg_dump: detail: POST-DATA BOUNDARY  (ID 3491)
pg_dump: detail: TABLE DATA t1  (ID 3485 OID 16384)
pg_dump: detail: PRE-DATA BOUNDARY  (ID 3490)

This can be reproduced with a simple subscription that uses
conflict_log_table. It worked fine with the v11 version of the patch.
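
To illustrate how such a loop can arise, here is a minimal sketch in C
of a toy graph over the five items named in the warning. It is not
pg_dump's code: the node ids, the edge lists and the has_cycle()
helper are assumptions made for this example. Each edge means "can
only be dumped after". The section boundaries alone give a linear
order; adding a dependency of the pre-data conflict table on the
post-data subscription closes the loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node ids named after the items in the pg_dump warning. */
enum
{
    TABLE_CONFLICT,
    SUBSCRIPTION,
    POST_DATA_BOUNDARY,
    TABLE_DATA_T1,
    PRE_DATA_BOUNDARY,
    N_NODES
};

/* An edge means: "from" can only be dumped after "to". */
typedef struct Edge
{
    int from;
    int to;
} Edge;

/* Ordering imposed by the dump sections alone (assumed, no cycle). */
const Edge section_edges[] = {
    {SUBSCRIPTION, POST_DATA_BOUNDARY},
    {POST_DATA_BOUNDARY, TABLE_DATA_T1},
    {TABLE_DATA_T1, PRE_DATA_BOUNDARY},
    {PRE_DATA_BOUNDARY, TABLE_CONFLICT},
};

/* Same ordering plus the table -> subscription dependency. */
const Edge clt_edges[] = {
    {SUBSCRIPTION, POST_DATA_BOUNDARY},
    {POST_DATA_BOUNDARY, TABLE_DATA_T1},
    {TABLE_DATA_T1, PRE_DATA_BOUNDARY},
    {PRE_DATA_BOUNDARY, TABLE_CONFLICT},
    {TABLE_CONFLICT, SUBSCRIPTION},
};

/* Depth-first search; state: 0 = unvisited, 1 = in progress, 2 = done. */
static bool
visit(int node, const Edge *edges, size_t nedges, int *state)
{
    state[node] = 1;
    for (size_t i = 0; i < nedges; i++)
    {
        int next;

        if (edges[i].from != node)
            continue;
        next = edges[i].to;
        if (state[next] == 1)
            return true;        /* reached a node still in progress */
        if (state[next] == 0 && visit(next, edges, nedges, state))
            return true;
    }
    state[node] = 2;
    return false;
}

/* Returns true if the edge list contains a dependency loop. */
bool
has_cycle(const Edge *edges, size_t nedges)
{
    int state[N_NODES] = {0};

    for (int n = 0; n < N_NODES; n++)
        if (state[n] == 0 && visit(n, edges, nedges, state))
            return true;
    return false;
}
```

In this toy model has_cycle() is false for section_edges and true for
clt_edges, which matches the order of items listed in the warning.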

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-16T05:03:01Z

On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Here is the patch which implements the dependency and fixes other
> > comments from Shveta.
> >
>
> +/*
> + * Check if the specified relation is used as a conflict log table by any
> + * subscription.
> + */
> +bool
> +IsConflictLogTable(Oid relid)
> +{
> + Relation rel;
> + TableScanDesc scan;
> + HeapTuple tup;
> + bool is_clt = false;
> +
> + rel = table_open(SubscriptionRelationId, AccessShareLock);
> + scan = table_beginscan_catalog(rel, 0, NULL);
> +
> + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))
>
> This function has been used at multiple places in the patch, though
> not in any performance-critical paths, but still, it seems like the
> impact can be noticeable for a large number of subscriptions. Also, I
> am not sure it is a good design to scan the entire system table to
> find whether some other relation is publishable or not. I see below
> kinds of usages for it:
>
> + /* Subscription conflict log tables are not published */
> + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
> + !IsConflictLogTable(relid);
>
> In this regard, I see a comment atop is_publishable_class which
> suggests as follows:
>
> The best
>  * long-term solution may be to add a "relispublishable" bool to pg_class,
>  * and depend on that instead of OID checks.
>  */
> static bool
> is_publishable_class(Oid relid, Form_pg_class reltuple)
>
> I feel that is a good idea for reasons mentioned atop
> is_publishable_class and for the conflict table. What do you think?
>

+1.
The OID check may be unreliable, as mentioned in the comment. I tested
this by dropping and recreating information_schema, and observed that
after recreation it became eligible for publication because its relid
is no longer below FirstNormalObjectId.  Steps:

****Pub****:
create publication pub1;
ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
select * from information_schema.sql_sizing where sizing_id=97;

****Sub****:
create subscription sub1 connection '...' publication pub1 with
(copy_data=false);
select * from information_schema.sql_sizing where sizing_id=97;

****Pub****:
alter table information_schema.sql_sizing replica identity full;
--this is not replicated.
UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97;

****Sub****:
postgres=# select supported_value from information_schema.sql_sizing
where sizing_id=97;
 supported_value
-----------------
              0

~~

Then drop and recreate information_schema and perform the above update
again; this time it gets replicated:

drop schema information_schema cascade;
./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433

****Pub****:
ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
select * from information_schema.sql_sizing where sizing_id=97;
alter table information_schema.sql_sizing replica identity full;
--This is replicated
UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97;

****Sub****:
--This shows supported_value as 14
postgres=# select supported_value from information_schema.sql_sizing
where sizing_id=97;
 supported_value
-----------------
              14
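
The effect can be modelled with a small C sketch. This is a toy, not
the server's is_publishable_class(): the ToyRel struct, both helper
functions and the OIDs used with them are assumptions for
illustration. Only the cutoff value 16384 for FirstNormalObjectId is
taken from PostgreSQL. The second helper adds a schema-name test as
one possible way to keep the answer stable across a reload; a flag in
pg_class, as suggested above, would be another.

```c
#include <stdbool.h>
#include <string.h>

#define FIRST_NORMAL_OBJECT_ID 16384u   /* value of FirstNormalObjectId */

/* Hypothetical, stripped-down stand-in for a pg_class row. */
typedef struct ToyRel
{
    unsigned int oid;
    const char *nspname;
} ToyRel;

/* OID-only rule: anything created after initdb looks like a user table. */
bool
publishable_by_oid(const ToyRel *rel)
{
    return rel->oid >= FIRST_NORMAL_OBJECT_ID;
}

/* OID rule plus a name test, so a reloaded information_schema stays excluded. */
bool
publishable_by_oid_and_name(const ToyRel *rel)
{
    return rel->oid >= FIRST_NORMAL_OBJECT_ID &&
        strcmp(rel->nspname, "information_schema") != 0;
}
```

With a made-up OID below 16384 both rules reject sql_sizing; after a
reload gives it an OID above the cutoff, only the second rule still
rejects it.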

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-16T05:41:48Z

On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > > 2)
> > > > When we do below:
> > > > alter subscription sub1 SET (conflict_log_table=clt2);
> > > >
> > > > the previous conflict log table is dropped. Is this behavior
> > > > intentional and discussed/concluded earlier? It’s possible that a user
> > > > may want to create a new conflict log table for future events while
> > > > still retaining the old one for analysis. If the subscription itself
> > > > is dropped, then dropping the CLT makes sense, but I’m not sure this
> > > > behavior is intended for ALTER SUBSCRIPTION.  I do understand that
> > > > once we unlink CLT from subscription, later even DROP subscription
> > > > cannot drop it, but user can always drop it when not needed.
> > > >
> > > > If we plan to keep existing behavior, it should be clearly documented
> > > > in a CAUTION section, and the command should explicitly log the table
> > > > drop.
> > >
> > > Yeah we discussed this behavior and the conclusion was we would
> > > document this behavior and its user's responsibility to take necessary
> > > backup of the conflict log table data if they are setting a new log
> > > table or NONE for the subscription.
> > >
> >
> > +1. If we don't do this then it will be difficult to track for
> > postgres or users the previous conflict history tables.
>
> Right, it makes sense.
>
> Attached patch fixed most of the open comments
> 1) \dRs+ now show the schema qualified name
> 2) Now key_tuple and replica_identify tuple both are add in conflict
> log tuple wherever applicable
> 3) Refactored the code so that we can define the conflict log table
> schema only once in the header file and both create_conflict_log_table
> and ValidateConflictLogTable use it.
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>
> We do not need to make the CLT dependent on the subscription because
> the table can be dropped when the subscription is dropped anyway and
> we are already doing it as part of drop subscription as well as alter
> subscription when CLT is set to NONE or a different table. Therefore,
> extending the functionality of shared dependency is unnecessary for
> this purpose.
>
> Thoughts?
>
> [1]
> doDeletion()
> {
> ....
> /*
> * These global object types are not supported here.
> */
> case AuthIdRelationId:
> case DatabaseRelationId:
> case TableSpaceRelationId:
> case SubscriptionRelationId:
> case ParameterAclRelationId:
> elog(ERROR, "global objects cannot be deleted by doDeletion");
> break;
> }
>
> [2]
> typedef enum SharedDependencyType
> {
> SHARED_DEPENDENCY_OWNER = 'o',
> SHARED_DEPENDENCY_ACL = 'a',
> SHARED_DEPENDENCY_INITACL = 'i',
> SHARED_DEPENDENCY_POLICY = 'r',
> SHARED_DEPENDENCY_TABLESPACE = 't',
> SHARED_DEPENDENCY_INVALID = 0,
> } SharedDependencyType;
>
> Pending Items are:
> 1. Handling dump/upgrade

The attached patch has the changes for handling dump. It works on top
of the v11 version; it does not work on v12 because of the issue
reported at [1]. Currently, upgrade does not work because of an
existing issue that is being tracked at [2]; upgrade works with the
patch attached at [2].

[1] - https://www.postgresql.org/message-id/CALDaNm1zEYoSdf2Ns-%3DUJRw95E5sbfpB0oaNUWtRJN27Q1Knhw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm2x3rd7C0_HjUpJFbxpAqXgm%3DQtoKfkEWDVA8h%2BJFpa_w%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-16T07:17:52Z

On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 15, 2025 at 2:16 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > 4)
> > postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid
> > = d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname
> > = 'sub1';
> >  relname
> > ---------
> >  clt
> >
> > postgres=#  select count(*) from pg_shdepend  where refobjid = (select
> > oid from pg_subscription where subname='sub1');
> >  count
> > -------
> >      0
> >
> > Since dependency between sub and clt is a dependency involving
> > shared-object, shouldn't the entry be in pg_shdepend? Or do we allow
> > such entries in pg_depend as well?
>
> The primary reason for recording in pg_depend is that the
> RemoveRelations() function already includes logic to check for and
> report internal dependencies within pg_depends. Consequently, if we
> were to record the dependency in pg_shdepends, we would likely need to
> modify RemoveRelations() to incorporate handling for pg_shdepends
> dependencies.
>
> However, some might argue that when an object ID (objid) is local and
> the referenced object ID (refobjid) is shared, such as when a table is
> created under a ROLE, establishing a dependency with the owner, the
> dependency is currently recorded in pg_shdepend. In this scenario, the
> dependent object (the local table) can be dropped independently, while
> the referenced object (the shared owner) cannot.
>

Yes, and the same is true for tablespaces. Consider the case below:
create tablespace tbs location <tbs_location>;
create table t2(c1 int, c2 int) PARTITION BY RANGE(c1) tablespace tbs;

>
> However, when aiming
> to record an internal dependency, the dependent object should not be
> droppable without first dropping the referencing object. Therefore, I
> believe the dependency record should be placed in pg_depend, as the
> depender is a local object and will check for dependencies there.
>

I think it makes sense to add the dependency entry in pg_depend for
this case (the dependent object, the table, is db-local and the
referenced object, the subscription, is shared across the cluster), as
there is a fundamental architectural difference between
Tablespaces/Roles and Subscriptions that determines why one needs
pg_shdepend and the other is better off with pg_depend.

It comes down to cross-database visibility during the DROP command.

1. The "Tablespace" Scenario (Why it needs pg_shdepend)
A Tablespace is a truly global resource. You can connect to postgres
(database A) and try to drop a tablespace that is being used by app_db
(database B).

The Problem: When you run DROP TABLESPACE tbs from Database A, the
system cannot look inside Database B's pg_depend to see if the
tablespace is in use. It would have to connect to every database in
the cluster to check.

The Solution: We explicitly push this dependency up to the global
pg_shdepend. This allows the DROP command in Database A to instantly
see: "Wait, object 123 in Database B needs this. Block the drop."

2. The "Subscription" Scenario (Why it does NOT need pg_shdepend)
Although pg_subscription is a shared catalog, a Subscription is pinned
to a specific database (subdbid). One can only DROP SUBSCRIPTION while
connected to the database that owns it. Consider a scenario where one
creates a subscription sub_1 in app_db. Now, one cannot connect to the
postgres DB and run DROP SUBSCRIPTION sub_1; one must connect to
app_db. Since we need to connect to app_db to drop the subscription,
the system has direct, fast access to the local pg_depend of app_db.
It doesn't need to consult a global "Cross-DB" catalog because there
is no mystery about where the dependencies live.

Does this theory sound more bullet-proof as to why it is desirable to
store dependency entries for this case in pg_depend? If so, I suggest
we add some comments explaining how subscriptions differ from other
shared objects, as future readers may have the same question.
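
The visibility argument can be sketched with a toy model in C. This is
only an illustration: ToyDep and both counting helpers are
hypothetical, and the dbid field here simply records which database's
catalog a row lives in (the real pg_depend is per-database and has no
such column). A session sees per-database rows only for the database
it is connected to, while rows in a shared catalog are visible from
any database.

```c
#include <stddef.h>

/* Hypothetical dependency row used only for this sketch. */
typedef struct ToyDep
{
    unsigned int dbid;          /* database whose catalog holds the row */
    unsigned int objid;         /* dependent object */
    unsigned int refobjid;      /* referenced object */
} ToyDep;

/* Rows on refobjid that a session connected to session_dbid can see
 * when dependencies are kept per database (pg_depend style). */
size_t
count_local_refs(const ToyDep *deps, size_t ndeps,
                 unsigned int session_dbid, unsigned int refobjid)
{
    size_t count = 0;

    for (size_t i = 0; i < ndeps; i++)
        if (deps[i].dbid == session_dbid && deps[i].refobjid == refobjid)
            count++;
    return count;
}

/* Rows on refobjid visible from any database when dependencies are
 * kept in a shared catalog (pg_shdepend style). */
size_t
count_shared_refs(const ToyDep *deps, size_t ndeps, unsigned int refobjid)
{
    size_t count = 0;

    for (size_t i = 0; i < ndeps; i++)
        if (deps[i].refobjid == refobjid)
            count++;
    return count;
}
```

Suppose database 2 holds a table using tablespace 500 and a conflict
log table depending on subscription 600. A session in database 1 that
drops the tablespace finds nothing with the local lookup and needs the
shared one. The subscription can only be dropped from database 2,
where the local lookup already finds its dependent table.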

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-17T04:28:52Z

On Tue, Dec 16, 2025 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:

> The OID check may be unreliable, as mentioned in the comment. I tested
> this by dropping and recreating information_schema, and observed that
> after recreation it became eligible for publication because its relid
> no longer falls under FirstNormalObjectId.  Steps:
>
> ****Pub****:
> create publication pub1;
> ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> select * from information_schema.sql_sizing where sizing_id=97;
>
> ****Sub****:
> create subscription sub1 connection '...' publication pub1 with
> (copy_data=false);
> select * from information_schema.sql_sizing where sizing_id=97;
>
> ****Pub****:
> alter table information_schema.sql_sizing replica identity full;
> --this is not replicated.
> UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97;
>
> ****Sub****:
> postgres=# select supported_value from information_schema.sql_sizing
> where sizing_id=97;
>  supported_value
> -----------------
>               0
>
> ~~
>
> Then drop and recreate and try to perform the above update again, it
> gets replicated:
>
> drop schema information_schema cascade;
> ./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433
>
> ****Pub****:
> ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> select * from information_schema.sql_sizing where sizing_id=97;
> alter table information_schema.sql_sizing replica identity full;
> --This is replicated
> UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97;
>
> ****Sub****:
> --This shows supported_value as 14
> postgres=# select supported_value from information_schema.sql_sizing
> where sizing_id=97;
>  supported_value
> -----------------
>               14

Hmm, I might be missing something, but why do we not want to publish a
table that is in information_schema? Especially since, once the
internally created schema is dropped, a user can create their own
schema named information_schema and create a bunch of tables in it, so
why do we want to block those?  I mean, the example you showed here is
pretty much like a user-created schema and table, no? Or am I missing
something important?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-17T09:44:04Z

On Wed, Dec 17, 2025 at 9:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 16, 2025 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> > The OID check may be unreliable, as mentioned in the comment. I tested
> > this by dropping and recreating information_schema, and observed that
> > after recreation it became eligible for publication because its relid
> > no longer falls under FirstNormalObjectId.  Steps:
> >
> > ****Pub****:
> > create publication pub1;
> > ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> > select * from information_schema.sql_sizing where sizing_id=97;
> >
> > ****Sub****:
> > create subscription sub1 connection '...' publication pub1 with
> > (copy_data=false);
> > select * from information_schema.sql_sizing where sizing_id=97;
> >
> > ****Pub****:
> > alter table information_schema.sql_sizing replica identity full;
> > --this is not replicated.
> > UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97;
> >
> > ****Sub****:
> > postgres=# select supported_value from information_schema.sql_sizing
> > where sizing_id=97;
> >  supported_value
> > -----------------
> >               0
> >
> > ~~
> >
> > Then drop and recreate and try to perform the above update again, it
> > gets replicated:
> >
> > drop schema information_schema cascade;
> > ./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433
> >
> > ****Pub****:
> > ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> > select * from information_schema.sql_sizing where sizing_id=97;
> > alter table information_schema.sql_sizing replica identity full;
> > --This is replicated
> > UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97;
> >
> > ****Sub****:
> > --This shows supported_value as 14
> > postgres=# select supported_value from information_schema.sql_sizing
> > where sizing_id=97;
> >  supported_value
> > -----------------
> >               14
>
> Hmm, I might be missing something what why we do not want to publish
> which is in information_shcema, especially when the internally created
> schema is dropped then user can create his own schema with name
> information-schema and create a bunch of tables in that so why do we
> want to block those?  I mean the example you showed here is pretty
> much like a user created schema and table no? Or am I missing
> something important?
>

I don’t think a user intentionally dropping information_schema and
creating their own schema (with different definitions and tables) is a
practical scenario. While it isn’t explicitly restricted, I don’t see
a strong need for it. OTOH, there are scenarios where, after fixing
issues that affect the definition of information_schema on stable
branches, users may be asked to reload information_schema to apply the
updated definitions. One such case can be seen in [1].

Additionally, while reviewing the code, I noticed places where the
logic does not rely solely on relid being less than
FirstNormalObjectId. Instead, it performs name-based comparisons,
explicitly accounting for the possibility that information_schema may
have been dropped and reloaded. This further indicates that such
scenarios are considered practical. See [2].
And if such scenarios are possible, it might be worth considering
keeping the publish behavior consistent, both before and after a
reload of information_schema.

[1]:
https://www.postgresql.org/docs/9.1/release-9-1-2.html

[2]:
pg_upgrade has this:
static DataTypesUsageChecks data_types_usage_checks[] =
{
        /*
         * Look for composite types that were made during initdb *or* belong to
         * information_schema; that's important in case information_schema was
         * dropped and reloaded.
         *
         * The cutoff OID here should match the source cluster's value of
         * FirstNormalObjectId.  We hardcode it rather than using that C #define
         * because, if that #define is ever changed, our own version's value is
         * NOT what to use.  Eventually we may need a test on the
source cluster's
         * version to select the correct value.
         */
        {
                .status = gettext_noop("Checking for system-defined
composite types in user tables"),
                .report_filename = "tables_using_composite.txt",
                .base_query =
                "SELECT t.oid FROM pg_catalog.pg_type t "
                "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid "
                " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname =
'information_schema')",

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-17T09:58:49Z

On Wed, Dec 17, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I don’t think a user intentionally dropping information_schema and
> creating their own schema (with different definitions and tables) is a
> practical scenario. While it isn’t explicitly restricted, I don’t see
> a strong need for it. OTOH, there are scenarios where, after fixing
> issues that affect the definition of information_schema on stable
> branches, users may be asked to reload information_schema to apply the
> updated definitions. One such case can be seen in [1].
>
> Additionally, while reviewing the code, I noticed places where the
> logic does not rely solely on relid being less than
> FirstNormalObjectId. Instead, it performs name-based comparisons,
> explicitly accounting for the possibility that information_schema may
> have been dropped and reloaded. This further indicates that such
> scenarios are considered practical. See [2].
> And if such scenarios are possible, it might be worth considering
> keeping the publish behavior consistent, both before and after a
> reload of information_schema.
>
> [1]:
> https://www.postgresql.org/docs/9.1/release-9-1-2.html
>
> [2]:
> pg_upgrade has this:
> static DataTypesUsageChecks data_types_usage_checks[] =
> {
>         /*
>          * Look for composite types that were made during initdb *or* belong to
>          * information_schema; that's important in case information_schema was
>          * dropped and reloaded.
>          *
>          * The cutoff OID here should match the source cluster's value of
>          * FirstNormalObjectId.  We hardcode it rather than using that C #define
>          * because, if that #define is ever changed, our own version's value is
>          * NOT what to use.  Eventually we may need a test on the
> source cluster's
>          * version to select the correct value.
>          */
>         {
>                 .status = gettext_noop("Checking for system-defined
> composite types in user tables"),
>                 .report_filename = "tables_using_composite.txt",
>                 .base_query =
>                 "SELECT t.oid FROM pg_catalog.pg_type t "
>                 "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid "
>                 " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname =
> 'information_schema')",

Yeah I agree with your theory.  While the system allows users to
manually create an information_schema or place objects within it, we
are establishing that anything inside this schema will be treated as
an internal object. If a user chooses to bypass these conventions and
then finds the objects are not handled like standard user tables, it
constitutes a usage error rather than a system bug.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-17T10:19:58Z

On Wed, Dec 17, 2025 at 3:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 17, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > I don’t think a user intentionally dropping information_schema and
> > creating their own schema (with different definitions and tables) is a
> > practical scenario. While it isn’t explicitly restricted, I don’t see
> > a strong need for it. OTOH, there are scenarios where, after fixing
> > issues that affect the definition of information_schema on stable
> > branches, users may be asked to reload information_schema to apply the
> > updated definitions. One such case can be seen in [1].
> >
> > Additionally, while reviewing the code, I noticed places where the
> > logic does not rely solely on relid being less than
> > FirstNormalObjectId. Instead, it performs name-based comparisons,
> > explicitly accounting for the possibility that information_schema may
> > have been dropped and reloaded. This further indicates that such
> > scenarios are considered practical. See [2].
> > And if such scenarios are possible, it might be worth considering
> > keeping the publish behavior consistent, both before and after a
> > reload of information_schema.
> >
> > [1]:
> > https://www.postgresql.org/docs/9.1/release-9-1-2.html
> >
> > [2]:
> > pg_upgrade has this:
> > static DataTypesUsageChecks data_types_usage_checks[] =
> > {
> >         /*
> >          * Look for composite types that were made during initdb *or* belong to
> >          * information_schema; that's important in case information_schema was
> >          * dropped and reloaded.
> >          *
> >          * The cutoff OID here should match the source cluster's value of
> >          * FirstNormalObjectId.  We hardcode it rather than using that C #define
> >          * because, if that #define is ever changed, our own version's value is
> >          * NOT what to use.  Eventually we may need a test on the
> > source cluster's
> >          * version to select the correct value.
> >          */
> >         {
> >                 .status = gettext_noop("Checking for system-defined
> > composite types in user tables"),
> >                 .report_filename = "tables_using_composite.txt",
> >                 .base_query =
> >                 "SELECT t.oid FROM pg_catalog.pg_type t "
> >                 "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid "
> >                 " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname =
> > 'information_schema')",
>
> Yeah I agree with your theory.  While the system allows users to
> manually create an information_schema or place objects within it, we
> are establishing that anything inside this schema will be treated as
> an internal object. If a user chooses to bypass these conventions and
> then finds the objects are not handled like standard user tables, it
> constitutes a usage error rather than a system bug.

Yes, I think so as well. IIUC, we wouldn’t be establishing anything
new here; this behavior is already established. If we look at the code
paths that reference information_schema, it is consistently treated
like a system schema rather than a user schema. A few examples
include XML_VISIBLE_SCHEMAS_EXCLUDE, selectDumpableNamespace,
data_types_usage_checks, describeFunctions, describeAggregates, and
others.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-18T09:09:18Z

On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

>
> We could do this as a first step. See the proposal in email [1] where
> we have discussed having two options instead of one. The first option
> will be conflict_log_format and the values would be log and table. In
> this case, the table would be an internally generated one.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com

So I have put more thought into this, and here is what I am proposing:

1) Subscription parameter: So in the first version the subscription
parameter will be named 'conflict_log_format', which will accept
'log/table/both'; the default option would be log.
2) If conflict_log_format = log is provided then we do not need to do
anything, as this would work by default.
3) If conflict_log_format = table/both is provided then we will
generate an internal table name, i.e. conflict_log_table_$subid$, and
the table will be created in the current schema.
4) In pg_subscription we will still keep 2 fields: a) the namespace id
of the conflict log table, b) the conflict log format =
'log/table/both'.
5) If the option is table/both, the name can be generated on the fly,
whether we are creating the table or inserting a conflict into the
table.

Questions:
1) Shall we create the conflict log table in the current schema, or
should we consider anything else? IMHO the current schema should be
fine, and in the future when we add an option for conflict_log_table
we will support schema-qualified names as well.
2) In the catalog I am storing the "conflict_log_format" option as a
text field. Is there any better way, so that we can store it in a
fixed format, maybe an enum value as an integer? E.g. from the enum
below we could store the integer value in the system catalog for the
"conflict_log_format" field; not sure if we have done such a thing
anywhere else.

typedef enum ConflictLogFormat
{
CONFLICT_LOG_FORMAT_DEFAULT = 0,
CONFLICT_LOG_FORMAT_LOG,
CONFLICT_LOG_FORMAT_TABLE,
CONFLICT_LOG_FORMAT_BOTH
} ConflictLogFormat;
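
As a rough sketch of points 1, 3 and 5, the option parsing and the
on-the-fly name generation could look like the C below. The enum is
repeated from above so the block is self-contained; the helper names
parse_conflict_log_format() and conflict_log_table_name() are
hypothetical and not taken from the patch, and the subscription OID is
modelled as a plain unsigned int.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum ConflictLogFormat
{
    CONFLICT_LOG_FORMAT_DEFAULT = 0,
    CONFLICT_LOG_FORMAT_LOG,
    CONFLICT_LOG_FORMAT_TABLE,
    CONFLICT_LOG_FORMAT_BOTH
} ConflictLogFormat;

/* Map the option string to the enum; returns false for unknown values. */
bool
parse_conflict_log_format(const char *value, ConflictLogFormat *result)
{
    if (strcmp(value, "log") == 0)
        *result = CONFLICT_LOG_FORMAT_LOG;
    else if (strcmp(value, "table") == 0)
        *result = CONFLICT_LOG_FORMAT_TABLE;
    else if (strcmp(value, "both") == 0)
        *result = CONFLICT_LOG_FORMAT_BOTH;
    else
        return false;
    return true;
}

/* Only table and both need the internally generated table. */
bool
conflict_log_format_uses_table(ConflictLogFormat format)
{
    return format == CONFLICT_LOG_FORMAT_TABLE ||
        format == CONFLICT_LOG_FORMAT_BOTH;
}

/* Build "conflict_log_table_<subid>" into buf; returns snprintf's result. */
int
conflict_log_table_name(unsigned int subid, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "conflict_log_table_%u", subid);
}
```

Because the name is derived from the subscription OID alone, it never
has to be stored: both the code that creates the table and the code
that inserts a conflict can rebuild it.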

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-18T09:55:26Z

On Thu, Dec 18, 2025 at 2:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> > We could do this as a first step. See the proposal in email [1] where
> > we have discussed having two options instead of one. The first option
> > will be conflict_log_format and the values would be log and table. In
> > this case, the table would be an internally generated one.
> >
> > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
>
> So I have put more thought on this and here is what I am proposing
>
> 1) Subscription Parameter: Son in first version the subscription
> parameter will be named 'conflict_log_format' which will accept
> 'log/table/both' default option would be log.
> 2) If conflict_log_format = log is provided then we do not need to do
> anything as this would work by default
> 3) If conflict_log_format = table/both is provided then we will
> generate a internal table name i.e. conflict_log_table_$subid$ and the
> table will be created in the current schema
> 4) in pg_subscription we will still keep 2 field a) namespace id of
> the conflict log table b) the conflict log format = 'log/table'both'
> 5) If option is table/both the name can be generated on the fly
> whether we are creating the table or inserting conflict into the
> table.
>
> Question:
> 1) Shall we create a conflict log table in the current schema or we
> should consider anything else, IMHO the current schema should be fine
> and in the future when we add an option for conflict_log_table we will
> support schema qualified names as well?
> 2) In catalog I am storing the "conflict_log_format" option as a text
> field, is there any better way so that we can store in fixed format
> maybe enum value as an integer we can do e.g. from below enum we can
> store the integer value in system catalog for "conflict_log_format"
> field, not sure if we have done such think anywhere else?
>
> typedef enum ConflictLogFormat
> {
> CONFLICT_LOG_FORMAT_DEFAULT = 0,
> CONFLICT_LOG_FORMAT_LOG,
> CONFLICT_LOG_FORMAT_TABLE,
> CONFLICT_LOG_FORMAT_BOTH
> } ConflictLogFormat;

While exploring other options, I think we can make it a char,
something like relkind, as shown below. Any other opinions?

#define CONFLICT_LOG_FORMAT_LOG 'l'
#define CONFLICT_LOG_FORMAT_TABLE 't'
#define CONFLICT_LOG_FORMAT_BOTH 'b'
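
For illustration, a minimal sketch of how the user-visible value might be mapped to such a char code. The function name is hypothetical and not from the patch, and the macros are written without '=' because a #define body is substituted verbatim.

```c
#include <string.h>

/* Catalog codes, following the relkind-style idea above. */
#define CONFLICT_LOG_FORMAT_LOG   'l'
#define CONFLICT_LOG_FORMAT_TABLE 't'
#define CONFLICT_LOG_FORMAT_BOTH  'b'

/*
 * Hypothetical helper: map the option value given by the user to its
 * catalog code.  Returns '\0' for an unrecognized value.
 */
static char
conflict_log_format_code(const char *value)
{
    if (strcmp(value, "log") == 0)
        return CONFLICT_LOG_FORMAT_LOG;
    if (strcmp(value, "table") == 0)
        return CONFLICT_LOG_FORMAT_TABLE;
    if (strcmp(value, "both") == 0)
        return CONFLICT_LOG_FORMAT_BOTH;
    return '\0';
}
```

With a single char in the catalog, the apply worker would only need a simple comparison when deciding where to send a conflict.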

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-18T11:06:15Z

On Thu, Dec 18, 2025 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Dec 18, 2025 at 2:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > >
> > > We could do this as a first step. See the proposal in email [1] where
> > > we have discussed having two options instead of one. The first option
> > > will be conflict_log_format and the values would be log and table. In
> > > this case, the table would be an internally generated one.
> > >
> > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
> >
> > So I have put more thought on this and here is what I am proposing
> >
> > 1) Subscription Parameter: Son in first version the subscription
> > parameter will be named 'conflict_log_format' which will accept
> > 'log/table/both' default option would be log.
> > 2) If conflict_log_format = log is provided then we do not need to do
> > anything as this would work by default
> > 3) If conflict_log_format = table/both is provided then we will
> > generate a internal table name i.e. conflict_log_table_$subid$ and the
> > table will be created in the current schema
> > 4) in pg_subscription we will still keep 2 field a) namespace id of
> > the conflict log table b) the conflict log format = 'log/table'both'
> > 5) If option is table/both the name can be generated on the fly
> > whether we are creating the table or inserting conflict into the
> > table.
> >
> > Question:
> > 1) Shall we create a conflict log table in the current schema or we
> > should consider anything else, IMHO the current schema should be fine
> > and in the future when we add an option for conflict_log_table we will
> > support schema qualified names as well?
> > 2) In catalog I am storing the "conflict_log_format" option as a text
> > field, is there any better way so that we can store in fixed format
> > maybe enum value as an integer we can do e.g. from below enum we can
> > store the integer value in system catalog for "conflict_log_format"
> > field, not sure if we have done such think anywhere else?
> >
> > typedef enum ConflictLogFormat
> > {
> > CONFLICT_LOG_FORMAT_DEFAULT = 0,
> > CONFLICT_LOG_FORMAT_LOG,
> > CONFLICT_LOG_FORMAT_TABLE,
> > CONFLICT_LOG_FORMAT_BOTH
> > } ConflictLogFormat;
>
> While exploring other kinds of options I think we can make it a char
> something like relkind as shown below, any other opinion on the same?
>
> #define CONFLICT_LOG_FORMAT_LOG = 'l'
> #define CONFLICT_LOG_FORMAT_TABLE = 't'
> #define CONFLICT_LOG_FORMAT_BOTH = 'b'
>

+1. Also, we should expose this to users as an enum type, similar to
auto_explain.log_format or publish_generated_columns.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-18T23:07:53Z

On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> > We could do this as a first step. See the proposal in email [1] where
> > we have discussed having two options instead of one. The first option
> > will be conflict_log_format and the values would be log and table. In
> > this case, the table would be an internally generated one.
> >
> > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
>
> So I have put more thought on this and here is what I am proposing
>
> 1) Subscription Parameter: Son in first version the subscription
> parameter will be named 'conflict_log_format' which will accept
> 'log/table/both' default option would be log.
> 2) If conflict_log_format = log is provided then we do not need to do
> anything as this would work by default
> 3) If conflict_log_format = table/both is provided then we will
> generate a internal table name i.e. conflict_log_table_$subid$ and the
> table will be created in the current schema
> 4) in pg_subscription we will still keep 2 field a) namespace id of
> the conflict log table b) the conflict log format = 'log/table'both'
> 5) If option is table/both the name can be generated on the fly
> whether we are creating the table or inserting conflict into the
> table.

I have a question: who will be the owner of the conflict log table? I
assume that the subscription owner would own the conflict log table
and the conflict logs are inserted by the owner but not by the table
owner, is that right?

>
> Question:
> 1) Shall we create a conflict log table in the current schema or we
> should consider anything else, IMHO the current schema should be fine
> and in the future when we add an option for conflict_log_table we will
> support schema qualified names as well?

Some questions:

If a table with the same name already exists, CREATE SUBSCRIPTION will fail, right?

Can the conflict log table be used like normal user tables (e.g.,
creating a trigger/a foreign key, running vacuum, ALTER TABLE etc.)?

> 2) In catalog I am storing the "conflict_log_format" option as a text
> field, is there any better way so that we can store in fixed format
> maybe enum value as an integer we can do e.g. from below enum we can
> store the integer value in system catalog for "conflict_log_format"
> field, not sure if we have done such think anywhere else?
>
> typedef enum ConflictLogFormat
> {
> CONFLICT_LOG_FORMAT_DEFAULT = 0,
> CONFLICT_LOG_FORMAT_LOG,
> CONFLICT_LOG_FORMAT_TABLE,
> CONFLICT_LOG_FORMAT_BOTH
> } ConflictLogFormat;

How about making conflict_log_format accept a list of destinations
instead of having the 'both' option, in case we add more destination
options in the future?

It seems to me that conflict_log_destination sounds better.
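
A list-valued option along these lines could be parsed into a bitmask. The following is a minimal sketch under that assumption, not code from the patch; the flag names and the parse function are hypothetical, and only "log" and "table" are recognized.

```c
#include <stdbool.h>
#include <string.h>

#define CONFLICT_LOG_DEST_LOG   (1 << 0)
#define CONFLICT_LOG_DEST_TABLE (1 << 1)

/*
 * Hypothetical parser: turn a value such as "log,table" into a bitmask.
 * Spaces around items are ignored.  Returns false on an unknown or
 * empty item.
 */
static bool
parse_conflict_log_destination(const char *value, int *mask)
{
    const char *p = value;

    *mask = 0;
    for (;;)
    {
        size_t len;

        /* Skip leading spaces, then measure the item up to the comma. */
        while (*p == ' ')
            p++;
        len = strcspn(p, ",");
        while (len > 0 && p[len - 1] == ' ')
            len--;

        if (len == 3 && strncmp(p, "log", 3) == 0)
            *mask |= CONFLICT_LOG_DEST_LOG;
        else if (len == 5 && strncmp(p, "table", 5) == 0)
            *mask |= CONFLICT_LOG_DEST_TABLE;
        else
            return false;

        /* Advance to the comma or the end of the string. */
        p += strcspn(p, ",");
        if (*p == '\0')
            return true;
        p++;
    }
}
```

Adding a third destination would then mean one more flag and one more branch, without needing a new combined value such as 'both'.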

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-19T00:04:47Z

On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> >
> > We could do this as a first step. See the proposal in email [1] where
> > we have discussed having two options instead of one. The first option
> > will be conflict_log_format and the values would be log and table. In
> > this case, the table would be an internally generated one.
> >
> > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
>
> So I have put more thought on this and here is what I am proposing
>
> 1) Subscription Parameter: Son in first version the subscription
> parameter will be named 'conflict_log_format' which will accept
> 'log/table/both' default option would be log.
> 2) If conflict_log_format = log is provided then we do not need to do
> anything as this would work by default
> 3) If conflict_log_format = table/both is provided then we will
> generate a internal table name i.e. conflict_log_table_$subid$ and the
> table will be created in the current schema
> 4) in pg_subscription we will still keep 2 field a) namespace id of
> the conflict log table b) the conflict log format = 'log/table'both'
> 5) If option is table/both the name can be generated on the fly
> whether we are creating the table or inserting conflict into the
> table.

IIUC, previously you had a "none" value which was a way to "turn off"
any CLT previously defined. How can users do that now with
log/table/both? Would they have to reassign (the default) "log"? That
seems a bit strange.

The "both" option is too restrictive. What if in the future you
added a 3rd kind of destination -- then what does "both" mean?

Maybe the destination list idea of Sawada-San's is better.
a) it resolves the "none" issue -- e.g., empty string means revert to
default CLT behaviour
b) it resolves the "both" issue.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-19T00:28:31Z

On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
...
>
> Question:
> 1) Shall we create a conflict log table in the current schema or we
> should consider anything else, IMHO the current schema should be fine
> and in the future when we add an option for conflict_log_table we will
> support schema qualified names as well?

You might be able to avoid a proliferation of related options (such as
conflict_log_table) if you renamed the main option to
"conflict_log_destination" like Sawada-San was suggesting.

e.g.

conflict_log_destination="table" --> use default table named by code
conflict_log_destination="table=myschema.mytable" --> table name
nominated by user

e.g. if wanted maybe this idea can extend to logs too.

conflict_log_destination="log" --> use default pg log files
conflict_log_destination="log=my_clt_log.txt" --> write conflicts to a
separate log file nominated by user
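
To make the "kind=target" idea concrete, here is a minimal sketch of splitting such a value. It is only an illustration of the syntax suggested above; the function name and buffer handling are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: split "table=myschema.mytable" into a kind and an
 * optional target.  Without '=', the target is left empty, meaning "use
 * the default chosen by the code".  Returns false if the kind is empty
 * or either output buffer is too small.
 */
static bool
split_destination(const char *value,
                  char *kind, size_t kindlen,
                  char *target, size_t targetlen)
{
    const char *eq = strchr(value, '=');
    size_t      klen = eq ? (size_t) (eq - value) : strlen(value);
    const char *rest = eq ? eq + 1 : "";

    if (klen == 0 || klen >= kindlen || strlen(rest) >= targetlen)
        return false;

    memcpy(kind, value, klen);
    kind[klen] = '\0';
    strcpy(target, rest);   /* length was checked above */
    return true;
}
```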

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T04:09:48Z

On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> > 2) In catalog I am storing the "conflict_log_format" option as a text
> > field, is there any better way so that we can store in fixed format
> > maybe enum value as an integer we can do e.g. from below enum we can
> > store the integer value in system catalog for "conflict_log_format"
> > field, not sure if we have done such think anywhere else?
> >
> > typedef enum ConflictLogFormat
> > {
> > CONFLICT_LOG_FORMAT_DEFAULT = 0,
> > CONFLICT_LOG_FORMAT_LOG,
> > CONFLICT_LOG_FORMAT_TABLE,
> > CONFLICT_LOG_FORMAT_BOTH
> > } ConflictLogFormat;
>
> How about making conflict_log_format accept a list of destinations
> instead of having the 'both' option in case where we might add more
> destination options in the future?
>
> It seems to me that conflict_log_destination sounds better.
>

Yeah, this is worth considering. But say we need to extend it so that
the conflict data goes to an XML-format file instead of the standard
log; won't it look a bit odd to specify that via
conflict_log_destination? I thought we could name it similarly to the
existing auto_explain.log_format.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T04:23:23Z

On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> >
> > > 2) In catalog I am storing the "conflict_log_format" option as a text
> > > field, is there any better way so that we can store in fixed format
> > > maybe enum value as an integer we can do e.g. from below enum we can
> > > store the integer value in system catalog for "conflict_log_format"
> > > field, not sure if we have done such think anywhere else?
> > >
> > > typedef enum ConflictLogFormat
> > > {
> > > CONFLICT_LOG_FORMAT_DEFAULT = 0,
> > > CONFLICT_LOG_FORMAT_LOG,
> > > CONFLICT_LOG_FORMAT_TABLE,
> > > CONFLICT_LOG_FORMAT_BOTH
> > > } ConflictLogFormat;
> >
> > How about making conflict_log_format accept a list of destinations
> > instead of having the 'both' option in case where we might add more
> > destination options in the future?
> >
> > It seems to me that conflict_log_destination sounds better.
> >
>
> Yeah, this is worth considering. But say, we need to extend it so that
> the conflict data goes in xml format file instead of standard log then
> won't it look a bit odd to specify via conflict_log_destination. I
> thought we could name it similar to the existing
> auto_explain.log_format.

IMHO conflict_log_destination sounds more appropriate, considering we
are talking about the log destination rather than the format, no?  The
option could be log/table/file etc., and for now we can just stick to
log/table.  In the future we can extend it by supporting extra options
like destination_name, where we can provide a table name or file name
etc.  So let me list all the points that need consensus.

1. What should the option be named: 'conflict_log_destination' or
'conflict_log_format'?
2. Do we want to support multiple destinations?  If so, a string like
conflict_log_destination = 'log,table,..' makes more sense, but then we
would have to store it as a string in the catalog and parse it every
time we insert conflicts or alter the subscription.  OTOH, currently I
support only a single option, log/table/both, which makes things much
easier because we can store it in the catalog as a single char field
and don't need any parsing.  And since the input is taken as a string
anyway, even if in the future we want to support more options like
'log,table,..' it would be backward compatible with the old options.
3. Do we want to support a 'none' destination, i.e. do not log anywhere?

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T04:25:03Z

On Fri, Dec 19, 2025 at 5:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > >
> > > We could do this as a first step. See the proposal in email [1] where
> > > we have discussed having two options instead of one. The first option
> > > will be conflict_log_format and the values would be log and table. In
> > > this case, the table would be an internally generated one.
> > >
> > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
> >
> > So I have put more thought on this and here is what I am proposing
> >
> > 1) Subscription Parameter: Son in first version the subscription
> > parameter will be named 'conflict_log_format' which will accept
> > 'log/table/both' default option would be log.
> > 2) If conflict_log_format = log is provided then we do not need to do
> > anything as this would work by default
> > 3) If conflict_log_format = table/both is provided then we will
> > generate a internal table name i.e. conflict_log_table_$subid$ and the
> > table will be created in the current schema
> > 4) in pg_subscription we will still keep 2 field a) namespace id of
> > the conflict log table b) the conflict log format = 'log/table'both'
> > 5) If option is table/both the name can be generated on the fly
> > whether we are creating the table or inserting conflict into the
> > table.
>
> IIUC, previously you had a "none" value which was a way to "turn off"
> any CLT previously defined. How can users do that now with
> log/table/both? Would they have to reassign (the default) "log"? That
> seems a bit strange.

Previously we supported only conflict log tables, and by default
conflicts were always sent to the log.  "none" was used for clearing
the conflict log table option; it was never meant for not logging
anywhere, only to say that there is no conflict log table.  Now also we
can have another option as none but I intentionally avoided it
considering we want to support the case where we don't want to log it
at all, maybe that's not a bad idea either.  Let's see what others
think about it.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-19T04:52:59Z

On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> >
> > > 2) In catalog I am storing the "conflict_log_format" option as a text
> > > field, is there any better way so that we can store in fixed format
> > > maybe enum value as an integer we can do e.g. from below enum we can
> > > store the integer value in system catalog for "conflict_log_format"
> > > field, not sure if we have done such think anywhere else?
> > >
> > > typedef enum ConflictLogFormat
> > > {
> > > CONFLICT_LOG_FORMAT_DEFAULT = 0,
> > > CONFLICT_LOG_FORMAT_LOG,
> > > CONFLICT_LOG_FORMAT_TABLE,
> > > CONFLICT_LOG_FORMAT_BOTH
> > > } ConflictLogFormat;
> >
> > How about making conflict_log_format accept a list of destinations
> > instead of having the 'both' option in case where we might add more
> > destination options in the future?
> >
> > It seems to me that conflict_log_destination sounds better.
> >
>
> Yeah, this is worth considering. But say, we need to extend it so that
> the conflict data goes in xml format file instead of standard log then
> won't it look a bit odd to specify via conflict_log_destination. I
> thought we could name it similar to the existing
> auto_explain.log_format.
>

One option could be to separate destination and format:
conflict_log_history.destination : log/table
conflict_log_history.format : xml/json/text etc

Another option could be to use a single parameter,
'conflict_log_destination', with values such as:
table, xmllog, jsonlog, stderr/textlog

(where stderr corresponds to logging to log/postgresql.log, similar to
log_destination at [1]). I prefer this approach.

[1]: https://www.postgresql.org/docs/18/runtime-config-logging.html

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-19T05:10:28Z

On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > >
> > > > 2) In catalog I am storing the "conflict_log_format" option as a text
> > > > field, is there any better way so that we can store in fixed format
> > > > maybe enum value as an integer we can do e.g. from below enum we can
> > > > store the integer value in system catalog for "conflict_log_format"
> > > > field, not sure if we have done such think anywhere else?
> > > >
> > > > typedef enum ConflictLogFormat
> > > > {
> > > > CONFLICT_LOG_FORMAT_DEFAULT = 0,
> > > > CONFLICT_LOG_FORMAT_LOG,
> > > > CONFLICT_LOG_FORMAT_TABLE,
> > > > CONFLICT_LOG_FORMAT_BOTH
> > > > } ConflictLogFormat;
> > >
> > > How about making conflict_log_format accept a list of destinations
> > > instead of having the 'both' option in case where we might add more
> > > destination options in the future?
> > >
> > > It seems to me that conflict_log_destination sounds better.
> > >
> >
> > Yeah, this is worth considering. But say, we need to extend it so that
> > the conflict data goes in xml format file instead of standard log then
> > won't it look a bit odd to specify via conflict_log_destination. I
> > thought we could name it similar to the existing
> > auto_explain.log_format.
>
> IMHO conflict_log_destination sounds more appropriate considering we
> are talking about the log destination instead of format no?  And the
> option could be log/table/file etc, and for now we can just stick to
> log/table.  And in future we can extend it by supporting extra options
> like destination_name, where we can provide table name or file name
> etc.  So let me list down all the points which need consensus.
>
> 1. What should be the name of the option 'conflict_log_destination' vs
> 'conflict_log_format'

I prefer conflict_log_destination.

> 2. Do we want to support multi destination then providing string like
> 'conflict_log_destination = 'log,table,..' make more sense but then we
> would have to store as a string in catalog and parse it everytime we
> insert conflicts or alter subscription OTOH currently I have just
> support single option log/table/both which make things much easy
> because then in catalog we can store as a single char field and don't
> need any parsing.  And since the input are taken as a string itself,
> even if in future we want to support more options like  'log,table,..'
> it would be backward compatible with old options.

I feel a combination of options might be a good idea, similar to what
'log_destination' provides. But it can be done in future versions, and
the first draft can be a simple one.

> 3. Do we want to support 'none' destinations? i.e. do not log to anywhere?

IMO, conflict information is an important piece of information to
diagnose data divergence and thus should be logged always.

Let's wait for others' opinions.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-19T05:42:15Z

On Fri, Dec 19, 2025 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 5:35 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > >
> > > > We could do this as a first step. See the proposal in email [1] where
> > > > we have discussed having two options instead of one. The first option
> > > > will be conflict_log_format and the values would be log and table. In
> > > > this case, the table would be an internally generated one.
> > > >
> > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com
> > >
> > > So I have put more thought on this and here is what I am proposing
> > >
> > > 1) Subscription Parameter: Son in first version the subscription
> > > parameter will be named 'conflict_log_format' which will accept
> > > 'log/table/both' default option would be log.
> > > 2) If conflict_log_format = log is provided then we do not need to do
> > > anything as this would work by default
> > > 3) If conflict_log_format = table/both is provided then we will
> > > generate a internal table name i.e. conflict_log_table_$subid$ and the
> > > table will be created in the current schema
> > > 4) in pg_subscription we will still keep 2 field a) namespace id of
> > > the conflict log table b) the conflict log format = 'log/table'both'
> > > 5) If option is table/both the name can be generated on the fly
> > > whether we are creating the table or inserting conflict into the
> > > table.
> >
> > IIUC, previously you had a "none" value which was a way to "turn off"
> > any CLT previously defined. How can users do that now with
> > log/table/both? Would they have to reassign (the default) "log"? That
> > seems a bit strange.
>
> Previously we were supporting only conflict log tables and by default
> it was always sent to log.  And "none" was used for clearing the
> conflict log table option; it was never meant for not logging anywhere
> it was meant to say that there is no conflict log table.  Now also we
> can have another option as none but I intentionally avoided it
> considering we want to support the case where we don't want to log it
> at all, maybe that's not a bad idea either.  Let's see what others
> think about it.
>

I didn't mean to suggest we should allow "not logging anywhere". I
only wanted to ask how the user is expected to revert the conflict
logging back to the default after they had set it to something else.

e.g.

CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)
Now, how to ALTER SUBSCRIPTION to revert that back to default?

It seems there is no "reset to default" so is the user required to do
this explicitly?
ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);

Maybe that's fine --- I was just looking for some examples/clarification.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T06:14:05Z

On Fri, Dec 19, 2025 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> I didn't mean to suggest we should allow "not logging anywhere". I
> only wanted to ask how the user is expected to revert the conflict
> logging back to the default after they had set it to something else.

Okay understood, thanks for the clarification.

> e.g.
>
> CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)
> Now, how to ALTER SUBSCRIPTION to revert that back to default?
>
> It seems there is no "reset to default" so is the user required to do
> this explicitly?
> ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);
>
> Maybe that's fine --- I was just looking for some examples/clarification.

Yeah this is the way, IMHO it looks fine to me.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T06:19:35Z

On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> > 2. Do we want to support multi destination then providing string like
> > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > would have to store as a string in catalog and parse it everytime we
> > insert conflicts or alter subscription OTOH currently I have just
> > support single option log/table/both which make things much easy
> > because then in catalog we can store as a single char field and don't
> > need any parsing.  And since the input are taken as a string itself,
> > even if in future we want to support more options like  'log,table,..'
> > it would be backward compatible with old options.
>
> I feel, combination of options might be a good idea, similar to how
> 'log_destination' provides. But it can be done in future versions and
> the first draft can be a simple one.
>

Considering the future extension of storing conflict information in
multiple places, it would be good to follow log_destination. Yes, it
is more work now but I feel that will be future-proof.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T06:22:12Z

On Fri, Dec 19, 2025 at 11:44 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > I didn't mean to suggest we should allow "not logging anywhere". I
> > only wanted to ask how the user is expected to revert the conflict
> > logging back to the default after they had set it to something else.
>
> Okay understood, thanks for the clarification.
>
> > e.g.
> >
> > CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)
> > Now, how to ALTER SUBSCRIPTION to revert that back to default?
> >
> > It seems there is no "reset to default" so is the user required to do
> > this explicitly?
> > ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);
> >
> > Maybe that's fine --- I was just looking for some examples/clarification.
>
> Yeah this is the way, IMHO it looks fine to me.
>

How about considering log as the default, so that even if the user
resets it via "ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination='');",
we send it to LOG as we are doing currently in HEAD? This means
conflict_log_destination='' and conflict_log_destination='log' mean
the same.
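
A minimal sketch of that normalization, assuming the option value arrives as a C string; the function name is hypothetical and not from the patch.

```c
#include <string.h>

/*
 * Hypothetical helper: treat an empty (reset) value the same as "log",
 * so that '' and 'log' behave identically.
 */
static const char *
effective_conflict_log_destination(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return "log";
    return value;
}
```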

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T06:24:33Z

On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:

> > 1. What should be the name of the option 'conflict_log_destination' vs
> > 'conflict_log_format'
>
> I prefer conflcit_log_destination.
>
> > 2. Do we want to support multi destination then providing string like
> > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > would have to store as a string in catalog and parse it everytime we
> > insert conflicts or alter subscription OTOH currently I have just
> > support single option log/table/both which make things much easy
> > because then in catalog we can store as a single char field and don't
> > need any parsing.  And since the input are taken as a string itself,
> > even if in future we want to support more options like  'log,table,..'
> > it would be backward compatible with old options.
>
> I feel, combination of options might be a good idea, similar to how
> 'log_destination' provides. But it can be done in future versions and
> the first draft can be a simple one.
>
> > 3. Do we want to support 'none' destinations? i.e. do not log to anywhere?
>
> IMO, conflict information is an important piece of information to
> diagnose data divergence and thus should be logged always.
>
> Let's wait for others' opinions.

Thanks Shveta for your opinion.

Here is what I propose, balancing simplicity against future
extensibility:

1. Retain 'conflict_log_destination' as the option name.
2. Currently supported values are 'log', 'table', or 'all' (which
directs output to both locations).  We will not support
comma-separated values in the first version.
3. By treating this as a string, we can eventually support
comma-separated values like 'log, table, new_option'.  This approach
keeps the design simple by avoiding the immediate need to parse
comma-separated options while ensuring extensibility.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-19T08:27:24Z

On Thu, Dec 18, 2025 at 10:24 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> 1. Retain 'conflict_log_destination' as the option name.
> 2. Current supported values include 'log', 'table', or 'all' (which
> directs output to both locations).  But we will not support comma
> separated values in the first version.

If users set conflict_log_destination='table', do we report nothing
about the conflict to the server logs, while all other errors
generated by apply workers still go there? Or do we write ERRORs
without the conflict details to the server logs while writing the full
conflict details to the table? If we go with the former, monitoring
tools would not be able to catch the ERROR logs. Users can set
conflict_log_destination='all' in this case, but they might want to
avoid bloating the server logs with the detailed conflict information.
I wonder if there might be cases where monitoring tools want to detect
at least the fact that errors occurred in the system.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-20T09:47:11Z

On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote:
>
> On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > I was considering the interdependence between the subscription and the
> > > > > conflict log table (CLT). IMHO, it would be logical to establish the
> > > > > subscription as dependent on the CLT. This way, if someone attempts to
> > > > > drop the CLT, the system would recognize the dependency of the
> > > > > subscription and prevent the drop unless the subscription is removed
> > > > > first or the CASCADE option is used.
> > > > >
> > > > > However, while investigating this, I encountered an error [1] stating
> > > > > that global objects are not supported in this context. This indicates
> > > > > that global objects cannot be made dependent on local objects.
> > > > >
> > > >
> > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> > > > objects. For example, consider following case:
> > > > postgres=# create table t1(c1 int primary key);
> > > > CREATE TABLE
> > > > postgres=# \d+ t1
> > > >                                            Table "public.t1"
> > > >  Column |  Type   | Collation | Nullable | Default | Storage |
> > > > Compression | Stats target | Description
> > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
> > > >  c1     | integer |           | not null |         | plain   |
> > > >     |              |
> > > > Indexes:
> > > >     "t1_pkey" PRIMARY KEY, btree (c1)
> > > > Publications:
> > > >     "pub1"
> > > > Not-null constraints:
> > > >     "t1_c1_not_null" NOT NULL "c1"
> > > > Access method: heap
> > > > postgres=# drop index t1_pkey;
> > > > ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> > > > t1 requires it
> > > > HINT:  You can drop constraint t1_pkey on table t1 instead.
> > > >
> > > > Here, the PK index is created as part for CREATE TABLE operation and
> > > > pk_index is not allowed to be dropped independently.
> > > >
> > > > > Although making an object dependent on global/shared objects is
> > > > > possible for certain types of shared objects [2], this is not our main
> > > > > objective.
> > > > >
> > > >
> > > > As per my understanding from the above example, we need something like
> > > > that only for shared object subscription and (internally created)
> > > > table.
> > >
> > > Yeah that seems to be exactly what we want, so I tried doing that by
> > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and
> > > it is behaving as we want[2].  And while dropping the subscription or
> > > altering CLT we can delete internal dependency so that CLT get dropped
> > > automatically[3]
> > >
> > > I will send an updated patch after testing a few more scenarios and
> > > fixing other pending issues.
> > >
> > > [1]
> > > +       ObjectAddressSet(myself, RelationRelationId, relid);
> > > +       ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
> > > +       recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);
> > >
> > >
> > > [2]
> > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
> > > ERROR:  2BP01: cannot drop table myschema.conflict_log_history2
> > > because subscription sub requires it
> > > HINT:  You can drop subscription sub instead.
> > > LOCATION:  findDependentObjects, dependency.c:788
> > > postgres[670778]=#
> > >
> > > [3]
> > > ObjectAddressSet(object, SubscriptionRelationId, subid);
> > > performDeletion(&object, DROP_CASCADE
> > >                            PERFORM_DELETION_INTERNAL |
> > >                            PERFORM_DELETION_SKIP_ORIGINAL);
> > >
> > >
> >
> > Here is the patch which implements the dependency and fixes other
> > comments from Shveta.
>
> Thanks for the changes, the new implementation based on dependency
> creates a cycle while dumping:
> ./pg_dump -d postgres -f dump1.txt -p 5433
> pg_dump: warning: could not resolve dependency loop among these items:
> pg_dump: detail: TABLE conflict  (ID 225 OID 16397)
> pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396)
> pg_dump: detail: POST-DATA BOUNDARY  (ID 3491)
> pg_dump: detail: TABLE DATA t1  (ID 3485 OID 16384)
> pg_dump: detail: PRE-DATA BOUNDARY  (ID 3490)
>
> This can be seen with a simple subscription with conflict_log_table.
> This was working fine with the v11 version patch.

The attached v13 patch includes the fix for this issue. In addition,
it now raises an error when attempting to configure a conflict log
table that belongs to a temporary schema or is not a permanent
(persistent) relation.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-20T11:20:53Z

On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote:
>
> The attached v13 patch includes the fix for this issue. In addition,
> it now raises an error when attempting to configure a conflict log
> table that belongs to a temporary schema or is not a permanent
> (persistent) relation.

I have updated the patch. Here are the changes:
1. Split into 2 patches: 0001 for the catalog-related changes and
0002 for inserting conflicts into the conflict table. Vignesh needs to
rebase the dump and upgrade related patch on these latest changes.
2. Subscription option changed to conflict_log_destination=(log/table/all/'')
3. For internal processing we will use the ConflictLogDest enum, whereas
for taking input or storing into the catalog we will use a string [1].
4. As suggested by Sawada-san, if conflict_log_destination is 'table'
we log the information about the conflict but don't log the tuple
details [3].

Pending:
1. TAP test for conflict insertion.
2. Still need to work on the caching-related changes discussed at [2].
Currently we don't allow conflict log tables to be added to a
publication at all; we might change this behavior as discussed at [2],
and for that we will need to implement the caching.
3. Need to add a conflict insertion test and doc changes.
4. Still need to check the latest comments from Peter Smith.


[1]
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_INVALID = 0,
    CONFLICT_LOG_DEST_LOG,      /* "log" (default) */
    CONFLICT_LOG_DEST_TABLE,    /* "table" */
    CONFLICT_LOG_DEST_ALL       /* "all" */
} ConflictLogDest;

/*
 * Array mapping for converting internal enum to string.
 */
static const char *const ConflictLogDestLabels[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};
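
Because the array uses designated initializers starting at
CONFLICT_LOG_DEST_LOG, index 0 (CONFLICT_LOG_DEST_INVALID) holds a NULL
pointer. The following is a minimal sketch of lookups that guard
against that. The helper names conflict_log_dest_label() and
conflict_log_dest_from_label() are hypothetical and are not part of the
patch.

```c
#include <stddef.h>
#include <string.h>

typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_INVALID = 0,
    CONFLICT_LOG_DEST_LOG,      /* "log" (default) */
    CONFLICT_LOG_DEST_TABLE,    /* "table" */
    CONFLICT_LOG_DEST_ALL       /* "all" */
} ConflictLogDest;

/* Index 0 is intentionally left without a label, so it is NULL. */
static const char *const ConflictLogDestLabels[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

/* Hypothetical helper: enum to label, NULL for INVALID or out of range. */
static const char *
conflict_log_dest_label(ConflictLogDest dest)
{
    if ((int) dest < CONFLICT_LOG_DEST_LOG ||
        (int) dest > CONFLICT_LOG_DEST_ALL)
        return NULL;
    return ConflictLogDestLabels[dest];
}

/* Hypothetical helper: label to enum, INVALID when nothing matches. */
static ConflictLogDest
conflict_log_dest_from_label(const char *label)
{
    int         i;

    if (label == NULL)
        return CONFLICT_LOG_DEST_INVALID;

    /* Start at LOG so the NULL slot at index 0 is never dereferenced. */
    for (i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++)
    {
        if (strcmp(label, ConflictLogDestLabels[i]) == 0)
            return (ConflictLogDest) i;
    }
    return CONFLICT_LOG_DEST_INVALID;
}
```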

[2] https://www.postgresql.org/message-id/CAA4eK1LNjWigHb5YKz2nBwcGQr18WnNZHv3Gyo8GNCshSkAb-A%40mail.gmail.com

[3]
/* Decide what detail to show in server logs. */
if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
{
    /* Standard reporting with full internal details. */
    ereport(elevel,
            errcode_apply_conflict(type),
            errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
                   get_namespace_name(RelationGetNamespace(localrel)),
                   RelationGetRelationName(localrel),
                   ConflictTypeNames[type]),
            errdetail_internal("%s", err_detail.data));
}
else
{
    /*
     * 'table' only: Report the error msg but omit raw tuple data from
     * server logs since it's already captured in the internal table.
     */
    ereport(elevel,
            errcode_apply_conflict(type),
            errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
                   get_namespace_name(RelationGetNamespace(localrel)),
                   RelationGetRelationName(localrel),
                   ConflictTypeNames[type]),
            errdetail("Conflict details logged to internal table with OID %u.",
                      MySubscription->conflictrelid));
}

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-21T15:46:33Z

On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]
>
> Pending:
> 2. Still need to work on caching related changes discussed at [2], so
> currently we don't allow conflict log tables to be added to
> publication at all and might change this behavior as discussed at [2]
> and for that we will need to implement the caching.

This point is addressed in the attached patch. A new shared index on
pg_subscription (subconflictlogrelid) is introduced and used to
efficiently determine whether a relation is a conflict log table,
avoiding full catalog scans. Additionally, a conflict log table can be
explicitly added to a TABLE publication and will be published when
specified directly. At the same time, such relations are excluded from
implicit publication paths (FOR ALL TABLES and schema publications).
The patch also exposes pg_relation_is_conflict_log_table() as a
SQL-visible helper, which is used by psql \d+ to filter out conflict
log tables from implicit publication listings. This avoids querying
pg_subscription directly, which is generally inaccessible to
non-superusers.
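
The publication rule described above can be summarized in a small
sketch. The enum PublicationPath and the function
relation_is_published() are hypothetical names used only to restate the
rule. They do not reflect how the patch structures its checks.

```c
#include <stdbool.h>

/* Hypothetical: the way a relation ends up in a publication. */
typedef enum PublicationPath
{
    PUB_PATH_EXPLICIT_TABLE,    /* listed by name, FOR TABLE */
    PUB_PATH_ALL_TABLES,        /* implicit, FOR ALL TABLES */
    PUB_PATH_SCHEMA             /* implicit, schema publication */
} PublicationPath;

/*
 * Ordinary tables are published through any path.  A conflict log table
 * is published only when it was added to the publication explicitly.
 */
static bool
relation_is_published(bool is_conflict_log_table, PublicationPath path)
{
    if (!is_conflict_log_table)
        return true;

    return path == PUB_PATH_EXPLICIT_TABLE;
}
```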

These changes are included in v14-003. There are no changes in v14-001
and v14-002; those versions are identical to the patch previously
shared by Dilip at [1].

[1] - https://www.postgresql.org/message-id/CAFiTN-sNg9ghLNkB2Kn0SwBGOub9acc99XZZU_d5NAcyW-yrEg%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-22T06:48:32Z

On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]
>
> Pending:
> 1. tap test for conflict insertion
> 2. Still need to work on caching related changes discussed at [2], so
> currently we don't allow conflict log tables to be added to
> publication at all and might change this behavior as discussed at [2]
> and for that we will need to implement the caching.
> 3. Need to add conflict insertion test and doc changes.
> 4. Still need to check on the latest comments from Peter Smith.
>
>
> [1]
> typedef enum ConflictLogDest
> {
> CONFLICT_LOG_DEST_INVALID = 0,
> CONFLICT_LOG_DEST_LOG, /* "log" (default) */
> CONFLICT_LOG_DEST_TABLE, /* "table" */
> CONFLICT_LOG_DEST_ALL /* "all" */
> } ConflictLogDest;

Consider the following scenario. Initially, the subscription was
configured with conflict_log_destination set to a table. As conflicts
occurred, entries were generated and recorded in that table, for
example:
postgres=# SELECT * FROM conflict_log_table_16399;
 relid | schemaname | relname | conflict_type | remote_xid | remote_commit_lsn |         remote_commit_ts         | remote_origin | replica_identity | remote_tuple |                 local_conflicts
-------+------------+---------+---------------+------------+-------------------+----------------------------------+---------------+------------------+--------------+--------------------------------------------------------------------------------------------------
 16384 | public     | t1      | insert_exists |        765 | 0/0178A718        | 2025-12-22 12:06:57.417789+05:30 | pg_16399      |                  | {"c1":1}     | {"{\"xid\":\"781\",\"commit_ts\":null,\"origin\":null,\"key\":{\"c1\":1},\"tuple\":{\"c1\":1}}"}
 16384 | public     | t1      | insert_exists |        765 | 0/0178A718        | 2025-12-22 12:06:57.417789+05:30 | pg_16399      |                  | {"c1":1}     | {"{\"xid\":\"781\",\"commit_ts\":null,\"origin\":null,\"key\":{\"c1\":1},\"tuple\":{\"c1\":1}}"}
(2 rows)

Subsequently, the conflict log destination was changed from table to log:
ALTER SUBSCRIPTION sub1 SET (conflict_log_destination = 'log');

As a result, the conflict log table is dropped, and there is no longer
any way to access the previously recorded conflict entries. This
effectively causes the loss of historical conflict data.

It is unclear whether this behavior is desirable or expected. Should
we consider a way to preserve the historical conflict data in this
case?

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-22T09:39:38Z

On Sat, Dec 20, 2025 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done

Thank you for the patch. A few comments on 001 alone:

1)
postgres=# create subscription sub1 connection ...' publication pub1
WITH(conflict_log_destination = 'table');
ERROR:  could not generate conflict log table "conflict_log_table_16395"
DETAIL:  Conflict log tables cannot be created in a temporary namespace.
HINT:  Ensure your 'search_path' is set to permanent schema.

Based on such existing errors:
errmsg("cannot create relations in temporary schemas of other sessions")));
errmsg("cannot create temporary relation in non-temporary schema")));
errmsg("cannot create relations in temporary schemas of other sessions")));

Shall we tweak:
--temporary namespace --> temporary schema
--permanent --> non-temporary

2)
postgres=# drop schema shveta cascade;
NOTICE:  drop cascades to subscription sub1
ERROR:  global objects cannot be deleted by doDeletion

Is this expected? Is the user supposed to see this error?

3)
The ConflictLogDest enum starts from 0 (INVALID), while the mapping
array ConflictLogDestLabels has values starting from index 1. Index 0
has no value. Thus IMO, wherever we access ConflictLogDestLabels, we
should add a sanity check that the index accessed is not
CONFLICT_LOG_DEST_INVALID, i.e. opts.logdest !=
CONFLICT_LOG_DEST_INVALID.

4)
I find 'Labels' in ConflictLogDestLabels slightly odd. There could be
other names for this variable, such as ConflictLogDestValues,
ConflictLogDestStrings or ConflictLogDestNames.

See similar: ConflictTypeNames, SlotInvalidationCauses

5)
+ /*
+ * Strategy for logging replication conflicts:
+ * log - server log only,
+ * table - internal table only,
+ * all - both log and table.
+ */
+ text sublogdestination;

sublogdestination can be confused with the regular log_destination.
Shall we rename it to subconflictlogdest?

6)
Should the \dRs+ command display the 'Conflict Log Table:' at the end?
This would be similar to how \dRp+ shows 'Tables:', even though the
relation IDs can already be obtained from pg_publication_rel. I think
this would be a useful improvement.

7)
One observation, not sure if it needs any fix, please review and share thoughts.

--CLT created in default public schema present in search_path
create subscription sub1 connection '..' publication pub1
WITH(conflict_log_destination = 'table');

--Change search path
create schema sch1;
SET search_path=sch1, "$user";

After this, if I create a new sub with destination as 'table', CLT is
generated in sch1. But if I do below:
alter subscription sub1 set (conflict_log_destination='table');

It does not move the table to sch1. This is because
conflict_log_destination is not changed; and as per current
implementation, alter-sub becomes no-op. But search_path is changed.
So what should be the behaviour here?

--let the table be in the old schema, which is currently not in
search_path (existing behaviour)?
--drop the table in the old schema and create a new one present in
search_path?

I could not find a similar case in postgres to compare the behaviour.

If we do
alter subscription sub1 set (conflict_log_destination='log');
alter subscription sub1 set (conflict_log_destination='table');

Then it moves the table to a new schema as internally setting
destination to 'log' drops the table.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-22T10:24:57Z

On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]

Few comments:
1) when a conflict_log_destination is specified as log:
create subscription sub1 connection 'dbname=postgres host=localhost
port=5432' publication pub1 with ( conflict_log_destination='log');
postgres=# select subname, subconflictlogrelid,sublogdestination from
pg_subscription where subname = 'sub4';
 subname | subconflictlogrelid | sublogdestination
---------+---------------------+-------------------
 sub4    |                   0 | log
(1 row)

Currently it displays as 0; we could show NULL in this case instead.

2) Can we also display the conflict log table in describe
subscriptions:
+               /* Conflict log destination is supported in v19 and higher */
+               if (pset.sversion >= 190000)
+               {
+                       appendPQExpBuffer(&buf,
+                                         ", sublogdestination AS \"%s\"\n",
+                                         gettext_noop("Conflict log destination"));
+               }

3) Can we include pg_ in the conflict table to indicate it is an
internally created table:
+/*
+ * Format the standardized internal conflict log table name for a subscription
+ *
+ * Use the OID to prevent collisions during rename operations.
+ */
+void
+GetConflictLogTableName(char *dest, Oid subid)
+{
+       snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid);
+}
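
For comparison, a minimal standalone sketch of the suggested naming with
a pg_ prefix. The function name GetConflictLogTableNameWithPrefix() and
the exact prefix are hypothetical. NAMEDATALEN and Oid are redefined
here only so the snippet compiles outside the PostgreSQL tree (64 is the
default build value of NAMEDATALEN).

```c
#include <stdio.h>

/* Stand-ins for the PostgreSQL definitions, for illustration only. */
#define NAMEDATALEN 64
typedef unsigned int Oid;

/*
 * Hypothetical variant of GetConflictLogTableName() that adds a pg_
 * prefix to mark the table as internally created.  The caller must pass
 * a buffer of at least NAMEDATALEN bytes; snprintf truncates if needed.
 */
static void
GetConflictLogTableNameWithPrefix(char *dest, Oid subid)
{
    snprintf(dest, NAMEDATALEN, "pg_conflict_log_table_%u", subid);
}
```

For a subscription with OID 16399 this yields
"pg_conflict_log_table_16399".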

4) Can the table be deleted now with the dependency associated between
the table and the subscription?
+       conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
+
+       /* Conflict log table is dropped or not accessible. */
+       if (conflictlogrel == NULL)
+               ereport(WARNING,
+                               (errcode(ERRCODE_UNDEFINED_TABLE),
+                                errmsg("conflict log table with OID %u does not exist",
+                                               conflictlogrelid)));
+
+       return conflictlogrel;

5) Should this code just prepare the conflict log tuple here? When
elevel >= ERROR, validation and insertion can happen in start_apply,
which avoids calling ValidateConflictLogTable both here and in
start_apply for the same conflict:
+               if (ValidateConflictLogTable(conflictlogrel))
+               {
+                       /*
+                        * Prepare the conflict log tuple. If the error level is below
+                        * ERROR, insert it immediately. Otherwise, defer the insertion to
+                        * a new transaction after the current one aborts, ensuring the
+                        * insertion of the log tuple is not rolled back.
+                        */
+                       prepare_conflict_log_tuple(estate,
+                                                  relinfo->ri_RelationDesc,
+                                                  conflictlogrel,
+                                                  type,
+                                                  searchslot,
+                                                  conflicttuples,
+                                                  remoteslot);
+                       if (elevel < ERROR)
+                               InsertConflictLogTuple(conflictlogrel);
+               }
+               else
+                       ereport(WARNING,
+                               errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                               errmsg("conflict log table \"%s.%s\" structure changed, skipping insertion",
+                                      get_namespace_name(RelationGetNamespace(conflictlogrel)),
+                                      RelationGetRelationName(conflictlogrel)));

to:
prepare_conflict_log_tuple(estate,
                           relinfo->ri_RelationDesc,
                           conflictlogrel,
                           type,
                           searchslot,
                           conflicttuples,
                           remoteslot);
if (elevel < ERROR)
{
        if (ValidateConflictLogTable(conflictlogrel))
                InsertConflictLogTuple(conflictlogrel);
        else
                ereport(WARNING,
                        errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                        errmsg("conflict log table \"%s.%s\" structure changed, skipping insertion",
                               get_namespace_name(RelationGetNamespace(conflictlogrel)),
                               RelationGetRelationName(conflictlogrel)));
}

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-22T10:31:11Z

On Sat, Dec 20, 2025 at 4:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote:

> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]
>
> Pending:
> 1. tap test for conflict insertion

Done in V15
> 2. Still need to work on caching related changes discussed at [2], so
> currently we don't allow conflict log tables to be added to
> publication at all and might change this behavior as discussed at [2]
> and for that we will need to implement the caching.

Pending

> 3. Need to add conflict insertion test and doc changes.

Done

> 4. Still need to check on the latest comments from Peter Smith.

Done

While preparing to send the patch, I noticed some recent comments
from Shveta and Vignesh; I will analyze those in the next version.

V15-0004 is Vignesh's patch, attached as is. I am going to review it
soon.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-22T10:34:05Z

On Mon, Dec 22, 2025 at 3:55 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Few comments:
> 1) when a conflict_log_destination is specified as log:
> create subscription sub1 connection 'dbname=postgres host=localhost
> port=5432' publication pub1 with ( conflict_log_destination='log');
> postgres=# select subname, subconflictlogrelid,sublogdestination from
> pg_subscription where subname = 'sub4';
>  subname | subconflictlogrelid | sublogdestination
> ---------+---------------------+-------------------
>  sub4    |                   0 | log
> (1 row)
>
> Currently it displays as 0, instead we can show as NULL in this case

I also thought about it while reviewing, but I feel 0 makes more sense
as it is 'relid'. This is how it is shown currently in other tables.
See 'reltoastrelid':

postgres=# select relname, reltoastrelid from  pg_class where relname='tab1';
 relname | reltoastrelid
---------+---------------
 tab1    |             0
(1 row)


>
> 3) Can we include pg_ in the conflict table to indicate it is an
> internally created table:
> +/*
> + * Format the standardized internal conflict log table name for a subscription
> + *
> + * Use the OID to prevent collisions during rename operations.
> + */
> +void
> +GetConflictLogTableName(char *dest, Oid subid)
> +{
> +       snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid);
> +}
>

There is already a discussion about it in [1]

[1]: https://www.postgresql.org/message-id/CAA4eK1KE%3DtNHcN3Qp0FZVwDnt4rF2zwHy8NgAdG3oPqixdzOsA%40mail.gmail.com

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-22T15:41:03Z

On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:

I think this needs more thought, others can be fixed.

> 2)
> postgres=# drop schema shveta cascade;
> NOTICE:  drop cascades to subscription sub1
> ERROR:  global objects cannot be deleted by doDeletion
>
> Is this expected? Is the user supposed to see this error?
>
See the code below. It says that if the object being dropped is the
outermost object (i.e. we are dropping the table directly), the drop
is disallowed when that object has an INTERNAL dependency on another
object. OTOH, if the object is being dropped via a recursive drop
(i.e. the table is dropped while dropping the schema), then the object
on which it has the INTERNAL dependency is also added to the deletion
list and is later dropped via doDeletion. That is where we get the
error, because a subscription is a global object.

I thought maybe we could handle an additional case: if the INTERNAL
dependency is on a subscription, disallow the drop irrespective of
whether it is requested directly or via a recursive drop. But that
would also cause a problem when we drop the table during subscription
drop. We could handle that case as well via the 'flags' passed to
findDependentObjects(), but it needs more investigation.

Seeing this complexity makes me wonder whether it is really worth
maintaining this dependency. During subscription drop we anyway have
to call performDeletion explicitly because this dependency is local,
so all we gain is disallowing the conflict table drop. However, ALTER
TABLE is still allowed, so what are we really protecting by
preventing the table drop? I think we could just document that if the
user drops the table, conflicts will no longer be inserted.

findDependentObjects()
{
...
     switch (foundDep->deptype)
     {
         ....
         case DEPENDENCY_INTERNAL:
            * 1. At the outermost recursion level, we must disallow the
            * DROP. However, if the owning object is listed in
            * pendingObjects, just release the caller's lock and return;
            * we'll eventually complete the DROP when we reach that entry
            * in the pending list.
     }
}

[1]
postgres[1333899]=# select * from pg_depend where objid > 16410;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16420 |        0 |       2615 |    16410 |           0 | n
    1259 | 16420 |        0 |       6100 |    16419 |           0 | i
(4 rows)

16420 -> conflict_log_table_16419
16419 -> subscription
16410 -> schema s1



-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-23T05:25:08Z

On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I think this needs more thought, others can be fixed.
>
> > 2)
> > postgres=# drop schema shveta cascade;
> > NOTICE:  drop cascades to subscription sub1
> > ERROR:  global objects cannot be deleted by doDeletion
> >
> > Is this expected? Is the user supposed to see this error?
> >
> See below code, so this says if the object being dropped is the
> outermost object (i.e. if we are dropping the table directly) then it
> will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> OTOH if the object is being dropped via recursive drop (i.e. the table
> is being dropped while dropping the schema) then object on which it
> has INTERNAL dependency will also be added to the deletion list and
> later will be dropped via doDeletion and later we are getting error as
> subscription is a global object.  I thought maybe we can handle an
> additional case that the INTERNAL DEPENDENCY, is on subscription the
> disallow dropping it irrespective of whether it is being called
> directly or via recursive drop but then it will give an issue even
> when we are trying to drop table during subscription drop, we can make
> handle this case as well via 'flags' passed in findDependentObjects()
> but need more investigation.
>
> Seeing this complexity makes me think more on is it really worth it to
> maintain this dependency?  Because during subscription drop we anyway
> have to call performDeletion externally because this dependency is
> local so we are just disallowing the conflict table drop, however the
> ALTER table is allowed so what we are really protecting by protecting
> the table drop, I think it can be just documented that if user try to
> drop the table then conflict will not be inserted anymore?
>
> findDependentObjects()
> {
> ...
>      switch (foundDep->deptype)
>      {
>          ....
>          case DEPENDENCY_INTERNAL:
>             * 1. At the outermost recursion level, we must disallow the
>             * DROP. However, if the owning object is listed in
>             * pendingObjects, just release the caller's lock and return;
>             * we'll eventually complete the DROP when we reach that entry
>             * in the pending list.
>      }
> }
>
> [1]
> postgres[1333899]=# select * from pg_depend where objid > 16410;
>  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> ---------+-------+----------+------------+----------+-------------+---------
>     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
>     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> (4 rows)
>
> 16420 -> conflict_log_table_16419
> 16419 -> subscription
> 16410 -> schema s1
>

One approach could be to use something similar to
PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
proceed without error, i.e., it would drop the tables as well without
including the subscription in the dependency list. But if we try to
drop a table directly (e.g., DROP TABLE CLT), it will still result in:
ERROR: cannot drop table because subscription sub1 requires it

The behavior will resemble a dependency somewhere between type 'n' and
type 'i'. That said, I’m not sure this is worth the effort: even
though it prevents a direct drop of the table, it still does not
prevent the table from being dropped as part of a schema drop.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-23T06:11:24Z

On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> One approach could be to use something similar to
> PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> proceed without error, i.e., it would drop the tables as well without
> including the subscription in the dependency list. But if we try to
> drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> ERROR: cannot drop table because subscription sub1 requires it
>
> The behavior will resemble a dependency somewhere between type 'n' and
> type 'i'. That said, I’m not sure if this is worth the effort, even
> though it prevents direct drop of table, it still does not prevent
> table from being dropped as part of a schema drop.

Yeah, but that would be inconsistent behavior. Anyway, here is what I
got with what I was proposing yesterday [1]. Basically, DROP SCHEMA
and DROP TABLE now behave the same way, as expected, and DROP
SUBSCRIPTION internally drops the table as we would want. This needs
more thought, though, to see what else it might break.

postgres[1553010]=# CREATE SCHEMA s1;
postgres[1553010]=# SET search_path TO s1;
postgres[1553010]=# CREATE SUBSCRIPTION sub1 CONNECTION
'dbname=postgres port=5432' PUBLICATION pub WITH
(conflict_log_destination = table);
postgres[1553010]=# \d
                    List of relations
 Schema |           Name           | Type  |    Owner
--------+--------------------------+-------+-------------
 s1     | conflict_log_table_16428 | table | dilipkumarb
(1 row)

postgres[1553010]=# DROP SCHEMA s1;
ERROR:  2BP01: cannot drop table conflict_log_table_16428 because
subscription sub1 requires it
HINT:  You can drop subscription sub1 instead.
LOCATION:  findDependentObjects, dependency.c:843

postgres[1553010]=# DROP TABLE conflict_log_table_16428 ;
ERROR:  2BP01: cannot drop table conflict_log_table_16428 because
subscription sub1 requires it
HINT:  You can drop subscription sub1 instead.
LOCATION:  findDependentObjects, dependency.c:843

postgres[1553010]=# DROP SUBSCRIPTION sub1;
NOTICE:  00000: dropped replication slot
"pg_16428_sync_16385_7586930395971240479" on publisher
LOCATION:  ReplicationSlotDropAtPubNode, subscriptioncmds.c:2469
NOTICE:  00000: dropped replication slot "sub1" on publisher
LOCATION:  ReplicationSlotDropAtPubNode, subscriptioncmds.c:2469
DROP SUBSCRIPTION

[1]
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 7489bbd5fb3..14184d076d3 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -662,6 +662,11 @@ findDependentObjects(const ObjectAddress *object,
                                 * However, no inconsistency can result: since we're at outer
                                 * level, there is no object depending on this one.
                                 */
+                               if (IsSharedRelation(otherObject.classId) && !(flags & PERFORM_DELETION_INTERNAL))
+                               {
+                                       owningObject = otherObject;
+                                       break;
+                               }
                                if (stack == NULL)
                                {
                                        if (pendingObjects &&
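
To make the intended behavior easier to compare against the session
output above, here is a simplified decision model of the INTERNAL
dependency case with the extra check applied. It is only a sketch and
not the dependency.c code: internal_dep_outcome(), the outcome enum and
SKETCH_DELETION_INTERNAL are hypothetical names, the flag value is
illustrative, and the pendingObjects handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for PERFORM_DELETION_INTERNAL; the value is illustrative. */
#define SKETCH_DELETION_INTERNAL 0x0001

typedef enum InternalDepOutcome
{
    DEP_REFUSE_DROP,        /* "cannot drop X because Y requires it" */
    DEP_DROP_WITH_OWNER     /* the drop proceeds together with the owner */
} InternalDepOutcome;

/*
 * Decide what happens to an object that has an INTERNAL dependency on
 * an owning object.
 *
 * owner_is_shared: the owning object lives in a shared catalog
 *                  (e.g. a subscription)
 * flags:           deletion flags; only SKETCH_DELETION_INTERNAL is modeled
 * outermost:       the object was named directly in the DROP command
 */
static InternalDepOutcome
internal_dep_outcome(bool owner_is_shared, int flags, bool outermost)
{
    /* This sketch models a single flag only. */
    assert((flags & ~SKETCH_DELETION_INTERNAL) == 0);

    /* Proposed extra check: user-initiated drops never reach a shared owner. */
    if (owner_is_shared && !(flags & SKETCH_DELETION_INTERNAL))
        return DEP_REFUSE_DROP;

    /* Existing rule: a direct drop of an internally owned object is refused. */
    if (outermost)
        return DEP_REFUSE_DROP;

    /* Recursive drop: the object goes away along with its owner. */
    return DEP_DROP_WITH_OWNER;
}
```

In this model DROP TABLE and DROP SCHEMA both end in DEP_REFUSE_DROP
for the conflict log table, while the internal deletion done by DROP
SUBSCRIPTION ends in DEP_DROP_WITH_OWNER.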



-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-23T06:15:48Z

On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > >
> > > > > > > I was considering the interdependence between the subscription and the
> > > > > > > conflict log table (CLT). IMHO, it would be logical to establish the
> > > > > > > subscription as dependent on the CLT. This way, if someone attempts to
> > > > > > > drop the CLT, the system would recognize the dependency of the
> > > > > > > subscription and prevent the drop unless the subscription is removed
> > > > > > > first or the CASCADE option is used.
> > > > > > >
> > > > > > > However, while investigating this, I encountered an error [1] stating
> > > > > > > that global objects are not supported in this context. This indicates
> > > > > > > that global objects cannot be made dependent on local objects.
> > > > > > >
> > > > > >
> > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database
> > > > > > objects. For example, consider following case:
> > > > > > postgres=# create table t1(c1 int primary key);
> > > > > > CREATE TABLE
> > > > > > postgres=# \d+ t1
> > > > > >                                            Table "public.t1"
> > > > > >  Column |  Type   | Collation | Nullable | Default | Storage |
> > > > > > Compression | Stats target | Description
> > > > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
> > > > > >  c1     | integer |           | not null |         | plain   |
> > > > > >     |              |
> > > > > > Indexes:
> > > > > >     "t1_pkey" PRIMARY KEY, btree (c1)
> > > > > > Publications:
> > > > > >     "pub1"
> > > > > > Not-null constraints:
> > > > > >     "t1_c1_not_null" NOT NULL "c1"
> > > > > > Access method: heap
> > > > > > postgres=# drop index t1_pkey;
> > > > > > ERROR:  cannot drop index t1_pkey because constraint t1_pkey on table
> > > > > > t1 requires it
> > > > > > HINT:  You can drop constraint t1_pkey on table t1 instead.
> > > > > >
> > > > > > Here, the PK index is created as part for CREATE TABLE operation and
> > > > > > pk_index is not allowed to be dropped independently.
> > > > > >
> > > > > > > Although making an object dependent on global/shared objects is
> > > > > > > possible for certain types of shared objects [2], this is not our main
> > > > > > > objective.
> > > > > > >
> > > > > >
> > > > > > As per my understanding from the above example, we need something like
> > > > > > that only for shared object subscription and (internally created)
> > > > > > table.
> > > > >
> > > > > Yeah that seems to be exactly what we want, so I tried doing that by
> > > > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and
> > > > > it is behaving as we want[2].  And while dropping the subscription or
> > > > > altering CLT we can delete internal dependency so that CLT get dropped
> > > > > automatically[3]
> > > > >
> > > > > I will send an updated patch after testing a few more scenarios and
> > > > > fixing other pending issues.
> > > > >
> > > > > [1]
> > > > > +       ObjectAddressSet(myself, RelationRelationId, relid);
> > > > > +       ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
> > > > > +       recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);
> > > > >
> > > > >
> > > > > [2]
> > > > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
> > > > > ERROR:  2BP01: cannot drop table myschema.conflict_log_history2
> > > > > because subscription sub requires it
> > > > > HINT:  You can drop subscription sub instead.
> > > > > LOCATION:  findDependentObjects, dependency.c:788
> > > > > postgres[670778]=#
> > > > >
> > > > > [3]
> > > > > ObjectAddressSet(object, SubscriptionRelationId, subid);
> > > > > performDeletion(&object, DROP_CASCADE
> > > > >                            PERFORM_DELETION_INTERNAL |
> > > > >                            PERFORM_DELETION_SKIP_ORIGINAL);
> > > > >
> > > > >
> > > >
> > > > Here is the patch which implements the dependency and fixes other
> > > > comments from Shveta.
> > >
> > > Thanks for the changes, the new implementation based on dependency
> > > creates a cycle while dumping:
> > > ./pg_dump -d postgres -f dump1.txt -p 5433
> > > pg_dump: warning: could not resolve dependency loop among these items:
> > > pg_dump: detail: TABLE conflict  (ID 225 OID 16397)
> > > pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396)
> > > pg_dump: detail: POST-DATA BOUNDARY  (ID 3491)
> > > pg_dump: detail: TABLE DATA t1  (ID 3485 OID 16384)
> > > pg_dump: detail: PRE-DATA BOUNDARY  (ID 3490)
> > >
> > > This can be seen with a simple subscription with conflict_log_table.
> > > This was working fine with the v11 version patch.
> >
> > The attached v13 patch includes the fix for this issue. In addition,
> > it now raises an error when attempting to configure a conflict log
> > table that belongs to a temporary schema or is not a permanent
> > (persistent) relation.
>
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes

Here is a rebased version of the dump/upgrade patch based on the v15
version posted at [1].
After replacing conflict_log_table with conflict_log_destination, we
don't specify a fully qualified table name directly. Instead, the
conflict log behavior is controlled via conflict_log_destination
(table, log, or all). Since pg_dump resets search_path, it must
explicitly set the schema in which the conflict log table should be
created or reused. To handle this, pg_dump temporarily sets and then
restores search_path around the ALTER SUBSCRIPTION ... SET
(conflict_log_destination ...) command, ensuring the conflict log
table is resolved in the intended schema.
Additionally, in non-upgrade dump/restore scenarios, the conflict log
table is not dumped, because outside upgrade mode it does not make
sense to link the subscription to the older conflict log table.

v15-0001 to v15-0004 are the same as the patches posted at [1].
The dump/upgrade changes are in the v15-0005 patch.

[1] - https://www.postgresql.org/message-id/CAFiTN-uKn7mix8BkOOmJQ2cF5yKdfQUg2mX_w9vEC4787VZ_xQ%40mail.gmail.com

Regards.
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-23T06:49:21Z

Hi Dilip.

Here are some review comments after a first pass of patch v15-0001.

======
Commit Message

1.
If user choose to log into the table the table will automatically created while
creating the subscription with internal name i.e.
conflict_log_table_$subid$.  The
table will be created in the current search path and table would be
automatically
dropped while dropping the subscription.

English:

/If user choose/
/the table the table/
/and table would/

======
src/backend/commands/subscriptioncmds.c

2.
+#define SUBOPT_CONFLICT_LOG_DESTINATION 0x00040000

For the values, you are using DEST instead of DESTINATION. You can do
the same here to keep the macro name a bit shorter.

~~~

parse_subscription_options:

3.
+ dest = GetLogDestination(val);
+
+ if (dest == CONFLICT_LOG_DEST_INVALID)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("unrecognized conflict_log_destination value: \"%s\"", val),
+ errhint("Valid values are \"log\", \"table\", and \"all\".")));

I don't think CONFLICT_LOG_DEST_INVALID should even exist as an enum
value. Instead, the validation and the ereport(ERROR) should all be
done within GetLogDestination function. So, it should only return
valid values, else give an error.

~~~

CreateSubscription:

4.
+ /* Always set the destination, default will be log. */
+ values[Anum_pg_subscription_sublogdestination - 1] =
+ CStringGetTextDatum(ConflictLogDestLabels[opts.logdest]);
+
+ /*
+ * If the conflict log destination includes 'table', generate an internal
+ * name using the subscription OID and determine the target namespace based
+ * on the current search path. Store the namespace OID and the conflict log
+ * format in the pg_subscription catalog tuple., then  physically create
+ * the table.
+ */

4a.
When referring to these parameter values, you should always
consistently quote them. Currently, there is a mix of lots of formats.
(e.g. log (unquoted), 'table' (single-quoted), "log" (double-quoted)).

Pick one style, and make them all the same. Check for the same everywhere.

~

4b.
Typo "tuple.,"

~~~

5.
+ if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
+ opts.logdest == CONFLICT_LOG_DEST_ALL)

IIUC, you are effectively treating these parameter values like bits
that can be OR-ed together. And if in the future a "list" is
supported, then that's exactly what you will be doing. So, IMO, they
should be defined that way. See a review comment later in this post.

e.g. this condition would be written more like:
if ((opts.logdest & CONFLICT_LOG_DEST_TABLE) != 0)
or, using the macro
if (IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE))

~~~

AlterSubscription:

6.
+ if (opts.logdest != old_dest)
+ {
+ bool want_table =
+ (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
+ opts.logdest == CONFLICT_LOG_DEST_ALL);
+ bool has_oldtable =
+ (old_dest == CONFLICT_LOG_DEST_TABLE ||
+ old_dest == CONFLICT_LOG_DEST_ALL);
+


This is more of the same kind of logic that convinces me the code
should be using bitmasks.

SUGGESTION
bool want_table = IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE);
bool has_oldtable = IsSet(olddest, CONFLICT_LOG_DEST_TABLE);

~~~

create_conflict_log_table:

7.
+/*
+ * Create conflict log table.
+ *
+ * The subscription owner becomes the owner of this table and has all
+ * privileges on it.
+ */
+static Oid
+create_conflict_log_table(Oid subid, char *subname, Oid namespaceId,
+   char *conflictrel)


I felt something like 'relname' is a better name for the char *
conflictrel param. It clearly is the name of the conflict relation
because of the name of the function.

~~~

8.
+ /* Add a comments for the conflict log table. */
+ snprintf(comment, sizeof(comment),
+ "Conflict log table for subscription \"%s\"", subname);
+ CreateComments(relid, RelationRelationId, 0, comment);
+

8a.
typo /Add a comments/Add a comment/

~

8b.
My (previous review) suggestion for adding a table comment/description
made more sense when the CLT was some arbitrary name chosen by the
user. But, now that the CLT is a name like "conflict_log_table_%u",
the idea for a comment seems redundant.

~~~

9.
+/*
+ * Format the standardized internal conflict log table name for a subscription
+ *
+ * Use the OID to prevent collisions during rename operations.
+ */
+void
+GetConflictLogTableName(char *dest, Oid subid)
+{
+ snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid);
+}
+

9a.
To emphasise that this is an "internal" table, IMO there should be a
"pg_" prefix for this table name.

~

9b.
Since it is internal anyway, why not make the tablename descriptive to
clarify what that number means?
e.g. "pg_conflict_log_table_for_subid_%u"

BTW, since it is already a TABLE, then why is "table" even part of
this name? Why not just "pg_conflict_log_for_subid_%u"
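
A standalone sketch of what 9a and 9b would amount to is below. It is
an illustration only: format_conflict_log_table_name() is a
hypothetical name, NAMEDATALEN is redefined locally with PostgreSQL's
default of 64 so the snippet compiles on its own, and the function
returns the length purely to make it easy to check.

```c
#include <assert.h>
#include <stdio.h>

/* PostgreSQL's default; redefined here so the sketch stands alone. */
#define NAMEDATALEN 64

/*
 * Format the internal conflict log table name for a subscription OID.
 * dest must point to a buffer of at least NAMEDATALEN bytes.
 * Returns the length of the generated name.
 */
static int
format_conflict_log_table_name(char *dest, unsigned int subid)
{
    int n = snprintf(dest, NAMEDATALEN, "pg_conflict_log_for_subid_%u", subid);

    /* The 26-character prefix plus at most 10 digits always fits. */
    assert(n > 0 && n < NAMEDATALEN);
    return n;
}
```

For subid 16419 this produces "pg_conflict_log_for_subid_16419".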

~~~

10.
+/*
+ * GetLogDestination
+ *
+ * Convert string to enum by comparing against standardized labels.
+ */
+ConflictLogDest
+GetLogDestination(const char *dest)
+{
+ /* Empty string or NULL defaults to LOG. */
+ if (dest == NULL || dest[0] == '\0')
+ return CONFLICT_LOG_DEST_LOG;
+
+ for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++)
+ {
+ if (pg_strcasecmp(dest, ConflictLogDestLabels[i]) == 0)
+ return (ConflictLogDest) i;
+ }
+
+ /* Unrecognized string. */
+ return CONFLICT_LOG_DEST_INVALID;
+}

Mentioned previously: I think there should be no such thing as
CONFLICT_LOG_DEST_INVALID. I also think this function should be
responsible for the ereport(ERROR).
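
Roughly, I mean something like the standalone sketch below. It is not
patch code: labels_equal() is a hypothetical stand-in for
pg_strcasecmp(), and fprintf() plus exit() stand in for
ereport(ERROR), which likewise does not return to the caller. The enum
is kept sequential here; only the INVALID member is gone.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG,      /* "log" (default) */
    CONFLICT_LOG_DEST_TABLE,    /* "table" */
    CONFLICT_LOG_DEST_ALL       /* "all" */
} ConflictLogDest;

static const char *const ConflictLogDestLabels[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all",
};

/* Case-insensitive equality; stand-in for pg_strcasecmp(a, b) == 0. */
static int
labels_equal(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;            /* equal only if both strings ended */
}

/* Convert a string to the enum; never returns an "invalid" value. */
static ConflictLogDest
GetLogDestination(const char *dest)
{
    /* Empty string or NULL defaults to "log". */
    if (dest == NULL || dest[0] == '\0')
        return CONFLICT_LOG_DEST_LOG;

    for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++)
    {
        assert(ConflictLogDestLabels[i] != NULL);
        if (labels_equal(dest, ConflictLogDestLabels[i]))
            return (ConflictLogDest) i;
    }

    /* Stand-in for ereport(ERROR, ...): report and do not return. */
    fprintf(stderr,
            "unrecognized conflict_log_destination value: \"%s\"\n", dest);
    exit(EXIT_FAILURE);
}
```

Callers such as parse_subscription_options then need no error check of
their own.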


======
src/include/catalog/pg_subscription.h

11.
+ /*
+ * Strategy for logging replication conflicts:
+ * log - server log only,
+ * table - internal table only,
+ * all - both log and table.
+ */
+ text sublogdestination;
+

SUGGEST 'subconflictlogdest'

(see next review comment #12 for why)

~~~

12.
+ Oid conflictrelid; /* conflict log table Oid */
  char    *conninfo; /* Connection string to the publisher */
  char    *slotname; /* Name of the replication slot */
  char    *synccommit; /* Synchronous commit setting for worker */
  List    *publications; /* List of publication names to subscribe to */
  char    *origin; /* Only publish data originating from the
  * specified origin */
+ char    *logdestination; /* Conflict log destination */
 } Subscription;

These don't seem very good member names:

Maybe 'conflictrelid' -> 'conflictlogrelid' (because it's rel of the
log; not the conflict)
Maybe 'logdestination' -> 'conflictlogdest' (because in future there
might be other kinds of subscription logs)

======
src/include/replication/conflict.h

13.
+typedef enum ConflictLogDest
+{
+ CONFLICT_LOG_DEST_INVALID = 0,
+ CONFLICT_LOG_DEST_LOG, /* "log" (default) */
+ CONFLICT_LOG_DEST_TABLE, /* "table" */
+ CONFLICT_LOG_DEST_ALL /* "all" */
+} ConflictLogDest;
+

I didn't like this enum much.

Suggest removing CONFLICT_LOG_DEST_INVALID.
And use bits for the other values.
And you can still have a default enum if you want.

SUGGESTION
typedef enum ConflictLogDest
{
  CONFLICT_LOG_DEST_LOG = 0x001,
  CONFLICT_LOG_DEST_TABLE = 0x010,
  CONFLICT_LOG_DEST_DEFAULT = CONFLICT_LOG_DEST_LOG,
  CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE,
} ConflictLogDest;

BTW, there are only a few values, so the array index won't exceed
0x11; I guess you can still keep your same designated initialiser
for the dest labels.
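
To show how the pieces would fit together, here is a standalone sketch
using the enum values suggested above. IsSet is written the way I
believe subscriptioncmds.c defines it; wants_table() and
conflict_log_dest_label() are hypothetical helpers added only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 0x001,
    CONFLICT_LOG_DEST_TABLE = 0x010,
    CONFLICT_LOG_DEST_DEFAULT = CONFLICT_LOG_DEST_LOG,
    CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE,
} ConflictLogDest;

/* True when every bit in 'bits' is also set in 'val'. */
#define IsSet(val, bits)  (((val) & (bits)) == (bits))

/*
 * Sparse label array indexed by enum value. Only three slots are
 * filled; the remaining slots are implicitly NULL.
 */
static const char *const ConflictLogDestLabels[CONFLICT_LOG_DEST_ALL + 1] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all",
};

/* Replaces the "dest == TABLE || dest == ALL" pattern. */
static bool
wants_table(ConflictLogDest dest)
{
    return IsSet(dest, CONFLICT_LOG_DEST_TABLE);
}

/* Returns the catalog label, or NULL for a value with no label. */
static const char *
conflict_log_dest_label(ConflictLogDest dest)
{
    assert((int) dest >= 0 && dest <= CONFLICT_LOG_DEST_ALL);
    return ConflictLogDestLabels[dest];
}
```

With this, both "table" and "all" satisfy wants_table(), and a future
list-style option only needs more bits rather than more enum
combinations.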

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-23T10:03:56Z

On Mon, Dec 22, 2025 at 4:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Done in V15

Thanks for the patches. A few comments on v15-0002 for the part I have
reviewed so far:

1)
Defined twice:

+#define MAX_LOCAL_CONFLICT_INFO_ATTRS 5

+#define MAX_LOCAL_CONFLICT_INFO_ATTRS \
+ (sizeof(LocalConflictSchema) / sizeof(LocalConflictSchema[0]))


2)
GetConflictLogTableInfo:
+ *log_dest = GetLogDestination(MySubscription->logdestination);
+ conflictlogrelid = MySubscription->conflictrelid;
+
+ /* If destination is 'log' only, no table to open. */
+ if (*log_dest == CONFLICT_LOG_DEST_LOG)
+ return NULL;

We can get conflictlogrelid after the if-check for DEST_LOG.

3)
In ReportApplyConflict(), we form err_detail by calling
errdetail_apply_conflict(). But when dest is TABLE, we don't use
err_detail. Shall we skip creating it for dest=TABLE case?

4)
ReportApplyConflict():
+ /*
+ * Get both the conflict log destination and the opened conflict log
+ * relation for insertion.
+ */
+ conflictlogrel = GetConflictLogTableInfo(&dest);
+

We can move it after errdetail_apply_conflict(), closer to where we
actually use it.

5)
start_apply:
+ /* Open conflict log table and insert the tuple. */
+ conflictlogrel = GetConflictLogTableInfo(&dest);
+ if (ValidateConflictLogTable(conflictlogrel))
+ InsertConflictLogTuple(conflictlogrel);

We can have Assert here too before we call Validate:
Assert(dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL);

6)
start_apply:
+ if (ValidateConflictLogTable(conflictlogrel))
+ InsertConflictLogTuple(conflictlogrel);
+ MyLogicalRepWorker->conflict_log_tuple = NULL;

InsertConflictLogTuple() already sets conflict_log_tuple to NULL, so
the assignment above is not needed.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2025-12-23T11:48:34Z

On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > I think this needs more thought, others can be fixed.
> >
> > > 2)
> > > postgres=# drop schema shveta cascade;
> > > NOTICE:  drop cascades to subscription sub1
> > > ERROR:  global objects cannot be deleted by doDeletion
> > >
> > > Is this expected? Is the user supposed to see this error?
> > >
> > See below code, so this says if the object being dropped is the
> > outermost object (i.e. if we are dropping the table directly) then it
> > will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> > OTOH if the object is being dropped via recursive drop (i.e. the table
> > is being dropped while dropping the schema) then object on which it
> > has INTERNAL dependency will also be added to the deletion list and
> > later will be dropped via doDeletion and later we are getting error as
> > subscription is a global object.  I thought maybe we can handle an
> > additional case that the INTERNAL DEPENDENCY, is on subscription the
> > disallow dropping it irrespective of whether it is being called
> > directly or via recursive drop but then it will give an issue even
> > when we are trying to drop table during subscription drop, we can make
> > handle this case as well via 'flags' passed in findDependentObjects()
> > but need more investigation.
> >
> > Seeing this complexity makes me think more on is it really worth it to
> > maintain this dependency?  Because during subscription drop we anyway
> > have to call performDeletion externally because this dependency is
> > local so we are just disallowing the conflict table drop, however the
> > ALTER table is allowed so what we are really protecting by protecting
> > the table drop, I think it can be just documented that if user try to
> > drop the table then conflict will not be inserted anymore?
> >
> > findDependentObjects()
> > {
> > ...
> >      switch (foundDep->deptype)
> >      {
> >          ....
> >          case DEPENDENCY_INTERNAL:
> >             * 1. At the outermost recursion level, we must disallow the
> >             * DROP. However, if the owning object is listed in
> >             * pendingObjects, just release the caller's lock and return;
> >             * we'll eventually complete the DROP when we reach that entry
> >             * in the pending list.
> >      }
> > }
> >
> > [1]
> > postgres[1333899]=# select * from pg_depend where objid > 16410;
> >  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> > ---------+-------+----------+------------+----------+-------------+---------
> >     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
> >     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> > (4 rows)
> >
> > 16420 -> conflict_log_table_16419
> > 16419 -> subscription
> > 16410 -> schema s1
> >
>
> One approach could be to use something similar to
> PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> proceed without error, i.e., it would drop the tables as well without
> including the subscription in the dependency list. But if we try to
> drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> ERROR: cannot drop table because subscription sub1 requires it
>

I think this way of allowing dropping the conflict table without
caring for the parent object (subscription) is not a good idea. How
about creating a dedicated schema, say pg_conflict for the purpose of
storing conflict tables? This will be similar to the pg_toast schema
for toast tables. So, similar to that each database will have a
pg_conflict schema. It prevents the "orphan" problem where a user
accidentally drops the logging schema but the Subscription is still
trying to write to it. pg_dump needs to ignore all system schemas
EXCEPT pg_conflict. This ensures the history is preserved during
migrations while still protecting the tables from accidental user
deletion. About permissions, I think we need to set the schema
permissions so that USAGE is public (so users can SELECT from their
logs) but CREATE is restricted to the superuser/subscription owner. We
may need to think some more about permissions.

I also tried to reason out if we can allow storing the conflict table
in pg_catalog but here are a few reasons why it won't be a good idea.
I think by default, pg_dump completely ignores the pg_catalog schema.
It assumes pg_catalog contains static system definitions (like
pg_class, pg_proc, etc.) that are re-generated by the initdb process,
not user data. If we place a conflict table in pg_catalog, it will not
be backed up. If a user runs pg_dump/pg_dumpall to migrate to a new server,
their subscription definition will survive, but their entire history
of conflict logs will vanish. Also, from the permissions angle, if a
user wants to write a custom PL/pgSQL function to "retry" conflicts,
they might need to DELETE rows from the conflict table after fixing
them. Granting DELETE permissions on a table inside pg_catalog is
non-standard and often frowned upon by security auditors. It blurs the
line between "System Internals" (immutable) and "User Data" (mutable).

So, in short a separate pg_conflict schema appears to be a better solution.

Thoughts?

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-23T12:22:14Z

On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > I think this needs more thought, others can be fixed.
> > >
> > > > 2)
> > > > postgres=# drop schema shveta cascade;
> > > > NOTICE:  drop cascades to subscription sub1
> > > > ERROR:  global objects cannot be deleted by doDeletion
> > > >
> > > > Is this expected? Is the user supposed to see this error?
> > > >
> > > See below code, so this says if the object being dropped is the
> > > outermost object (i.e. if we are dropping the table directly) then it
> > > will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> > > OTOH if the object is being dropped via recursive drop (i.e. the table
> > > is being dropped while dropping the schema) then object on which it
> > > has INTERNAL dependency will also be added to the deletion list and
> > > later will be dropped via doDeletion and later we are getting error as
> > > subscription is a global object.  I thought maybe we can handle an
> > > additional case that the INTERNAL DEPENDENCY, is on subscription the
> > > disallow dropping it irrespective of whether it is being called
> > > directly or via recursive drop but then it will give an issue even
> > > when we are trying to drop table during subscription drop, we can make
> > > handle this case as well via 'flags' passed in findDependentObjects()
> > > but need more investigation.
> > >
> > > Seeing this complexity makes me think more on is it really worth it to
> > > maintain this dependency?  Because during subscription drop we anyway
> > > have to call performDeletion externally because this dependency is
> > > local so we are just disallowing the conflict table drop, however the
> > > ALTER table is allowed so what we are really protecting by protecting
> > > the table drop, I think it can be just documented that if user try to
> > > drop the table then conflict will not be inserted anymore?
> > >
> > > findDependentObjects()
> > > {
> > > ...
> > >      switch (foundDep->deptype)
> > >      {
> > >          ....
> > >          case DEPENDENCY_INTERNAL:
> > >             * 1. At the outermost recursion level, we must disallow the
> > >             * DROP. However, if the owning object is listed in
> > >             * pendingObjects, just release the caller's lock and return;
> > >             * we'll eventually complete the DROP when we reach that entry
> > >             * in the pending list.
> > >      }
> > > }
> > >
> > > [1]
> > > postgres[1333899]=# select * from pg_depend where objid > 16410;
> > >  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> > > ---------+-------+----------+------------+----------+-------------+---------
> > >     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
> > >     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> > > (4 rows)
> > >
> > > 16420 -> conflict_log_table_16419
> > > 16419 -> subscription
> > > 16410 -> schema s1
> > >
> >
> > One approach could be to use something similar to
> > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> > proceed without error, i.e., it would drop the tables as well without
> > including the subscription in the dependency list. But if we try to
> > drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> > ERROR: cannot drop table because subscription sub1 requires it
> >
>
> I think this way of allowing dropping the conflict table without
> caring for the parent object (subscription) is not a good idea. How
> about creating a dedicated schema, say pg_conflict for the purpose of
> storing conflict tables? This will be similar to the pg_toast schema
> for toast tables. So, similar to that each database will have a
> pg_conflict schema. It prevents the "orphan" problem where a user
> accidentally drops the logging schema but the Subscription is still
> trying to write to it. pg_dump needs to ignore all system schemas
> EXCEPT pg_conflict. This ensures the history is preserved during
> migrations while still protecting the tables from accidental user
> deletion. About permissions, I think we need to set the schema
> permissions so that USAGE is public (so users can SELECT from their
> logs) but CREATE is restricted to the superuser/subscription owner. We
> may need to think some more about permissions.
>
> I also tried to reason out if we can allow storing the conflict table
> in pg_catalog but here are a few reasons why it won't be a good idea.
> I think by default, pg_dump completely ignores the pg_catalog schema.
> It assumes pg_catalog contains static system definitions (like
> pg_class, pg_proc, etc.) that are re-generated by the initdb process,
> not user data. If we place a conflict table in pg_catalog, it will not
> be backed up. If a user runs pg_dump/all to migrate to a new server,
> their subscription definition will survive, but their entire history
> of conflict logs will vanish. Also from the permissions angle, If a
> user wants to write a custom PL/pgSQL function to "retry" conflicts,
> they might need to DELETE rows from the conflict table after fixing
> them. Granting DELETE permissions on a table inside pg_catalog is
> non-standard and often frowned upon by security auditors. It blurs the
> line between "System Internals" (immutable) and "User Data" (mutable).
> So, in short a separate pg_conflict schema appears to be a better solution.

Yeah, that makes sense. I haven't yet thought through every case where
this could be a problem, but meanwhile I tried prototyping it and it
behaves the way we want.

postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ;
-[ RECORD 1 ]-----+------------------------------
relid             | 16385
schemaname        | public
relname           | test
conflict_type     | update_origin_differs
remote_xid        | 761
remote_commit_lsn | 0/01760BD8
remote_commit_ts  | 2025-12-23 11:08:30.583816+00
remote_origin     | pg_16406
replica_identity  | {"a":1}
remote_tuple      | {"a":1,"b":20}
local_conflicts   | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"}

-- Case1: Alter is not allowed
postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406
ADD COLUMN a int;
ERROR:  42501: permission denied: "conflict_log_table_16406" is a system catalog
LOCATION:  RangeVarCallbackForAlterRelation, tablecmds.c:19634

-- Case2: drop is not allowed
postgres[1651968]=# drop table pg_conflict.conflict_log_table_16406;
ERROR:  42501: permission denied: "conflict_log_table_16406" is a system catalog
LOCATION:  RangeVarCallbackForDropRelation, tablecmds.c:1803

--Case3: Drop subscription drops it internally
postgres[1651968]=# DROP SUBSCRIPTION sub ;
NOTICE:  00000: dropped replication slot "sub" on publisher
LOCATION:  ReplicationSlotDropAtPubNode, subscriptioncmds.c:2470
DROP SUBSCRIPTION
postgres[1651968]=# \d pg_conflict.conflict_log_table_16406
Did not find any relation named "pg_conflict.conflict_log_table_16406".

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2025-12-24T07:41:57Z

On Tue, Dec 23, 2025 at 5:49 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Dilip.
>
> Here are some review comments after a first pass of patch v15-0001.
>

And, some more review comments for patch v15-0001.

======
src/backend/catalog/pg_subscription.c

1.
+ /* Always set the destination, default will be log. */
+ values[Anum_pg_subscription_sublogdestination - 1] =
+ CStringGetTextDatum(ConflictLogDestLabels[opts.logdest]);
+

None of the other values[] assignments here have a comment talking
about defaults, etc, so I don't think this needs one either.

======
src/backend/commands/subscriptioncmds.c

CreateSubscription:

2.
+ {
+ char    conflict_table_name[NAMEDATALEN];
+ Oid     namespaceId, logrelid;

In similar code in AlterSubscription, this was just called 'relname'.
Better to be consistent where possible. I think 'relname' would be
fine here too.

~~~

3.
+ else
+ {
+ /* Destination is "log"; no table is needed. */
+ values[Anum_pg_subscription_subconflictlogrelid - 1] =
+ ObjectIdGetDatum(InvalidOid);
+ }

I think it's better to say this using coded Asserts instead of just
assertions in comments.

e.g.

/* There is no conflict log table */
Assert(opts.logdest == CONFLICT_LOG_DEST_LOG);
values[...] = ObjectIdGetDatum(InvalidOid);

~~~

4.
+ if (isTempNamespace(namespaceId))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("could not generate conflict log table \"%s\"",
+ conflictrel),
+ errdetail("Conflict log tables cannot be created in a temporary namespace."),
+ errhint("Ensure your 'search_path' is set to permanent schema.")));
+
+ /* Report an error if the specified conflict log table already exists. */
+ if (OidIsValid(get_relname_relid(conflictrel, namespaceId)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_TABLE),
+ errmsg("could not generate conflict log table \"%s.%s\"",
+ get_namespace_name(namespaceId), conflictrel),
+ errdetail("A table with the internally generated name already exists."),
+ errhint("Drop the existing table or change your 'search_path' to use a different schema.")));

I'm not sure about these messages:

4a.
"could not generate conflict log table".
- Why say "generate"?
- We don't need to say "conflict log table" -- that's already in the detail

SUGGESTION (something like)
"could not create relation \"%s\""

~

4b.
For the 2nd error, I think errmsg should look like below, same as any
other duplicate table error.
"relation \"%s.%s\" already exists"

~

4c.
+ errdetail("A table with the internally generated name already exists."),

I don't think this errdetail added anything useful. It already exists
-- that's all you need to know. Why does it matter that the name was
generated automatically?

~~~

GetLogDestination:

5.
+ for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++)
+ {
+ if (pg_strcasecmp(dest, ConflictLogDestLabels[i]) == 0)
+ return (ConflictLogDest) i;
+ }
+
+ /* Unrecognized string. */
+ return CONFLICT_LOG_DEST_INVALID;

This code is making rash assumptions about the enum values being the
same as ordinals.

IMO it should be written like:

if (strcmp(dest, "log") == 0)
    return CONFLICT_LOG_DEST_LOG;

if (strcmp(dest, "table") == 0)
    return CONFLICT_LOG_DEST_TABLE;

if (strcmp(dest, "all") == 0)
    return CONFLICT_LOG_DEST_ALL;

/* Unrecognized dest. */
ereport(ERROR, ...);
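
For reference, a standalone sketch of the explicit-comparison form that compiles outside the backend. The enum layout below is an assumption (the patch may number its members differently, which is exactly why the function does not depend on it), the name parse_conflict_log_dest() is hypothetical, and returning CONFLICT_LOG_DEST_INVALID stands in for the ereport(ERROR, ...) that backend code would raise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed enum layout; nothing below relies on the numeric values. */
typedef enum ConflictLogDest
{
	CONFLICT_LOG_DEST_INVALID = 0,
	CONFLICT_LOG_DEST_LOG,
	CONFLICT_LOG_DEST_TABLE,
	CONFLICT_LOG_DEST_ALL
} ConflictLogDest;

/*
 * Map a destination label to its enum value by explicit comparison.
 * Outside the backend there is no ereport(), so an unrecognized label
 * is reported by returning CONFLICT_LOG_DEST_INVALID.
 */
static ConflictLogDest
parse_conflict_log_dest(const char *dest)
{
	assert(dest != NULL);

	if (strcmp(dest, "log") == 0)
		return CONFLICT_LOG_DEST_LOG;

	if (strcmp(dest, "table") == 0)
		return CONFLICT_LOG_DEST_TABLE;

	if (strcmp(dest, "all") == 0)
		return CONFLICT_LOG_DEST_ALL;

	/* Unrecognized dest. */
	return CONFLICT_LOG_DEST_INVALID;
}
```

Note that strcmp() is case-sensitive, whereas the v15 loop uses pg_strcasecmp(); whether mixed-case input should be accepted is a separate decision.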

~~~

IsConflictLogTable:

6.
+bool
+IsConflictLogTable(Oid relid)
+{
+ Relation        rel;

If you enforce (as I've suggested elsewhere previously) a name
convention that the CLT must have "pg_" prefix, then perhaps you can
exit early from this function without having to scan all the OIDs,
just by checking first that the RelationGetRelationName(rel) must
start with "pg_".
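
A minimal sketch of such an early-exit test, assuming the "pg_" naming convention were actually adopted (the names shown elsewhere in this thread, such as conflict_log_table_16406, do not follow it). The helper name is hypothetical; in the backend the name would come from RelationGetRelationName(rel).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Cheap pre-filter: a relation whose name does not start with "pg_"
 * cannot be a conflict log table under the proposed naming convention,
 * so the caller can skip the scan over subscription OIDs entirely.
 */
static bool
relname_has_pg_prefix(const char *relname)
{
	assert(relname != NULL);
	return strncmp(relname, "pg_", 3) == 0;
}
```

A true result only means the relation might be a CLT; the full lookup is still needed to confirm it.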

======
src/test/regress/sql/subscription.sql

7.
+-- fail - unrecognized format value

/format/parameter/

~~~

8.
Some of these tests are grouped together like

"ALTER: State transitions"
and
"Ensure drop table is not allowed, and DROP SUBSCRIPTION reaps the table"
etc.

These group boundaries should be identified more clearly with more
substantial comments.
e.g.
#-- ==================================
#-- ALTER - state transition tests
#-- ==================================

~~~

9.
The "pg_relation_is_publishable" seems misplaced because it is buried
among the drop/reap tests. Maybe it should come before all that.

======
src/tools/pgindent/typedefs.list

10.
What about "typedef enum ConflictLogDest"

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-24T10:32:15Z

On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > I think this needs more thought, others can be fixed.
> > > >
> > > > > 2)
> > > > > postgres=# drop schema shveta cascade;
> > > > > NOTICE:  drop cascades to subscription sub1
> > > > > ERROR:  global objects cannot be deleted by doDeletion
> > > > >
> > > > > Is this expected? Is the user supposed to see this error?
> > > > >
> > > > See below code, so this says if the object being dropped is the
> > > > outermost object (i.e. if we are dropping the table directly) then it
> > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> > > > OTOH if the object is being dropped via recursive drop (i.e. the table
> > > > is being dropped while dropping the schema) then object on which it
> > > > has INTERNAL dependency will also be added to the deletion list and
> > > > later will be dropped via doDeletion and later we are getting error as
> > > > subscription is a global object.  I thought maybe we can handle an
> > > > additional case that the INTERNAL DEPENDENCY, is on subscription the
> > > > disallow dropping it irrespective of whether it is being called
> > > > directly or via recursive drop but then it will give an issue even
> > > > when we are trying to drop table during subscription drop, we can make
> > > > handle this case as well via 'flags' passed in findDependentObjects()
> > > > but need more investigation.
> > > >
> > > > Seeing this complexity makes me think more on is it really worth it to
> > > > maintain this dependency?  Because during subscription drop we anyway
> > > > have to call performDeletion externally because this dependency is
> > > > local so we are just disallowing the conflict table drop, however the
> > > > ALTER table is allowed so what we are really protecting by protecting
> > > > the table drop, I think it can be just documented that if user try to
> > > > drop the table then conflict will not be inserted anymore?
> > > >
> > > > findDependentObjects()
> > > > {
> > > > ...
> > > >      switch (foundDep->deptype)
> > > >      {
> > > >          ....
> > > >          case DEPENDENCY_INTERNAL:
> > > >             * 1. At the outermost recursion level, we must disallow the
> > > >             * DROP. However, if the owning object is listed in
> > > >             * pendingObjects, just release the caller's lock and return;
> > > >             * we'll eventually complete the DROP when we reach that entry
> > > >             * in the pending list.
> > > >      }
> > > > }
> > > >
> > > > [1]
> > > > postgres[1333899]=# select * from pg_depend where objid > 16410;
> > > >  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> > > > ---------+-------+----------+------------+----------+-------------+---------
> > > >     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
> > > >     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> > > > (4 rows)
> > > >
> > > > 16420 -> conflict_log_table_16419
> > > > 16419 -> subscription
> > > > 16410 -> schema s1
> > > >
> > >
> > > One approach could be to use something similar to
> > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> > > proceed without error, i.e., it would drop the tables as well without
> > > including the subscription in the dependency list. But if we try to
> > > drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> > > ERROR: cannot drop table because subscription sub1 requires it
> > >
> >
> > I think this way of allowing dropping the conflict table without
> > caring for the parent object (subscription) is not a good idea. How
> > about creating a dedicated schema, say pg_conflict for the purpose of
> > storing conflict tables? This will be similar to the pg_toast schema
> > for toast tables. So, similar to that each database will have a
> > pg_conflict schema. It prevents the "orphan" problem where a user
> > accidentally drops the logging schema but the Subscription is still
> > trying to write to it. pg_dump needs to ignore all system schemas
> > EXCEPT pg_conflict. This ensures the history is preserved during
> > migrations while still protecting the tables from accidental user
> > deletion. About permissions, I think we need to set the schema
> > permissions so that USAGE is public (so users can SELECT from their
> > logs) but CREATE is restricted to the superuser/subscription owner. We
> > may need to think some more about permissions.
> >
> > I also tried to reason out if we can allow storing the conflict table
> > in pg_catalog but here are a few reasons why it won't be a good idea.
> > I think by default, pg_dump completely ignores the pg_catalog schema.
> > It assumes pg_catalog contains static system definitions (like
> > pg_class, pg_proc, etc.) that are re-generated by the initdb process,
> > not user data. If we place a conflict table in pg_catalog, it will not
> > be backed up. If a user runs pg_dump/all to migrate to a new server,
> > their subscription definition will survive, but their entire history
> > of conflict logs will vanish. Also from the permissions angle, If a
> > user wants to write a custom PL/pgSQL function to "retry" conflicts,
> > they might need to DELETE rows from the conflict table after fixing
> > them. Granting DELETE permissions on a table inside pg_catalog is
> > non-standard and often frowned upon by security auditors. It blurs the
> > line between "System Internals" (immutable) and "User Data" (mutable).
> > So, in short a separate pg_conflict schema appears to be a better solution.
>
> Yeah that makes sense.  Although I haven't thought about all cases
> whether it can be a problem anywhere, but meanwhile I tried
> prototyping with this and it behaves what we want.
>
> postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ;
> -[ RECORD 1 ]-----+------------------------------
> relid             | 16385
> schemaname        | public
> relname           | test
> conflict_type     | update_origin_differs
> remote_xid        | 761
> remote_commit_lsn | 0/01760BD8
> remote_commit_ts  | 2025-12-23 11:08:30.583816+00
> remote_origin     | pg_16406
> replica_identity  | {"a":1}
> remote_tuple      | {"a":1,"b":20}
> local_conflicts   | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"}
>
> -- Case1: Alter is not allowed
> postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406
> ADD COLUMN a int;
> ERROR:  42501: permission denied: "conflict_log_table_16406" is a system catalog
> LOCATION:  RangeVarCallbackForAlterRelation, tablecmds.c:19634
>

How was this achieved? Did you modify IsSystemClass to behave
similarly to IsToastClass?

I tried to analyze whether there are alternative approaches. The
possible options I see are:

1)
heap_create_with_catalog() provides the boolean argument use_user_acl,
which is meant to apply user-defined default privileges. In theory, we
could predefine default ACLs for our schema and then invoke
heap_create_with_catalog() with use_user_acl = true. But it’s not
clear how to do this purely from internal code. We would need to mimic
or reuse the logic behind SetDefaultACLsInSchemas.

2)
Another option is to create the table using heap_create_with_catalog()
with use_user_acl = false, and then explicitly update pg_class.relacl
for that table, similar to what ExecGrant_Relation does when
processing GRANT/REVOKE. But I couldn’t find any existing internal
code paths (outside of the GRANT/REVOKE implementation itself) that do
this kind of post-creation ACL manipulation.
~~

So overall, I feel changing IsSystemClass is the simpler way right
now. Setting the ACL before, during, or after
heap_create_with_catalog() is tricky; at least I could not find an
easy way to do it, unless I have missed something.
Thoughts on possible approaches?
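
To make the IsSystemClass idea concrete, here is a simplified standalone model of the check. It is only a sketch: the real function takes a pg_class tuple, the toast test in the backend also covers temporary toast namespaces, PG_CONFLICT_NAMESPACE and its OID are hypothetical, and all numeric values below are placeholders for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* Placeholder values for this sketch only. */
#define FIRST_UNPINNED_OBJECT_ID	12000
#define PG_TOAST_NAMESPACE			99
#define PG_CONFLICT_NAMESPACE		1382	/* hypothetical schema OID */

/*
 * Simplified model: a relation is treated as a system class if it is a
 * built-in catalog, lives in the toast schema, or (the proposed
 * addition) lives in the dedicated conflict schema.  Relations for which
 * this returns true would be rejected by user-issued ALTER/DROP TABLE.
 */
static bool
is_system_class_sketch(Oid relid, Oid relnamespace)
{
	assert(relid != 0);			/* 0 models InvalidOid */

	return relid < FIRST_UNPINNED_OBJECT_ID ||
		relnamespace == PG_TOAST_NAMESPACE ||
		relnamespace == PG_CONFLICT_NAMESPACE;
}
```

If the prototype works this way, it would explain the "is a system catalog" errors in Case1 and Case2 without any ACL manipulation, while DROP SUBSCRIPTION can still remove the table through an internal path.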

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-24T11:59:15Z

On Fri, 19 Dec 2025 at 11:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> >
> > > 2. Do we want to support multi destination then providing string like
> > > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > > would have to store as a string in catalog and parse it everytime we
> > > insert conflicts or alter subscription OTOH currently I have just
> > > support single option log/table/both which make things much easy
> > > because then in catalog we can store as a single char field and don't
> > > need any parsing.  And since the input are taken as a string itself,
> > > even if in future we want to support more options like  'log,table,..'
> > > it would be backward compatible with old options.
> >
> > I feel, combination of options might be a good idea, similar to how
> > 'log_destination' provides. But it can be done in future versions and
> > the first draft can be a simple one.
> >
>
> Considering the future extension of storing conflict information in
> multiple places, it would be good to follow log_destination. Yes, it
> is more work now but I feel that will be future-proof.

The attached patch has the changes to specify conflict_log_destination
as a combination of table, log and all. This is implemented in the
v15-0006 patch; there is no change in the other patches, v15-0001 ...
v15-0005, which are the same as the ones attached in [1].

[1] - https://www.postgresql.org/message-id/CALDaNm1zR1L2oq-LqYEcc8-wTZYjfJsiaTC_jQ8pGwbm0fv%2B3Q%40mail.gmail.com
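
As a rough illustration of the log_destination-style approach, the following standalone sketch parses a comma-separated list into a bitmask. It is not the v15-0006 code: the flag values, the function name and the plain C string handling are assumptions made for this example, and the backend would normally use its own list-splitting helpers and report errors via ereport().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed bit flags; "all" is simply both bits set. */
#define CONFLICT_LOG_DEST_LOG	(1 << 0)
#define CONFLICT_LOG_DEST_TABLE	(1 << 1)
#define CONFLICT_LOG_DEST_ALL	(CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)

/*
 * Parse a value such as "log", "table", "log,table" or "all" into *mask.
 * Returns false (leaving *mask untouched) on an empty or unknown element.
 */
static bool
parse_conflict_log_destination(const char *value, int *mask)
{
	const char *p = value;
	int			result = 0;

	assert(value != NULL && mask != NULL);

	for (;;)
	{
		const char *end;
		size_t		len;

		/* Skip whitespace before the element. */
		while (*p == ' ' || *p == '\t')
			p++;

		end = strchr(p, ',');
		len = end ? (size_t) (end - p) : strlen(p);

		/* Trim whitespace after the element. */
		while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
			len--;

		if (len == 3 && strncmp(p, "log", 3) == 0)
			result |= CONFLICT_LOG_DEST_LOG;
		else if (len == 5 && strncmp(p, "table", 5) == 0)
			result |= CONFLICT_LOG_DEST_TABLE;
		else if (len == 3 && strncmp(p, "all", 3) == 0)
			result |= CONFLICT_LOG_DEST_ALL;
		else
			return false;

		if (end == NULL)
			break;
		p = end + 1;
	}

	*mask = result;
	return true;
}
```

Storing a mask like this keeps the per-conflict check to a single bit test, so the string only has to be parsed when the subscription is created or altered.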

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-25T07:40:34Z

On Wed, Dec 24, 2025 at 4:02 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > I think this needs more thought, others can be fixed.
> > > > >
> > > > > > 2)
> > > > > > postgres=# drop schema shveta cascade;
> > > > > > NOTICE:  drop cascades to subscription sub1
> > > > > > ERROR:  global objects cannot be deleted by doDeletion
> > > > > >
> > > > > > Is this expected? Is the user supposed to see this error?
> > > > > >
> > > > > See below code, so this says if the object being dropped is the
> > > > > outermost object (i.e. if we are dropping the table directly) then it
> > > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> > > > > OTOH if the object is being dropped via recursive drop (i.e. the table
> > > > > is being dropped while dropping the schema) then object on which it
> > > > > has INTERNAL dependency will also be added to the deletion list and
> > > > > later will be dropped via doDeletion and later we are getting error as
> > > > > subscription is a global object.  I thought maybe we can handle an
> > > > > additional case that the INTERNAL DEPENDENCY, is on subscription the
> > > > > disallow dropping it irrespective of whether it is being called
> > > > > directly or via recursive drop but then it will give an issue even
> > > > > when we are trying to drop table during subscription drop, we can make
> > > > > handle this case as well via 'flags' passed in findDependentObjects()
> > > > > but need more investigation.
> > > > >
> > > > > Seeing this complexity makes me think more on is it really worth it to
> > > > > maintain this dependency?  Because during subscription drop we anyway
> > > > > have to call performDeletion externally because this dependency is
> > > > > local so we are just disallowing the conflict table drop, however the
> > > > > ALTER table is allowed so what we are really protecting by protecting
> > > > > the table drop, I think it can be just documented that if user try to
> > > > > drop the table then conflict will not be inserted anymore?
> > > > >
> > > > > findDependentObjects()
> > > > > {
> > > > > ...
> > > > >      switch (foundDep->deptype)
> > > > >      {
> > > > >          ....
> > > > >          case DEPENDENCY_INTERNAL:
> > > > >             * 1. At the outermost recursion level, we must disallow the
> > > > >             * DROP. However, if the owning object is listed in
> > > > >             * pendingObjects, just release the caller's lock and return;
> > > > >             * we'll eventually complete the DROP when we reach that entry
> > > > >             * in the pending list.
> > > > >      }
> > > > > }
> > > > >
> > > > > [1]
> > > > > postgres[1333899]=# select * from pg_depend where objid > 16410;
> > > > >  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> > > > > ---------+-------+----------+------------+----------+-------------+---------
> > > > >     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
> > > > >     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> > > > > (4 rows)
> > > > >
> > > > > 16420 -> conflict_log_table_16419
> > > > > 16419 -> subscription
> > > > > 16410 -> schema s1
> > > > >
> > > >
> > > > One approach could be to use something similar to
> > > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> > > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> > > > proceed without error, i.e., it would drop the tables as well without
> > > > including the subscription in the dependency list. But if we try to
> > > > drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> > > > ERROR: cannot drop table because subscription sub1 requires it
> > > >
> > >
> > > I think this way of allowing dropping the conflict table without
> > > caring for the parent object (subscription) is not a good idea. How
> > > about creating a dedicated schema, say pg_conflict for the purpose of
> > > storing conflict tables? This will be similar to the pg_toast schema
> > > for toast tables. So, similar to that each database will have a
> > > pg_conflict schema. It prevents the "orphan" problem where a user
> > > accidentally drops the logging schema but the Subscription is still
> > > trying to write to it. pg_dump needs to ignore all system schemas
> > > EXCEPT pg_conflict. This ensures the history is preserved during
> > > migrations while still protecting the tables from accidental user
> > > deletion. About permissions, I think we need to set the schema
> > > permissions so that USAGE is public (so users can SELECT from their
> > > logs) but CREATE is restricted to the superuser/subscription owner. We
> > > may need to think some more about permissions.
> > >
> > > I also tried to reason out if we can allow storing the conflict table
> > > in pg_catalog but here are a few reasons why it won't be a good idea.
> > > I think by default, pg_dump completely ignores the pg_catalog schema.
> > > It assumes pg_catalog contains static system definitions (like
> > > pg_class, pg_proc, etc.) that are re-generated by the initdb process,
> > > not user data. If we place a conflict table in pg_catalog, it will not
> > > be backed up. If a user runs pg_dump/all to migrate to a new server,
> > > their subscription definition will survive, but their entire history
> > > of conflict logs will vanish. Also from the permissions angle, If a
> > > user wants to write a custom PL/pgSQL function to "retry" conflicts,
> > > they might need to DELETE rows from the conflict table after fixing
> > > them. Granting DELETE permissions on a table inside pg_catalog is
> > > non-standard and often frowned upon by security auditors. It blurs the
> > > line between "System Internals" (immutable) and "User Data" (mutable).
> > > So, in short a separate pg_conflict schema appears to be a better solution.
> >
> > Yeah that makes sense.  Although I haven't thought about all cases
> > whether it can be a problem anywhere, but meanwhile I tried
> > prototyping with this and it behaves what we want.
> >
> > postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ;
> >  relid | schemaname | relname |     conflict_type     | remote_xid |
> > remote_commit_lsn |       remote_commit_ts        | remote_origin |
> > replica_identity |  remote_tuple
> > |
> > local_conflicts
> > -------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+----------------
> > +------------------------------------------------------------------------------------------------------------------------------------
> >  16385 | public     | test    | update_origin_differs |        761 |
> > 0/01760BD8        | 2025-12-23 11:08:30.583816+00 | pg_16406      |
> > {"a":1}          | {"a":1,"b":20}
> > | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"}
> > (1 row)
> >
> > -- Case1: Alter is not allowed
> > postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406
> > ADD COLUMN a int;
> > ERROR:  42501: permission denied: "conflict_log_table_16406" is a system catalog
> > LOCATION:  RangeVarCallbackForAlterRelation, tablecmds.c:19634
> >
>
> How was this achieved? Did you modify IsSystemClass to behave
> similarly to IsToastClass?

Right

> I tried to analyze whether there are alternative approaches. The
> possible options I see are:
>
> 1)
> heap_create_with_catalog() provides the boolean argument use_user_acl,
> which is meant to apply user-defined default privileges. In theory, we
> could predefine default ACLs for our schema and then invoke
> heap_create_with_catalog() with use_user_acl = true. But it’s not
> clear how to do this purely from internal code. We would need to mimic
> or reuse the logic behind SetDefaultACLsInSchemas.
> 2)
> Another option is to create the table using heap_create_with_catalog()
> with use_user_acl = false, and then explicitly update pg_class.relacl
> for that table, similar to what ExecGrant_Relation does when
> processing GRANT/REVOKE. But I couldn’t find any existing internal
> code paths (outside of the GRANT/REVOKE implementation itself) that do
> this kind of post-creation ACL manipulation.

I haven't analyzed these options yet. I will do that, but not before Jan 3rd,
as I will be away from my laptop for a week.

> So overall, I feel changing IsSystemClass is the simpler way right
> now. To set ACL before/after/during heap_create_with_catalog is a
> tricky thing, at-least I could not find an easier way to do this,
> unless I have missed something.
> Thoughts on possible approaches?

Here are the patches, changed to use IsSystemClass(). Many other things
changed as a result: we no longer need to check for the temp schema, and
the caller of create_conflict_log_table() no longer needs to find the
creation schema or generate the relname, so that part has moved into
create_conflict_log_table().  I have fixed most of the comments given by
Peter and Shveta, although some are still open, e.g. the name of the
conflict log table. For now I have kept it as conflict_log_table_<subid>;
the other options are:

1. pg_conflict_<subid>
2. conflict_log_<subid>
3. sub_conflict_log_<subid>

I prefer 3, considering it says this table holds subscription conflict
logs.  Thoughts?

Vignesh, your patches have to be rebased on the new version.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2025-12-26T15:27:57Z

On Thu, 25 Dec 2025 at 13:10, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Dec 24, 2025 at 4:02 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > > >
> > > > > > I think this needs more thought, others can be fixed.
> > > > > >
> > > > > > > 2)
> > > > > > > postgres=# drop schema shveta cascade;
> > > > > > > NOTICE:  drop cascades to subscription sub1
> > > > > > > ERROR:  global objects cannot be deleted by doDeletion
> > > > > > >
> > > > > > > Is this expected? Is the user supposed to see this error?
> > > > > > >
> > > > > > See below code, so this says if the object being dropped is the
> > > > > > outermost object (i.e. if we are dropping the table directly) then it
> > > > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY,
> > > > > > OTOH if the object is being dropped via recursive drop (i.e. the table
> > > > > > is being dropped while dropping the schema) then object on which it
> > > > > > has INTERNAL dependency will also be added to the deletion list and
> > > > > > later will be dropped via doDeletion and later we are getting error as
> > > > > > subscription is a global object.  I thought maybe we can handle an
> > > > > > additional case that the INTERNAL DEPENDENCY, is on subscription the
> > > > > > disallow dropping it irrespective of whether it is being called
> > > > > > directly or via recursive drop but then it will give an issue even
> > > > > > when we are trying to drop table during subscription drop, we can make
> > > > > > handle this case as well via 'flags' passed in findDependentObjects()
> > > > > > but need more investigation.
> > > > > >
> > > > > > Seeing this complexity makes me think more on is it really worth it to
> > > > > > maintain this dependency?  Because during subscription drop we anyway
> > > > > > have to call performDeletion externally because this dependency is
> > > > > > local so we are just disallowing the conflict table drop, however the
> > > > > > ALTER table is allowed so what we are really protecting by protecting
> > > > > > the table drop, I think it can be just documented that if user try to
> > > > > > drop the table then conflict will not be inserted anymore?
> > > > > >
> > > > > > findDependentObjects()
> > > > > > {
> > > > > > ...
> > > > > >      switch (foundDep->deptype)
> > > > > >      {
> > > > > >          ....
> > > > > >          case DEPENDENCY_INTERNAL:
> > > > > >             * 1. At the outermost recursion level, we must disallow the
> > > > > >             * DROP. However, if the owning object is listed in
> > > > > >             * pendingObjects, just release the caller's lock and return;
> > > > > >             * we'll eventually complete the DROP when we reach that entry
> > > > > >             * in the pending list.
> > > > > >      }
> > > > > > }
> > > > > >
> > > > > > [1]
> > > > > > postgres[1333899]=# select * from pg_depend where objid > 16410;
> > > > > >  classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> > > > > > ---------+-------+----------+------------+----------+-------------+---------
> > > > > >     1259 | 16420 |        0 |       2615 |    16410 |           0 | n
> > > > > >     1259 | 16420 |        0 |       6100 |    16419 |           0 | i
> > > > > > (4 rows)
> > > > > >
> > > > > > 16420 -> conflict_log_table_16419
> > > > > > 16419 -> subscription
> > > > > > 16410 -> schema s1
> > > > > >
> > > > >
> > > > > One approach could be to use something similar to
> > > > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> > > > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> > > > > proceed without error, i.e., it would drop the tables as well without
> > > > > including the subscription in the dependency list. But if we try to
> > > > > drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> > > > > ERROR: cannot drop table because subscription sub1 requires it
> > > > >
> > > >
> > > > I think this way of allowing dropping the conflict table without
> > > > caring for the parent object (subscription) is not a good idea. How
> > > > about creating a dedicated schema, say pg_conflict for the purpose of
> > > > storing conflict tables? This will be similar to the pg_toast schema
> > > > for toast tables. So, similar to that each database will have a
> > > > pg_conflict schema. It prevents the "orphan" problem where a user
> > > > accidentally drops the logging schema but the Subscription is still
> > > > trying to write to it. pg_dump needs to ignore all system schemas
> > > > EXCEPT pg_conflict. This ensures the history is preserved during
> > > > migrations while still protecting the tables from accidental user
> > > > deletion. About permissions, I think we need to set the schema
> > > > permissions so that USAGE is public (so users can SELECT from their
> > > > logs) but CREATE is restricted to the superuser/subscription owner. We
> > > > may need to think some more about permissions.
> > > >
> > > > I also tried to reason out if we can allow storing the conflict table
> > > > in pg_catalog but here are a few reasons why it won't be a good idea.
> > > > I think by default, pg_dump completely ignores the pg_catalog schema.
> > > > It assumes pg_catalog contains static system definitions (like
> > > > pg_class, pg_proc, etc.) that are re-generated by the initdb process,
> > > > not user data. If we place a conflict table in pg_catalog, it will not
> > > > be backed up. If a user runs pg_dump/all to migrate to a new server,
> > > > their subscription definition will survive, but their entire history
> > > > of conflict logs will vanish. Also from the permissions angle, If a
> > > > user wants to write a custom PL/pgSQL function to "retry" conflicts,
> > > > they might need to DELETE rows from the conflict table after fixing
> > > > them. Granting DELETE permissions on a table inside pg_catalog is
> > > > non-standard and often frowned upon by security auditors. It blurs the
> > > > line between "System Internals" (immutable) and "User Data" (mutable).
> > > > So, in short a separate pg_conflict schema appears to be a better solution.
> > >
> > > Yeah that makes sense.  Although I haven't thought about all cases
> > > whether it can be a problem anywhere, but meanwhile I tried
> > > prototyping with this and it behaves what we want.
> > >
> > > postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ;
> > >  relid | schemaname | relname |     conflict_type     | remote_xid |
> > > remote_commit_lsn |       remote_commit_ts        | remote_origin |
> > > replica_identity |  remote_tuple
> > > |
> > > local_conflicts
> > > -------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+----------------
> > > +------------------------------------------------------------------------------------------------------------------------------------
> > >  16385 | public     | test    | update_origin_differs |        761 |
> > > 0/01760BD8        | 2025-12-23 11:08:30.583816+00 | pg_16406      |
> > > {"a":1}          | {"a":1,"b":20}
> > > | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"}
> > > (1 row)
> > >
> > > -- Case1: Alter is not allowed
> > > postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406
> > > ADD COLUMN a int;
> > > ERROR:  42501: permission denied: "conflict_log_table_16406" is a system catalog
> > > LOCATION:  RangeVarCallbackForAlterRelation, tablecmds.c:19634
> > >
> >
> > How was this achieved? Did you modify IsSystemClass to behave
> > similarly to IsToastClass?
>
> Right
>
> > I tried to analyze whether there are alternative approaches. The
> > possible options I see are:
> >
> > 1)
> > heap_create_with_catalog() provides the boolean argument use_user_acl,
> > which is meant to apply user-defined default privileges. In theory, we
> > could predefine default ACLs for our schema and then invoke
> > heap_create_with_catalog() with use_user_acl = true. But it’s not
> > clear how to do this purely from internal code. We would need to mimic
> > or reuse the logic behind SetDefaultACLsInSchemas.
> > 2)
> > Another option is to create the table using heap_create_with_catalog()
> > with use_user_acl = false, and then explicitly update pg_class.relacl
> > for that table, similar to what ExecGrant_Relation does when
> > processing GRANT/REVOKE. But I couldn’t find any existing internal
> > code paths (outside of the GRANT/REVOKE implementation itself) that do
> > this kind of post-creation ACL manipulation.
>
> I haven't analyzed this options, I will do that but not before Jan 3rd
> as I will be away from my laptop for a week.
>
> > So overall, I feel changing IsSystemClass is the simpler way right
> > now. To set ACL before/after/during heap_create_with_catalog is a
> > tricky thing, at-least I could not find an easier way to do this,
> > unless I have missed something.
> > Thoughts on possible approaches?
>
> Here is the patches I have changed by using IsSystemClass(), based on
> this many other things changed like we don't need to check for the
> temp schema and also the caller of create_conflict_log_table() now
> don't need to find the creation schema so it don't need to generate
> the relname so that part is also moved within
> create_conflict_log_table().  Fixed most of the comments given by
> Peter and Shveta, although some of them are still open e.g. the name
> of the conflict log table as of now I have kept as
> conflict_log_table_<subid> other options are
>
> 1. pg_conflict_<subid>
> 2. conflict_log_<subid>
> 3. sub_conflict_log_<subid>
>
> I prefer 3, considering it says this table holds subscription conflict
> logs.  Thoughts?
>
> Vignesh, your patches have to be rebased on the new version.

Here is a rebased version of the remaining patches.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2025-12-29T06:02:38Z

On Thu, Dec 25, 2025 at 1:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Here is the patches I have changed by using IsSystemClass(), based on
> this many other things changed like we don't need to check for the
> temp schema and also the caller of create_conflict_log_table() now
> don't need to find the creation schema so it don't need to generate
> the relname so that part is also moved within
> create_conflict_log_table().  Fixed most of the comments given by
> Peter and Shveta, although some of them are still open e.g. the name
> of the conflict log table as of now I have kept as
> conflict_log_table_<subid> other options are
>
> 1. pg_conflict_<subid>
> 2. conflict_log_<subid>
> 3. sub_conflict_log_<subid>
>
> I prefer 3, considering it says this table holds subscription conflict
> logs.  Thoughts?
>

I was checking how pg_toast does it. It creates tables with names:
"pg_toast_%u", relOid

We can do something similar, i.e., keep the schema name as pg_conflict and
name the table pg_conflict_<subid>. Thoughts?
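
For illustration, the pg_toast-style naming suggested above could be sketched
as follows. This is only a standalone sketch, not code from the patch: the
helper build_conflict_relname() is hypothetical, and the Oid typedef and
NAMEDATALEN define are local stand-ins for the PostgreSQL definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
typedef unsigned int Oid;
#define NAMEDATALEN 64

/*
 * Hypothetical helper: build "pg_conflict_<subid>" in the caller's buffer,
 * mirroring the "pg_toast_%u" pattern used for TOAST tables.
 */
static void
build_conflict_relname(char *buf, size_t buflen, Oid subid)
{
    int         n;

    assert(buf != NULL && buflen > 0);

    n = snprintf(buf, buflen, "pg_conflict_%u", subid);

    /* The prefix plus a 32-bit OID always fits in NAMEDATALEN bytes. */
    assert(n > 0 && (size_t) n < buflen);
    (void) n;                   /* silence unused warning under NDEBUG */
}
```

A caller would declare char relname[NAMEDATALEN] and pass sizeof(relname) as
the buffer length.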

Few comments on 001:

1)
It would be good to display the conflict table name in the \dRs command.

2)
postgres=# ALTER TABLE  sch1.t3 set schema pg_toast;
ERROR:  cannot move objects into or out of TOAST schema

But moving a table into pg_conflict works. It should error out as well.
postgres=# ALTER TABLE  sch1.t1 set schema pg_conflict;
ALTER TABLE

3)
Shall we LOG CLT creation and drop during create/alter sub?

4)
create_conflict_log_table()
+ /* Report an error if the specified conflict log table already exists. */
+ if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+     ereport(ERROR,
+             (errcode(ERRCODE_DUPLICATE_TABLE),
+              errmsg("relation \"%s.%s\" already exists",
+                     get_namespace_name(PG_CONFLICT_NAMESPACE), relname)));

I am unable to think of a valid user scenario in which the above will be
hit. Do we need this as a user-facing error, or will an Assert or an
internal error do?

5)
+ /*
+ * Establish an internal dependency between the conflict log table and the
+ * subscription.  By using DEPENDENCY_INTERNAL, we ensure the table is
+ * automatically reaped when the subscription is dropped. This also
+ * prevents the table from being dropped independently unless the
+ * subscription itself is removed.
+ */
+ ObjectAddressSet(myself, RelationRelationId, relid);
+ ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
+ recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);

Now that we have pg_conflict, which is treated similarly to a system
catalog, I’m wondering whether we actually need to maintain this
dependency to prevent the CLT table or schema from being dropped.
Also, given that this currently goes against the convention that a
shared object cannot be present in pg_depend, could DropSubscription()
and AlterSubscription() instead handle dropping the table explicitly
in required scenarios?

6)
+  descr => 'reserved schema for conflict tables',

Shall we say:  'reserved schema for subscription-specific conflict tables'

or anything better to include that it is subscription related?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-01T13:46:00Z

On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > 2.
> > > > > +typedef enum ConflictLogDest
> > > > > +{
> > > > > + /* Log conflicts to the server logs */
> > > > > + CONFLICT_LOG_DEST_LOG   = 1 << 0,   /* 0x01 */
> > > > > +
> > > > > + /* Log conflicts to an internally managed conflict log table */
> > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
> > > > > +
> > > > > + /* Convenience bitmask for all supported destinations */
> > > > > + CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
> > > > > +} ConflictLogDest;
> > > > > +
> > > > > +/*
> > > > > + * Array mapping for converting internal enum to string.
> > > > > + */
> > > > > +static const char *const ConflictLogDestNames[] = {
> > > > > + [CONFLICT_LOG_DEST_LOG] = "log",
> > > > > + [CONFLICT_LOG_DEST_TABLE] = "table",
> > > > > + [CONFLICT_LOG_DEST_ALL] = "all"
> > > > > +};
> > > > >
> > > > > Defining an array this way could be an Array size issue. Actually the
> > > > > array has just three elements so the last element should be at
> > > > > ConflictLogDestNames[2] but if we go by the above definition, it will
> > > > > be ConflictLogDestNames[3]. Can we define by referring the following
> > > > > existing way:
> > >
> > > I was analyzing this because I remember we were initially using the
> > > format you suggested and switched to the bit format to enable direct
> > > bitwise operations elsewhere.  I think Peter suggested that [1], and
> > > the argument was that the bitwise operation is easy if we represent
> > > them as a bit. Also, since we would not have too many options, the
> > > array size shouldn't be an issue.  But I understand your point: adding
> > > more elements will cause the array size to grow very fast as this is
> > > using sparse array.  Let's see what others think about this, and then
> > > we can decide whether to change it back?
> > >
> >
> > The benefit of the current approach is that checking whether the
> > destination is TABLE becomes straightforward:
> >
> > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)
> >
> > > if we go by regular enum values (similar to XLogSource), then it will be:
> >
> >  if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
> >      opts.logdest == CONFLICT_LOG_DEST_ALL)
>
> Right
>
> > For ease of extending the enum and its corresponding text mappings, my
> > personal preference is still the regular (non-bitwise) enum approach.
>
> Yeah, that's my personal preference too.  But Peter had strong stand
> on keeping as bitwise so that we can directly use
> IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations.
> Since this array shouldn't have many options, a sparse array is not an
> issue.  So let's see what @Peter Smith has to say here and then we can
> build a consensus on this.
>
> > But if we anticipate adding more destination options in the future
> > that would be covered by ALL, checking for those in code could lead to
> > growing chains of OR conditions, whereas the bitwise approach scales
> > more cleanly in that respect. So I think the choice depends on what
> > kinds of future extensions we expect.
> >
> > Do we have plans to add more options that would naturally fall under
> > ALL? Or do we instead expect additions that are mutually exclusive;
> > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > not make sense to group under ALL in the same way?
>
> Currently, I haven't considered which options would naturally fall
> under "ALL." Perhaps if we plan targets other than logs and files,
> those might also fall under "ALL."

I have fixed all the reported comments except these four.
1. I'm changing the ConflictLogDest enum from bitmap to integer. I can
revert this in the next version but I want to see Peter's opinion
first, as he suggested using a bitmap to easily apply bitwise
operators.

2. Change how the conflict log table is displayed in \dRs+, as suggested.
Shveta and Amit agree on this; I will update that in the next
version.

3. As Vignesh reported, we are still determining the best way to
change the CLT's ownership when the subscription ownership changes.

4. pg_conflict is the catalog schema and as Nisha reported,
non-superusers aren't allowed to access the objects within it. Because
of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
subscription owner if that owner is a non-superuser. I am working on
the fix.
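
To make the trade-off in item 1 concrete, here is a small standalone sketch.
The enum and the name array are copied from the quoted patch hunk; the two
helpers, dest_includes_table() and dest_name(), and the DEST_NAMES_LEN macro
are hypothetical names used only for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bitmask form, as in the quoted patch hunk. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 1 << 0,     /* 0x01 */
    CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
    CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/*
 * Designated initializers index by enum value, so the array has slots
 * 0..3: slot 0 is an implicit NULL and the three names sit at 1, 2 and 3.
 */
static const char *const ConflictLogDestNames[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

#define DEST_NAMES_LEN \
    (sizeof(ConflictLogDestNames) / sizeof(ConflictLogDestNames[0]))

/* Hypothetical helper: one bit test covers both TABLE and ALL. */
static bool
dest_includes_table(int dest)
{
    return (dest & CONFLICT_LOG_DEST_TABLE) != 0;
}

/* Hypothetical helper: lookup that tolerates the unused slot 0. */
static const char *
dest_name(int dest)
{
    if (dest <= 0 || (size_t) dest >= DEST_NAMES_LEN)
        return NULL;
    return ConflictLogDestNames[dest];
}
```

With two flags the array has four slots, one of them unused. If a third flag
were added at 1 << 2 and ALL covered all three bits (value 7), the same
layout would need eight slots, which is the growth concern raised above.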


Note: I have included the base patch for reporting the schema-qualified
name, which is also being discussed in another thread.
@vignesh C, you need to rebase your patch and might need to fix the
table name, as we are now using `pg_conflict_log_<subid>` for the
conflict log table.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-02T09:10:02Z

On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> 4. pg_conflict is the catalog schema and as Nisha reported,
> non-superusers aren't allowed to access the objects within it. Because
> of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> subscription owner if that owner is a non-superuser. I am working on
> the fix.

While analyzing this, I realized that the schema ACL check happens
very early in the analyze phase [1]. I'm not sure if we can exempt the
subscription owner from this check at that stage without implementing
a hacky solution.  Another option is to remove restrictions from the
pg_conflict schema for all users and keep only table-level
restrictions within that schema. I am exploring how to implement this.


[1]
#1  0x0000561b547713fe in aclcheck_error (aclerr=ACLCHECK_NO_PRIV,
objtype=OBJECT_SCHEMA, objectname=0x561b8299a4d0 "pg_conflict") at
aclchk.c:2813
#2  0x0000561b54790fe7 in LookupExplicitNamespace
(nspname=0x561b8299a4d0 "pg_conflict", missing_ok=true) at
namespace.c:3481
#3  0x0000561b5478ca48 in RangeVarGetRelidExtended
(relation=0x561b8299a590, lockmode=1, flags=1, callback=0x0,
callback_arg=0x0) at namespace.c:531
#4  0x0000561b54645779 in relation_openrv_extended
(relation=0x561b8299a590, lockmode=1, missing_ok=true) at
relation.c:186
#5  0x0000561b5470e7ba in table_openrv_extended
(relation=0x561b8299a590, lockmode=1, missing_ok=true) at table.c:108
#6  0x0000561b548383a2 in parserOpenTable (pstate=0x561b8299a7e0,
relation=0x561b8299a590, lockmode=1) at parse_relation.c:1433
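
The ordering shown in the backtrace can be modelled with a toy sketch. This
is not PostgreSQL code: ToyRole, AccessResult and toy_check_select() are
hypothetical names, and the model only assumes what the trace shows, namely
that the schema check runs during name lookup, before any table-level
privilege is consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which stage, if any, rejected the access in this toy model. */
typedef enum AccessResult
{
    ACCESS_OK,
    ACCESS_DENIED_SCHEMA,
    ACCESS_DENIED_TABLE
} AccessResult;

/* Hypothetical, simplified view of a role's privileges. */
typedef struct ToyRole
{
    bool        is_superuser;
    bool        has_schema_usage;   /* USAGE on pg_conflict */
    bool        has_table_select;   /* SELECT on the conflict log table */
} ToyRole;

/*
 * Toy check: the schema stage runs first, so a table-level grant alone
 * never gets a chance to succeed.
 */
static AccessResult
toy_check_select(const ToyRole *role)
{
    assert(role != NULL);

    if (role->is_superuser)
        return ACCESS_OK;
    if (!role->has_schema_usage)
        return ACCESS_DENIED_SCHEMA;
    if (!role->has_table_select)
        return ACCESS_DENIED_TABLE;
    return ACCESS_OK;
}
```

In this model a non-superuser owner with only the table privilege is
rejected at the schema stage, which matches the failure described above.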


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-04T05:48:37Z

On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > 4. pg_conflict is the catalog schema and as Nisha reported,
> > non-superusers aren't allowed to access the objects within it. Because
> > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> > subscription owner if that owner is a non-superuser. I am working on
> > the fix.
>
> While analyzing this, I realized that the schema ACL check happens
> very early in analyze phase [1]. I'm not sure if we can bypass the
> subscription owner from this check at that stage without implementing
> a hacky solution.  Another option is to remove restrictions from the
> pg_conflict schema for all users and keep only table-level
> restrictions within that schema. I am exploring how to implement this.

Dilip, instead of granting permission (or removing restrictions) on
the pg_conflict schema to all users, is there a way to grant USAGE on
the schema only to the subscription owner when the conflict log table
is created and when the owner is altered for the subscription? I think
it should resolve the problem in a better way. Thoughts?  Let me know
if I am missing something.


thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T05:51:53Z

On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > 4. pg_conflict is the catalog schema and as Nisha reported,
> > > non-superusers aren't allowed to access the objects within it. Because
> > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> > > subscription owner if that owner is a non-superuser. I am working on
> > > the fix.
> >
> > While analyzing this, I realized that the schema ACL check happens
> > very early in analyze phase [1]. I'm not sure if we can bypass the
> > subscription owner from this check at that stage without implementing
> > a hacky solution.  Another option is to remove restrictions from the
> > pg_conflict schema for all users and keep only table-level
> > restrictions within that schema. I am exploring how to implement this.
>
> Dilip, instead of granting permission (or removing restrictions) on
> the pg_conflict schema to all users, is there a way to grant USAGE on
> the schema only to the subscription owner when the conflict log table
> is created and when the owner is altered for the subscription? I think
> it should resolve the problem in a better way. Thoughts?  Let me know
> if I am missing something.

Yeah, I thought about that, but when you create a subscription you are
connected as the subscription owner, who doesn't have the
necessary permission to GRANT USAGE on the pg_conflict schema.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T09:06:49Z

On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com>
> wrote:
> >
> > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com>
> wrote:
> > >
> > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com>
> wrote:
> > > >
> > > > 4. pg_conflict is the catalog schema and as Nisha reported,
> > > > non-superusers aren't allowed to access the objects within it.
> Because
> > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> > > > subscription owner if that owner is a non-superuser. I am working on
> > > > the fix.
> > >
> > > While analyzing this, I realized that the schema ACL check happens
> > > very early in analyze phase [1]. I'm not sure if we can bypass the
> > > subscription owner from this check at that stage without implementing
> > > a hacky solution.  Another option is to remove restrictions from the
> > > pg_conflict schema for all users and keep only table-level
> > > restrictions within that schema. I am exploring how to implement this.
> >
> > Dilip, instead of granting permission (or removing restrictions) on
> > the pg_conflict schema to all users, is there a way to grant USAGE on
> > the schema only to the subscription owner when the conflict log table
> > is created and when the owner is altered for the subscription? I think
> > it should resolve the problem in a better way. Thoughts?  Let me know
> > if I am missing something.
>
> Yeah I thought about that but when you create a subscription, you
> connected using the subscription owner user, who doesn't have the
> necessary permission to GRANT usage on pg_conflict schema.


After giving this more thought, I think we should be able to execute an
internal GRANT function that does not check whether the user has
permission to GRANT.

—
Dilip


Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-04T09:13:13Z

On Mon, May 4, 2026 at 2:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>>
>> On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>> >
>> > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> > >
>> > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> > > >
>> > > > 4. pg_conflict is the catalog schema and as Nisha reported,
>> > > > non-superusers aren't allowed to access the objects within it. Because
>> > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
>> > > > subscription owner if that owner is a non-superuser. I am working on
>> > > > the fix.
>> > >
>> > > While analyzing this, I realized that the schema ACL check happens
>> > > very early in analyze phase [1]. I'm not sure if we can bypass the
>> > > subscription owner from this check at that stage without implementing
>> > > a hacky solution.  Another option is to remove restrictions from the
>> > > pg_conflict schema for all users and keep only table-level
>> > > restrictions within that schema. I am exploring how to implement this.
>> >
>> > Dilip, instead of granting permission (or removing restrictions) on
>> > the pg_conflict schema to all users, is there a way to grant USAGE on
>> > the schema only to the subscription owner when the conflict log table
>> > is created and when the owner is altered for the subscription? I think
>> > it should resolve the problem in a better way. Thoughts?  Let me know
>> > if I am missing something.
>>
>> Yeah I thought about that but when you create a subscription, you
>> connected using the subscription owner user, who doesn't have the
>> necessary permission to GRANT usage on pg_conflict schema.
>
>
> After giving this more thought, I think we should be able to execute an internal GRANT function that does not check whether the user has permission to GRANT.
>

I have been trying to find an existing code example that does
something similar, but could not find one. But if you think it is
feasible and have found a way, then it is a reasonable solution here.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T09:19:08Z

On Mon, May 4, 2026 at 2:43 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, May 4, 2026 at 2:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >>
> >> On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote:
> >> >
> >> > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >> > >
> >> > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >> > > >
> >> > > > 4. pg_conflict is the catalog schema and as Nisha reported,
> >> > > > non-superusers aren't allowed to access the objects within it. Because
> >> > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> >> > > > subscription owner if that owner is a non-superuser. I am working on
> >> > > > the fix.
> >> > >
> >> > > While analyzing this, I realized that the schema ACL check happens
> >> > > very early in analyze phase [1]. I'm not sure if we can bypass the
> >> > > subscription owner from this check at that stage without implementing
> >> > > a hacky solution.  Another option is to remove restrictions from the
> >> > > pg_conflict schema for all users and keep only table-level
> >> > > restrictions within that schema. I am exploring how to implement this.
> >> >
> >> > Dilip, instead of granting permission (or removing restrictions) on
> >> > the pg_conflict schema to all users, is there a way to grant USAGE on
> >> > the schema only to the subscription owner when the conflict log table
> >> > is created and when the owner is altered for the subscription? I think
> >> > it should resolve the problem in a better way. Thoughts?  Let me know
> >> > if I am missing something.
> >>
> >> Yeah I thought about that but when you create a subscription, you
> >> connected using the subscription owner user, who doesn't have the
> >> necessary permission to GRANT usage on pg_conflict schema.
> >
> >
> > After giving this more thought, I think we should be able to execute an internal GRANT function that does not check whether the user has permission to GRANT.
> >
>
> I have been trying to find an existing code example that does
> something similar, but could not find one. But if you think it is
> feasible and have found a way, then it is a reasonable solution here.

I am not sure either, but I am going to experiment with this by calling
ExecGrantStmt_oids() while creating the subscription to see if we can
come up with something reasonable.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-04T11:28:53Z

On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > 4. pg_conflict is the catalog schema and as Nisha reported,
> > non-superusers aren't allowed to access the objects within it. Because
> > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> > subscription owner if that owner is a non-superuser. I am working on
> > the fix.
>
> While analyzing this, I realized that the schema ACL check happens
> very early in analyze phase [1]. I'm not sure if we can bypass the
> subscription owner from this check at that stage without implementing
> a hacky solution.  Another option is to remove restrictions from the
> pg_conflict schema for all users and keep only table-level
> restrictions within that schema. I am exploring how to implement this.
>

How about if we grant usage privilege on pg_conflict schema to
pg_create_subscription role and then allow only select, delete,
truncate to table_owners on tables in pg_conflict schema? Internally
the apply_worker can still make inserts to clt table in pg_conflict
schema similar to what we do for toast tables.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T13:11:05Z

On Mon, May 4, 2026 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > 4. pg_conflict is the catalog schema and as Nisha reported,
> > > non-superusers aren't allowed to access the objects within it. Because
> > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the
> > > subscription owner if that owner is a non-superuser. I am working on
> > > the fix.
> >
> > While analyzing this, I realized that the schema ACL check happens
> > very early in analyze phase [1]. I'm not sure if we can bypass the
> > subscription owner from this check at that stage without implementing
> > a hacky solution.  Another option is to remove restrictions from the
> > pg_conflict schema for all users and keep only table-level
> > restrictions within that schema. I am exploring how to implement this.
> >
>
> How about if we grant usage privilege on pg_conflict schema to
> pg_create_subscription role and then allow only select, delete,
> truncate to table_owners on tables in pg_conflict schema? Internally
> the apply_worker can still make inserts to clt table in pg_conflict
> schema similar to what we do for toast tables.

I am still testing, but I quickly prototyped this approach and basic
things seem to be working.

<Test case Start>
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433
postgres[3614939]=# CREATE USER dilip LOGIN ;
GRANT pg_create_subscription TO dilip;
GRANT ALL ON DATABASE postgres TO dilip;
postgres[3614939]=# \q

-- Connect as a non-superuser --
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433 -U dilip
postgres[3615002]=> CREATE SUBSCRIPTION regress_clt_perm_test CONNECTION
'dbname=regress_doesnotexist password=pass' PUBLICATION testpub WITH
(connect = false, conflict_log_destination = 'table');

postgres[3615002]=> select * from pg_conflict.pg_conflict_log_164
pg_conflict.pg_conflict_log_16406  pg_conflict.pg_conflict_log_16412
postgres[3615002]=> select * from pg_conflict.pg_conflict_log_16412;
 relid | schemaname | relname | conflict_type | remote_xid | remote_commit_lsn | remote_commit_ts | remote_origin | replica_identity | remote_tuple | local_conflicts
-------+------------+---------+---------------+------------+-------------------+------------------+---------------+------------------+--------------+-----------------
(0 rows)

postgres[3615002]=> delete from pg_conflict.pg_conflict_log_16412;
DELETE 0
postgres[3615002]=> TRUNCATE pg_conflict.pg_conflict_log_16412;
TRUNCATE TABLE
postgres[3615002]=> \q
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433
psql (19devel)
Type "help" for help.

-- Create another user to verify that a non-owner who has been granted the
-- pg_create_subscription role cannot access another subscription's
-- conflict log tables
postgres[3615293]=# CREATE USER dilip1 LOGIN;
GRANT pg_create_subscription TO dilip1;
GRANT ALL ON DATABASE postgres TO dilip1;
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433 -U dilip1
psql (19devel)
Type "help" for help.

postgres[3615370]=> select * from pg_conflict.pg_conflict_log_16412;
ERROR:  42501: permission denied for table pg_conflict_log_16412
LOCATION:  aclcheck_error, aclchk.c:2813
postgres[3615370]=> delete from pg_conflict.pg_conflict_log_16412;
ERROR:  42501: permission denied for table pg_conflict_log_16412
LOCATION:  aclcheck_error, aclchk.c:2813

<Test Case Ends>

PFA, poc patch for the same.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-05T02:56:38Z

On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> PFA, poc patch for the same.
>

I know it is POC but I think you need more work to prevent manual
inserts/updates on conflict tables.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-05T04:06:55Z

On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > PFA, poc patch for the same.

I like the idea of PoC. It simplifies the implementation.

> >
>
> I know it is POC but I think you need more work to prevent manual
> inserts/updates on conflict tables.
>

I think CheckValidResultRel() handles it.

postgres=# insert into pg_conflict.pg_conflict_16391 values (0);
ERROR:  cannot modify or insert data into conflict log table "pg_conflict_16391"
DETAIL:  Conflict log tables are system-managed and only support
cleanup via DELETE or TRUNCATE

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-05T05:25:44Z

On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have fixed all the reported comments except these four.
> 1. I'm changing the ConflictLogDest enum from bitmap to integer. I can
> revert this in the next version but I want to see Peter's opinion
> first, as he suggested using a bitmap to easily apply bitwise
> operators.
>

But that created an array size inconvenience. If you want to wait for
more comments, I suggest you can keep it as a top-up patch immediately
after the patch where it is introduced.
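
To make the array-size point concrete, here is a minimal standalone sketch. It assumes the enum and name array exactly as in the hunk quoted elsewhere in this thread; the three helper functions are hypothetical and exist only for illustration, they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit-flag layout, copied from the quoted patch hunk. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG   = 1 << 0,
    CONFLICT_LOG_DEST_TABLE = 1 << 1,
    CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Designated initializers index by flag value, so slot 0 is an unused hole. */
static const char *const ConflictLogDestNames[] = {
    [CONFLICT_LOG_DEST_LOG]   = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL]   = "all"
};

/* Hypothetical helper: number of slots the compiler actually allocated. */
static size_t
conflict_log_dest_slots(void)
{
    return sizeof(ConflictLogDestNames) / sizeof(ConflictLogDestNames[0]);
}

/* Hypothetical helper: a single bit test covers both TABLE and ALL. */
static int
logs_to_table(int dest)
{
    return (dest & CONFLICT_LOG_DEST_TABLE) != 0;
}

/* Hypothetical self-check spelling out the layout described above. */
static void
conflict_log_dest_selfcheck(void)
{
    /* Three names, but four slots: the highest index used is 3. */
    assert(conflict_log_dest_slots() == 4);
    /* Slot 0 was never initialized explicitly, so it is a null pointer. */
    assert(ConflictLogDestNames[0] == NULL);
    assert(strcmp(ConflictLogDestNames[CONFLICT_LOG_DEST_ALL], "all") == 0);
}
```

With a third flag at 1 << 2, an "all" entry would sit at index 7 and the array would need eight slots, which is the growth concern raised against the sparse layout. The bit test in logs_to_table() is the convenience argued in its favour.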

Other points:
*
subscriptionâ€™s lifecycle.

I saw the above funny character in 0002's commit message.

*
+
+ ereport(NOTICE,
+ (errmsg("created conflict log table pg_conflict.\"%s\" for subscription \"%s\"",
+ relname, subname)));

I think we can use a new function introduced by 0001 to get a
qualified relname instead of doing it manually here.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-05T12:55:28Z

On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > PFA, poc patch for the same.
>
> I like the idea of PoC. It simplifies the implementation.
>
> > >
> >
> > I know it is POC but I think you need more work to prevent manual
> > inserts/updates on conflict tables.
> >
>
> I think CheckValidResultRel() handles it.
>
> postgres=# insert into pg_conflict.pg_conflict_16391 values (0);
> ERROR:  cannot modify or insert data into conflict log table "pg_conflict_16391"
> DETAIL:  Conflict log tables are system-managed and only support
> cleanup via DELETE or TRUNCATE

I think we can tweak this a bit: in pg_class_aclmask_ext() we can allow
only TRUNCATE/DELETE on pg_conflict tables and block INSERT and UPDATE.
Here is the modified version.  Please let me know your thoughts.
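
As a rough illustration of the masking idea (not the attached patch), the sketch below filters a granted privilege mask for conflict log tables. Everything in it is an assumption made for illustration: the DEMO_ACL_* bits are not PostgreSQL's real ACL_* values, the function names are hypothetical, and SELECT is kept in the allowed set only because the earlier test case shows it working for the subscription owner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative privilege bits; NOT PostgreSQL's real ACL_* definitions. */
#define DEMO_ACL_INSERT   (1u << 0)
#define DEMO_ACL_SELECT   (1u << 1)
#define DEMO_ACL_UPDATE   (1u << 2)
#define DEMO_ACL_DELETE   (1u << 3)
#define DEMO_ACL_TRUNCATE (1u << 4)

/* Privileges that may survive on a conflict log table in this sketch. */
#define DEMO_CLT_ALLOWED \
    (DEMO_ACL_SELECT | DEMO_ACL_DELETE | DEMO_ACL_TRUNCATE)

/*
 * Hypothetical stand-in for the masking step: take whatever the ordinary
 * ACL lookup granted and strip INSERT and UPDATE for conflict log tables.
 * Other relations pass through unchanged.
 */
static uint32_t
demo_clt_aclmask(uint32_t granted, bool is_conflict_log_table)
{
    if (!is_conflict_log_table)
        return granted;
    return granted & DEMO_CLT_ALLOWED;
}

/* True only if every requested privilege survives the mask. */
static bool
demo_clt_allows(uint32_t granted, uint32_t requested, bool is_clt)
{
    uint32_t    effective;

    assert(requested != 0);     /* caller must ask for something */
    effective = demo_clt_aclmask(granted, is_clt);
    return (effective & requested) == requested;
}
```

In this model even an owner holding every privilege bit is refused INSERT and UPDATE on a conflict log table, while the apply worker's inserts would have to go through an internal path that skips the check, as suggested upthread for toast-like handling.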

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-06T03:54:01Z

On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > 2.
> > > > > > +typedef enum ConflictLogDest
> > > > > > +{
> > > > > > + /* Log conflicts to the server logs */
> > > > > > + CONFLICT_LOG_DEST_LOG   = 1 << 0,   /* 0x01 */
> > > > > > +
> > > > > > + /* Log conflicts to an internally managed conflict log table */
> > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
> > > > > > +
> > > > > > + /* Convenience bitmask for all supported destinations */
> > > > > > + CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
> > > > > > +} ConflictLogDest;
> > > > > > +
> > > > > > +/*
> > > > > > + * Array mapping for converting internal enum to string.
> > > > > > + */
> > > > > > +static const char *const ConflictLogDestNames[] = {
> > > > > > + [CONFLICT_LOG_DEST_LOG] = "log",
> > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table",
> > > > > > + [CONFLICT_LOG_DEST_ALL] = "all"
> > > > > > +};
> > > > > >
> > > > > > Defining an array this way could be an Array size issue. Actually the
> > > > > > array has just three elements so the last element should be at
> > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will
> > > > > > be ConflictLogDestNames[3]. Can we define by referring the following
> > > > > > existing way:
> > > >
> > > > I was analyzing this because I remember we were initially using the
> > > > format you suggested and switched to the bit format to enable direct
> > > > bitwise operations elsewhere.  I think Peter suggested that [1], and
> > > > the argument was that the bitwise operation is easy if we represent
> > > > them as a bit. Also, since we would not have too many options, the
> > > > array size shouldn't be an issue.  But I understand your point: adding
> > > > more elements will cause the array size to grow very fast as this is
> > > > using sparse array.  Let's see what others think about this, and then
> > > > we can decide whether to change it back?
> > > >
> > >
> > > The benefit of the current approach is that checking whether the
> > > destination is TABLE becomes straightforward:
> > >
> > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)
> > >
> > > if we go by regular enum values (similar to XLogSource), then it will be:
> > >
> > >  if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
> > >      opts.logdest == CONFLICT_LOG_DEST_ALL)
> >
> > Right
> >
> > > For ease of extending the enum and its corresponding text mappings, my
> > > personal preference is still the regular (non-bitwise) enum approach.
> >
> > Yeah, that's my personal preference too.  But Peter had strong stand
> > on keeping as bitwise so that we can directly use
> > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations.
> > Since this array shouldn't have many options, a sparse array is not an
> > issue.  So lets see what @Peter Smith has to say here and then we can
> > build a consensus on this.
> >
> > > But if we anticipate adding more destination options in the future
> > > that would be covered by ALL, checking for those in code could lead to
> > > growing chains of OR conditions, whereas the bitwise approach scales
> > > more cleanly in that respect. So I think the choice depends on what
> > > kinds of future extensions we expect.
> > >
> > > Do we have plans to add more options that would naturally fall under
> > > ALL? Or do we instead expect additions that are mutually exclusive;
> > > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > > not make sense to group under ALL in the same way?
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.

Few comments:
1) Currently we allow renaming of the pg_conflict schema. This might be OK,
as we also allow renaming other system schemas such as pg_catalog and pg_toast.
postgres=# alter schema pg_conflict rename to test_conflict;
ALTER SCHEMA

While displaying the conflict table we will have to display the
renamed schema name instead of hard coding the schema name:
postgres=# \dRs+

List of subscriptions
 Name |  Owner  | Enabled | Publication | Binary | Streaming |
Two-phase commit | Disable on error | Origin | Password required | Run
as owner? | Failover | Server | Retain dead tuples | Max
 retention duration | Retention active | Synchronous commit |
       Conninfo                 | Receiver timeout |  Skip LSN  |
Description | Conflict log destination |        Confl
ict log table
------+---------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------+--------------------+----
--------------------+------------------+--------------------+------------------------------------------+------------------+------------+-------------+--------------------------+-------------
----------------------
 sub1 | vignesh | t       | {pub1,pub2} | f      | parallel  | d
         | f                | any    | t                 | f
  | f        |        | f                  |
                  0 | f                | off                |
dbname=postgres host=localhost port=5432 | -1               |
0/00000000 |             | table                    | pg_conflict.
pg_conflict_log_16397
(2 rows)

postgres=# select * from pg_conflict.pg_conflict_log_16397;
ERROR:  relation "pg_conflict.pg_conflict_log_16397" does not exist
LINE 1: select * from pg_conflict.pg_conflict_log_16397;

+               /* Conflict log destination is supported in v19 and higher */
+               if (pset.sversion >= 190000)
+               {
+                       appendPQExpBuffer(&buf,
+                                                         ", subconflictlogdest AS \"%s\"\n",
+                                                         gettext_noop("Conflict log destination"));
+
+                       appendPQExpBuffer(&buf,
+                                                         ", (CASE WHEN subconflictlogdest IN ('table', 'all') "
+                                                         " THEN 'pg_conflict.pg_conflict_log_' || oid "
+                                                         " ELSE '-' END) AS \"%s\"\n",
+                                                         gettext_noop("Conflict log table"));
+               }

2) We will have to use the renamed schema here instead of hard coding:
+       /*
+        * Check for an existing table with the sname name in the pg_conflict namespace.
+        * A collision  should not occur under normal operation, but we must handle cases
+        * where a table has been created manually.
+        */
+       if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+               ereport(ERROR,
+                               (errcode(ERRCODE_DUPLICATE_TABLE),
+                                errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
+                                errhint("A table with the same name already exists. "
+                                                "To proceed, drop the existing table and retry.")));

3) Similarly here too:
+       /* Release tuple descriptor memory. */
+       FreeTupleDesc(tupdesc);
+
+       ereport(NOTICE,
+                       (errmsg("created conflict log table pg_conflict.\"%s\" for subscription \"%s\"",
+                                       relname, subname)));
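
For points 1) to 3), the common fix is to build the displayed name from the schema name that is looked up at run time rather than from a literal. A minimal standalone sketch of that idea follows, assuming the caller has already obtained the current schema name; all function names here are hypothetical and this is not the helper from 0001.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append one character, always keeping the buffer NUL-terminated. */
static bool
demo_put(char *dst, size_t cap, size_t *pos, char c)
{
    if (*pos + 1 >= cap)
        return false;           /* no room for c plus the terminator */
    dst[(*pos)++] = c;
    dst[*pos] = '\0';
    return true;
}

/* Append ident in double quotes, doubling any embedded double quote. */
static bool
demo_quote_ident(char *dst, size_t cap, size_t *pos, const char *ident)
{
    if (!demo_put(dst, cap, pos, '"'))
        return false;
    for (; *ident != '\0'; ident++)
    {
        if (*ident == '"' && !demo_put(dst, cap, pos, '"'))
            return false;
        if (!demo_put(dst, cap, pos, *ident))
            return false;
    }
    return demo_put(dst, cap, pos, '"');
}

/*
 * Hypothetical helper: write "schema"."rel" into dst. The schema name is
 * a parameter, so a renamed schema shows up correctly in the result.
 * Returns false if the buffer is too small.
 */
static bool
demo_qualified_name(char *dst, size_t cap, const char *schema, const char *rel)
{
    size_t      pos = 0;

    assert(dst != NULL && cap > 0);
    dst[0] = '\0';
    return demo_quote_ident(dst, cap, &pos, schema) &&
        demo_put(dst, cap, &pos, '.') &&
        demo_quote_ident(dst, cap, &pos, rel);
}
```

Called with schema "test_conflict" and relation "pg_conflict_log_16397", this yields "test_conflict"."pg_conflict_log_16397", which is the name that would actually resolve after the rename shown above.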

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T05:17:20Z

On Tue, May 5, 2026 at 6:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > PFA, poc patch for the same.
> >
> > I like the idea of PoC. It simplifies the implementation.
> >
> > > >
> > >
> > > I know it is POC but I think you need more work to prevent manual
> > > inserts/updates on conflict tables.
> > >
> >
> > I think CheckValidResultRel() handles it.
> >
> > postgres=# insert into pg_conflict.pg_conflict_16391 values (0);
> > ERROR:  cannot modify or insert data into conflict log table "pg_conflict_16391"
> > DETAIL:  Conflict log tables are system-managed and only support
> > cleanup via DELETE or TRUNCATE
>
> I think we can tweak a bit and pg_class_aclmask_ext() we can only
> allow truncate/delete on pg_conflict and block insert and update, here
> is the modified version.  Please let me know your thoughts.
>

BTW, I am still getting the same ERROR even after POC. See
postgres=# insert into pg_conflict.pg_conflict_log_16402 values(NULL);
ERROR:  cannot modify or insert data into conflict log table
"pg_conflict_log_16402"
DETAIL:  Conflict log tables are system-managed and only support
cleanup via DELETE or TRUNCATE.

Few other comments:
*
postgres=# create subscription sub1 connection 'dbname=postgres'
publication pub1 WITH (conflict_log_destination='table');
NOTICE:  created conflict log table
pg_conflict."pg_conflict_log_16394" for subscription "sub1"
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION

To make the messages similar, isn't it better to use the following
wording in the first message: created conflict log table
"pg_conflict.pg_conflict_log_16394" on subscriber? The 'for
subscription "sub1"' part is clear from the command itself.

*
postgres=# drop subscription sub1;
NOTICE:  dropped replication slot "sub1" on publisher
DROP SUBSCRIPTION

DROP SUBSCRIPTION seems to be missing a NOTICE saying that the conflict
log table was implicitly dropped.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-06T09:31:30Z

On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
>
> Few comments:
> 1) Currently we allow renaming of pg_conflict schema, this might be ok
> as we allow other system schema like pg_catalog and pg_toast also.
> postgres=# alter schema pg_conflict rename to test_conflict;
> ALTER SCHEMA
>

I agree that we allow renaming other schemas, including pg_toast, but I
am not sure whether this was a conscious decision; see BUG #18281 at
[1]. I don't favour allowing renaming of pg_conflict, for 2 reasons:

1) Because Postgres explicitly blocks renaming schemas to a name
starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
something else, they are permanently locked out from renaming it back.

2) While the core worker might survive a rename via OID lookups;
external scripts, extensions, and monitoring tools will likely
hardcode the 'pg_conflict' string. If the schema is renamed, these
tools will fail.

One such example of scripts breaking is present even in Postgres itself. I
did the following, and most psql commands started failing after
that due to the pg_catalog name hard-coded in them.

postgres=# alter schema pg_catalog rename to catalog_new;
ALTER SCHEMA

postgres=# \d catalog_new.*
ERROR:  relation "pg_catalog.pg_class" does not exist
LINE 5: FROM pg_catalog.pg_class c

[1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-06T09:36:06Z

On Wed, May 6, 2026 at 10:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 5, 2026 at 6:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > PFA, poc patch for the same.
> > >
> > > I like the idea of PoC. It simplifies the implementation.
> > >
> > > > >
> > > >
> > > > I know it is POC but I think you need more work to prevent manual
> > > > inserts/updates on conflict tables.
> > > >
> > >
> > > I think CheckValidResultRel() handles it.
> > >
> > > postgres=# insert into pg_conflict.pg_conflict_16391 values (0);
> > > ERROR:  cannot modify or insert data into conflict log table "pg_conflict_16391"
> > > DETAIL:  Conflict log tables are system-managed and only support
> > > cleanup via DELETE or TRUNCATE
> >
> > I think we can tweak a bit and pg_class_aclmask_ext() we can only
> > allow truncate/delete on pg_conflict and block insert and update, here
> > is the modified version.  Please let me know your thoughts.
> >
>
> BTW, I am still getting the same ERROR even after POC. See
> postgres=# insert into pg_conflict.pg_conflict_log_16402 values(NULL);
> ERROR:  cannot modify or insert data into conflict log table
> "pg_conflict_log_16402"
> DETAIL:  Conflict log tables are system-managed and only support
> cleanup via DELETE or TRUNCATE.

I also see the same behaviour.
~~

One observation for others to review:

As a non-superuser who does not have the 'pg_create_subscription' privilege:
postgres=>  alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR:  permission denied for schema pg_conflict
<seems correct, as access is denied at schema level itself>

As a non-superuser who has the 'pg_create_subscription' privilege, but
does not own the respective sub:
postgres=> alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR:  must be owner of table pg_conflict_16487
<Due to 'pg_create_subscription', it seems schema access is provided,
so it goes to check table access now and gives above error. Not sure
about this error, even if the user were the owner, they still wouldn't
be able to perform this operation>

As a non-superuser who has the 'pg_create_subscription' privilege and
also owns the respective sub:
postgres=> alter table pg_conflict.pg_conflict_16498 add column i int;
ERROR:  permission denied: "pg_conflict_16498" is a system catalog
<okay>

As a superuser, the error is the same irrespective of whether they
actually own that table or not:
postgres=# alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR:  permission denied: "pg_conflict_16487" is a system catalog
<okay>

For second case, not a strong opinion, but can the better error be:
ERROR:  permission denied: "pg_conflict_16487" is a system catalog?

I have not analyzed code myself for this yet.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T10:31:41Z

On Wed, May 6, 2026 at 3:06 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> As a non super-user which does not have 'pg_create_subscription' privilege:
> postgres=>  alter table pg_conflict.pg_conflict_16487 add column i int;
> ERROR:  permission denied for schema pg_conflict
> <seems correct, as access is denied at schema level itself>
>
> As a non super-user which has 'pg_create_subscription' privilege, but
> does not own the respective sub:
> postgres=> alter table pg_conflict.pg_conflict_16487 add column i int;
> ERROR:  must be owner of table pg_conflict_16487
> <Due to 'pg_create_subscription', it seems schema access is provided,
> so it goes to check table access now and gives above error. Not sure
> about this error, even if the user were the owner, they still wouldn't
> be able to perform this operation>
>
> As a non super-user which has 'pg_create_subscription' privilege and
> also owns the respective sub:
> postgres=> alter table pg_conflict.pg_conflict_16498 add column i int;
> ERROR:  permission denied: "pg_conflict_16498" is a system catalog
> <okay>
>
> As a super-user, the error is same irrespective of fact whether it
> actually owns that table or not:
> postgres=# alter table pg_conflict.pg_conflict_16487 add column i int;
> ERROR:  permission denied: "pg_conflict_16487" is a system catalog
> <okay>
>
> For second case, not a strong opinion, but can the better error be:
> ERROR:  permission denied: "pg_conflict_16487" is a system catalog?
>
> I have not analyzed code myself for this yet.
>

I analyzed this case and think that the current behavior is okay. As
per RangeVarCallbackForAlterRelation(), we first ensure that the
current user is either a table owner or superuser and then check
actual permissions to perform the operations on the table. The same is
true for the DROP case. I don't see the need to change it.
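
The four ALTER TABLE outcomes reported above follow from the order of the checks. Here is a small standalone model of that ordering; it is only a sketch with hypothetical names and types, not the logic of RangeVarCallbackForAlterRelation() itself, and it assumes schema USAGE comes from pg_create_subscription as in the PoC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the user running ALTER TABLE. */
typedef struct DemoUser
{
    bool        is_superuser;
    bool        has_schema_usage;   /* e.g. via pg_create_subscription */
    bool        owns_table;         /* owns the conflict log table */
} DemoUser;

/* Which error the user ends up seeing in this model. */
typedef enum DemoAlterResult
{
    DEMO_DENIED_SCHEMA,     /* "permission denied for schema pg_conflict" */
    DEMO_NOT_OWNER,         /* "must be owner of table ..." */
    DEMO_SYSTEM_CATALOG     /* "permission denied: ... is a system catalog" */
} DemoAlterResult;

static DemoAlterResult
demo_alter_conflict_table(const DemoUser *u)
{
    assert(u != NULL);

    /* 1. Name lookup: schema USAGE is checked before the table is seen. */
    if (!u->is_superuser && !u->has_schema_usage)
        return DEMO_DENIED_SCHEMA;

    /* 2. Ownership: must be the table owner or a superuser. */
    if (!u->is_superuser && !u->owns_table)
        return DEMO_NOT_OWNER;

    /* 3. Only then is the operation itself rejected for this relation. */
    return DEMO_SYSTEM_CATALOG;
}
```

Because step 2 runs before step 3, a non-owner with schema access sees the ownership error even though ownership would not have helped, which is the second case in the report.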

A few cosmetic changes are attached as top-up patches. Dilip can include
these in the next version, if he is okay with them.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T11:01:19Z

On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Few comments:
> > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > as we allow other system schema like pg_catalog and pg_toast also.
> > postgres=# alter schema pg_conflict rename to test_conflict;
> > ALTER SCHEMA
> >
>
> I agree that we allow renaming other schemas including pg_toast, but I
> am not sure if this is consciously made decision, see BUG #18281 at
> [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
>
> 1) Because Postgres explicitly blocks renaming schemas to a name
> starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> something else, they are permanently locked out from renaming it back.
>
> 2) While the core worker might survive a rename via OID lookups;
> external scripts, extensions, and monitoring tools will likely
> hardcode the 'pg_conflict' string. If the schema is renamed, these
> tools will fail.
>

I think we shouldn't go out of our way to prevent superusers from
renaming the pg_conflict schema, consistent with the other system
schemas. We can try to avoid hard-coding the schema name where
possible, but I am not sure we can guarantee that nothing related to
the pg_conflict schema will break, as you showed in the following
similar case for pg_catalog.

> One such example  of scripts breaking is present event in Postgres. I
> did the following, and most of psql commands started failing after
> that due to hard-coded pg_catalog name in them.
>
> postgres=# alter schema pg_catalog rename to catalog_new;
> ALTER SCHEMA
>
> postgres=# \d catalog_new.*
> ERROR:  relation "pg_catalog.pg_class" does not exist
> LINE 5: FROM pg_catalog.pg_class c
>
> [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
>

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-06T11:25:44Z

On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > 2.
> > > > > > +typedef enum ConflictLogDest
> > > > > > +{
> > > > > > + /* Log conflicts to the server logs */
> > > > > > + CONFLICT_LOG_DEST_LOG   = 1 << 0,   /* 0x01 */
> > > > > > +
> > > > > > + /* Log conflicts to an internally managed conflict log table */
> > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
> > > > > > +
> > > > > > + /* Convenience bitmask for all supported destinations */
> > > > > > + CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
> > > > > > +} ConflictLogDest;
> > > > > > +
> > > > > > +/*
> > > > > > + * Array mapping for converting internal enum to string.
> > > > > > + */
> > > > > > +static const char *const ConflictLogDestNames[] = {
> > > > > > + [CONFLICT_LOG_DEST_LOG] = "log",
> > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table",
> > > > > > + [CONFLICT_LOG_DEST_ALL] = "all"
> > > > > > +};
> > > > > >
> > > > > > Defining an array this way could be an Array size issue. Actually the
> > > > > > array has just three elements so the last element should be at
> > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will
> > > > > > be ConflictLogDestNames[3]. Can we define by referring the following
> > > > > > existing way:
> > > >
> > > > I was analyzing this because I remember we were initially using the
> > > > format you suggested and switched to the bit format to enable direct
> > > > bitwise operations elsewhere.  I think Peter suggested that [1], and
> > > > the argument was that the bitwise operation is easy if we represent
> > > > them as a bit. Also, since we would not have too many options, the
> > > > array size shouldn't be an issue.  But I understand your point: adding
> > > > more elements will cause the array size to grow very fast as this is
> > > > using sparse array.  Let's see what others think about this, and then
> > > > we can decide whether to change it back?
> > > >
> > >
> > > The benefit of the current approach is that checking whether the
> > > destination is TABLE becomes straightforward:
> > >
> > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)
> > >
> > > if we go by regular enum values (simialr to XLogSource), then it will be:
> > >
> > >  if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
> > >      opts.logdest == CONFLICT_LOG_DEST_ALL)
> >
> > Right
> >
> > > For ease of extending the enum and its corresponding text mappings, my
> > > personal preference is still the regular (non-bitwise) enum approach.
> >
> > Yeah, that's my personal preference too.  But Peter had strong stand
> > on keeping as bitwise so that we can directly use
> > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations.
> > Since this array shouldn't have many options, a sparse array is not an
> > issue.  So lets see what @Peter Smith has to say here and then we can
> > build a concensus on this.
> >
> > > But if we anticipate adding more destination options in the future
> > > that would be covered by ALL, checking for those in code could lead to
> > > growing chains of OR conditions, whereas the bitwise approach scales
> > > more cleanly in that respect. So I think the choice depends on what
> > > kinds of future extensions we expect.
> > >
> > > Do we have plans to add more options that would naturally fall under
> > > ALL? Or do we instead expect additions that are mutually exclusive;
> > > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > > not make sense to group under ALL in the same way?
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.
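
On the enum discussion quoted above: with designated initializers at indexes 1, 2 and 3, ConflictLogDestNames has four slots and index 0 is NULL. One possible middle ground, sketched below, keeps the bit-flag values but maps names through a dense table of (value, name) pairs. This is not from the patch; DestNameEntry, dest_to_name(), name_to_dest() and dest_is_set() are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Bit-flag destinations, as in the patch excerpt quoted above. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 1 << 0,
    CONFLICT_LOG_DEST_TABLE = 1 << 1,
    CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Hypothetical dense mapping: one entry per name, no unused slots. */
typedef struct DestNameEntry
{
    int         dest;
    const char *name;
} DestNameEntry;

static const DestNameEntry dest_names[] = {
    {CONFLICT_LOG_DEST_LOG, "log"},
    {CONFLICT_LOG_DEST_TABLE, "table"},
    {CONFLICT_LOG_DEST_ALL, "all"}
};

#define N_DEST_NAMES (sizeof(dest_names) / sizeof(dest_names[0]))

/* Return the name for an exact destination value, or NULL if unknown. */
const char *
dest_to_name(int dest)
{
    for (size_t i = 0; i < N_DEST_NAMES; i++)
        if (dest_names[i].dest == dest)
            return dest_names[i].name;
    return NULL;
}

/* Return the destination bits for a name, or 0 if the name is unknown. */
int
name_to_dest(const char *name)
{
    for (size_t i = 0; i < N_DEST_NAMES; i++)
        if (strcmp(dest_names[i].name, name) == 0)
            return dest_names[i].dest;
    return 0;
}

/* Hypothetical helper: true if every bit of flag is set in dests. */
int
dest_is_set(int dests, int flag)
{
    return (dests & flag) == flag;
}
```

The table grows by one entry per new name regardless of the bit values, and the single-bit test stays a one-liner. The cost is a linear search on lookup, which should not matter for a handful of options.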

A few minor comments:
1) Now that we create the table in the pg_conflict system schema, where
other users cannot create tables, is there a scenario where this
collision is possible?
    /*
     * Check for an existing table with the sname name in the pg_conflict namespace.
     * A collision  should not occur under normal operation, but we must handle cases
     * where a table has been created manually.
     */
    if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
        ereport(ERROR,
                (errcode(ERRCODE_DUPLICATE_TABLE),
                 errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
                 errhint("A table with the same name already exists. "
                         "To proceed, drop the existing table and retry.")));

2) I believe table_open() raises an error on failure instead of
returning NULL, so this check will never be hit:
+       conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
+       if (conflictlogrel == NULL)
+               elog(ERROR, "could not open conflict log table (OID %u)",
+                        conflictlogrelid);

3) Typo: "sname" should be "same" here:
+        * Check for an existing table with the sname name in the pg_conflict namespace.
+        * A collision  should not occur under normal operation, but we must handle cases

4) This include is not required:
@@ -37,6 +40,7 @@
 #include "commands/subscriptioncmds.h"
 #include "executor/executor.h"
 #include "foreign/foreign.h"
+#include "funcapi.h"
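
On comment 2 above, here is a minimal model of why the NULL check is dead code. It assumes, as with PostgreSQL's ERROR handling, that raising an error transfers control away through a non-local jump and never returns to the caller. The Rel struct, open_or_raise() and try_open() are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for an open relation. */
typedef struct Rel
{
    unsigned int oid;
} Rel;

/* Stand-in for the error-recovery point that an ERROR jumps to. */
static jmp_buf error_context;

/* The only relation this toy model knows about. */
static Rel known_rel = {16406};

/*
 * Either returns a valid pointer or does not return at all: on failure,
 * control jumps to error_context, modelling ereport(ERROR, ...).
 */
Rel *
open_or_raise(unsigned int oid)
{
    if (oid != known_rel.oid)
        longjmp(error_context, 1);
    return &known_rel;
}

/* Returns 1 if the relation was opened, 0 if the error path was taken. */
int
try_open(unsigned int oid, Rel **out)
{
    *out = NULL;

    if (setjmp(error_context) != 0)
        return 0;               /* reached only via longjmp */

    /* If this call returns at all, the result is never NULL. */
    *out = open_or_raise(oid);
    return 1;
}
```

Any code placed after the open_or_raise() call can rely on a non-NULL result, which is why an explicit NULL test there can never fire.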

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-06T12:58:27Z

On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Few comments:
> > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > ALTER SCHEMA
> > >
> >
> > I agree that we allow renaming other schemas including pg_toast, but I
> > am not sure if this is consciously made decision, see BUG #18281 ast
> > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> >
> > 1) Because Postgres explicitly blocks renaming schemas to a name
> > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > something else, they are permanently locked out from renaming it back.
> >
> > 2) While the core worker might survive a rename via OID lookups;
> > external scripts, extensions, and monitoring tools will likely
> > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > tools will fail.
> >
>
> I think we shouldn't go out of our way to disallow superusers to
> rename pg_conflict schema similar to other cases. We can try to
> prevent hard-coding schema names where possible but not sure we can
> guarantee that nothing related to pg_conflict schema won't break as
> shown by you in the following similar case for pg_conflict.
>
> > One such example  of scripts breaking is present event in Postgres. I
> > did the following, and most of psql commands started failing after
> > that due to hard-coded pg_catalog name in them.
> >
> > postgres=# alter schema pg_catalog rename to catalog_new;
> > ALTER SCHEMA
> >
> > postgres=# \d catalog_new.*
> > ERROR:  relation "pg_catalog.pg_class" does not exist
> > LINE 5: FROM pg_catalog.pg_class c
> >
> > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org

I can see that the pg_toast and pg_catalog schema names are also
hard-coded in a couple of places, e.g.:

listPartitionedTables()
{
if (!pattern)
appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
" AND n.nspname !~ '^pg_toast'\n"
" AND n.nspname <> 'information_schema'\n");
}

I will analyze all the places where we hard-code the name. In
server-side code we can easily avoid it, but on the client side (e.g.
psql's describe code) we would need to invent a way to identify the
schema name, or store it somewhere such as pg_subscription. I don't
think we should go that route.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-06T14:04:57Z

On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Few comments:
> > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > > ALTER SCHEMA
> > > >
> > >
> > > I agree that we allow renaming other schemas including pg_toast, but I
> > > am not sure if this is consciously made decision, see BUG #18281 ast
> > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> > >
> > > 1) Because Postgres explicitly blocks renaming schemas to a name
> > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > > something else, they are permanently locked out from renaming it back.
> > >
> > > 2) While the core worker might survive a rename via OID lookups;
> > > external scripts, extensions, and monitoring tools will likely
> > > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > > tools will fail.
> > >
> >
> > I think we shouldn't go out of our way to disallow superusers to
> > rename pg_conflict schema similar to other cases. We can try to
> > prevent hard-coding schema names where possible but not sure we can
> > guarantee that nothing related to pg_conflict schema won't break as
> > shown by you in the following similar case for pg_conflict.
> >
> > > One such example  of scripts breaking is present event in Postgres. I
> > > did the following, and most of psql commands started failing after
> > > that due to hard-coded pg_catalog name in them.
> > >
> > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > ALTER SCHEMA
> > >
> > > postgres=# \d catalog_new.*
> > > ERROR:  relation "pg_catalog.pg_class" does not exist
> > > LINE 5: FROM pg_catalog.pg_class c
> > >
> > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
>
> I can see pg_toast and pg_catalog schema also hard coded in couple of
> places e.g.
>
> listPartitionedTables()
> {
> if (!pattern)
> appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> " AND n.nspname !~ '^pg_toast'\n"
> " AND n.nspname <> 'information_schema'\n");
> }
>
> I will analyze which all places we are hardcoding, I think on server
> side code we can easily avoid but from client side e.g. describe we
> might need to invent a way to identify the schema name, or we might
> have to store it somewhere in pg_subscription etc, I don't think we
> should go that route.

Here is the updated patch set.

Open comments:
1. Analyze and avoid hard-coding the 'pg_conflict' schema name wherever possible.
2. Change the way we display the conflict log table (CLT) in \dRs+.
3. Transfer the CLT ownership when the subscription ownership changes.
(Note: I have coded a PoC for this but am still checking whether it
works in all cases.)

I will send the revised version by the end of this week after fixing
these open comments as well.



-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-07T02:55:42Z

On Wed, May 6, 2026 at 7:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > Few comments:
> > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > > > ALTER SCHEMA
> > > > >
> > > >
> > > > I agree that we allow renaming other schemas including pg_toast, but I
> > > > am not sure if this is consciously made decision, see BUG #18281 ast
> > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> > > >
> > > > 1) Because Postgres explicitly blocks renaming schemas to a name
> > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > > > something else, they are permanently locked out from renaming it back.
> > > >
> > > > 2) While the core worker might survive a rename via OID lookups;
> > > > external scripts, extensions, and monitoring tools will likely
> > > > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > > > tools will fail.
> > > >
> > >
> > > I think we shouldn't go out of our way to disallow superusers to
> > > rename pg_conflict schema similar to other cases. We can try to
> > > prevent hard-coding schema names where possible but not sure we can
> > > guarantee that nothing related to pg_conflict schema won't break as
> > > shown by you in the following similar case for pg_conflict.
> > >
> > > > One such example  of scripts breaking is present event in Postgres. I
> > > > did the following, and most of psql commands started failing after
> > > > that due to hard-coded pg_catalog name in them.
> > > >
> > > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > > ALTER SCHEMA
> > > >
> > > > postgres=# \d catalog_new.*
> > > > ERROR:  relation "pg_catalog.pg_class" does not exist
> > > > LINE 5: FROM pg_catalog.pg_class c
> > > >
> > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
> >
> > I can see pg_toast and pg_catalog schema also hard coded in couple of
> > places e.g.
> >
> > listPartitionedTables()
> > {
> > if (!pattern)
> > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> > " AND n.nspname !~ '^pg_toast'\n"
> > " AND n.nspname <> 'information_schema'\n");
> > }
> >
> > I will analyze which all places we are hardcoding, I think on server
> > side code we can easily avoid but from client side e.g. describe we
> > might need to invent a way to identify the schema name, or we might
> > have to store it somewhere in pg_subscription etc, I don't think we
> > should go that route.
>
> Here is updated patch set
>
> Open comments:
> 1. Analyze and avoid hardcoding the 'pg_conflict' schema name wherever possible
> 2. change the way we display clt in \dRs+
> 3. Transfer the clt ownership when subscription ownership has change
> (Note: I have coded a poc for this but still checking whether it works
> in all cases)
>
> I will send the revised version by end of this week after fixing these
> open comments as well.

For the ownership change, the simple change in [1] works fine. There
is another issue, though: currently we can assign subscription
ownership to any user, even one that doesn't have
pg_create_subscription. That may be fine, since the user is not
creating the subscription, but the question now is how to manage
permissions on the conflict log table; see the test in [2] below.


[1]
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index a2de57e17b4..c9fac56714e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
        form->subowner = newOwnerId;
        CatalogTupleUpdate(rel, &tup->t_self, tup);
+       /* Update owner of the conflict log table if it exists */
+       if (OidIsValid(form->subconflictlogrelid))
+               ATExecChangeOwner(form->subconflictlogrelid, newOwnerId, true, AccessExclusiveLock);
+
        /* Update owner dependency reference */
        changeDependencyOnOwner(SubscriptionRelationId,
                                                        form->oid,

[2]
-- Test showing that the table's ownership changes. However, the new
owner will have access issues on the pg_conflict_log table, because
this user does not have the pg_create_subscription role. I haven't yet
checked whether the problems are limited to CLT access or whether
other subscription management is affected as well.

postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
relname = 'pg_conflict_log_16406';
        relname        | relowner
-----------------------+----------
 pg_conflict_log_16406 |       10
(1 row)

postgres[557253]=# CREATE USER test;
CREATE ROLE
postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
ALTER SUBSCRIPTION
postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
relname = 'pg_conflict_log_16406';
        relname        | relowner
-----------------------+----------
 pg_conflict_log_16406 |    16410
(1 row)


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-07T03:26:36Z

On Wed, May 6, 2026 at 4:55 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > > 2.
> > > > > > > +typedef enum ConflictLogDest
> > > > > > > +{
> > > > > > > + /* Log conflicts to the server logs */
> > > > > > > + CONFLICT_LOG_DEST_LOG   = 1 << 0,   /* 0x01 */
> > > > > > > +
> > > > > > > + /* Log conflicts to an internally managed conflict log table */
> > > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
> > > > > > > +
> > > > > > > + /* Convenience bitmask for all supported destinations */
> > > > > > > + CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
> > > > > > > +} ConflictLogDest;
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * Array mapping for converting internal enum to string.
> > > > > > > + */
> > > > > > > +static const char *const ConflictLogDestNames[] = {
> > > > > > > + [CONFLICT_LOG_DEST_LOG] = "log",
> > > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table",
> > > > > > > + [CONFLICT_LOG_DEST_ALL] = "all"
> > > > > > > +};
> > > > > > >
> > > > > > > Defining an array this way could be an Array size issue. Actually the
> > > > > > > array has just three elements so the last element should be at
> > > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will
> > > > > > > be ConflictLogDestNames[3]. Can we define by referring the following
> > > > > > > existing way:
> > > > >
> > > > > I was analyzing this because I remember we were initially using the
> > > > > format you suggested and switched to the bit format to enable direct
> > > > > bitwise operations elsewhere.  I think Peter suggested that [1], and
> > > > > the argument was that the bitwise operation is easy if we represent
> > > > > them as a bit. Also, since we would not have too many options, the
> > > > > array size shouldn't be an issue.  But I understand your point: adding
> > > > > more elements will cause the array size to grow very fast as this is
> > > > > using sparse array.  Let's see what others think about this, and then
> > > > > we can decide whether to change it back?
> > > > >
> > > >
> > > > The benefit of the current approach is that checking whether the
> > > > destination is TABLE becomes straightforward:
> > > >
> > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)
> > > >
> > > > if we go by regular enum values (simialr to XLogSource), then it will be:
> > > >
> > > >  if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
> > > >      opts.logdest == CONFLICT_LOG_DEST_ALL)
> > >
> > > Right
> > >
> > > > For ease of extending the enum and its corresponding text mappings, my
> > > > personal preference is still the regular (non-bitwise) enum approach.
> > >
> > > Yeah, that's my personal preference too.  But Peter had strong stand
> > > on keeping as bitwise so that we can directly use
> > > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations.
> > > Since this array shouldn't have many options, a sparse array is not an
> > > issue.  So lets see what @Peter Smith has to say here and then we can
> > > build a concensus on this.
> > >
> > > > But if we anticipate adding more destination options in the future
> > > > that would be covered by ALL, checking for those in code could lead to
> > > > growing chains of OR conditions, whereas the bitwise approach scales
> > > > more cleanly in that respect. So I think the choice depends on what
> > > > kinds of future extensions we expect.
> > > >
> > > > Do we have plans to add more options that would naturally fall under
> > > > ALL? Or do we instead expect additions that are mutually exclusive;
> > > > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > > > not make sense to group under ALL in the same way?
> > >
> > > Currently, I haven't considered which options would naturally fall
> > > under "ALL." Perhaps if we plan targets other than logs and files,
> > > those might also fall under "ALL."
> >
> > I have fixed all the reported comments except these four.
>
> Few minor comments:
> 1) Now that we create the table in pg_conflict system schema where
> other users cannot create the table, is there a scenario where this is
> possible?
>     /*
>      * Check for an existing table with the sname name in the
> pg_conflict namespace.
>      * A collision  should not occur under normal operation, but we
> must handle cases
>      * where a table has been created manually.
>      */
>     if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
>         ereport(ERROR,
>                 (errcode(ERRCODE_DUPLICATE_TABLE),
>                  errmsg("conflict log table pg_conflict.\"%s\" already
> exists", relname),
>                  errhint("A table with the same name already exists. "
>                          "To proceed, drop the existing table and retry.")));
>

It is possible to hit it with allow_system_table_mods=on. See issue 1
raised by Nisha in [1].

[1]: https://www.postgresql.org/message-id/CABdArM6jpLnzC5O%3DX48RpFXRmAr5WOSHJtw0ebT%2B7Wmb-WdfvQ%40mail.gmail.com

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-07T04:31:37Z

On Thu, May 7, 2026 at 8:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 7:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > Few comments:
> > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > > > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > > > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > > > > ALTER SCHEMA
> > > > > >
> > > > >
> > > > > I agree that we allow renaming other schemas including pg_toast, but I
> > > > > am not sure if this is consciously made decision, see BUG #18281 ast
> > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> > > > >
> > > > > 1) Because Postgres explicitly blocks renaming schemas to a name
> > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > > > > something else, they are permanently locked out from renaming it back.
> > > > >
> > > > > 2) While the core worker might survive a rename via OID lookups;
> > > > > external scripts, extensions, and monitoring tools will likely
> > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > > > > tools will fail.
> > > > >
> > > >
> > > > I think we shouldn't go out of our way to disallow superusers to
> > > > rename pg_conflict schema similar to other cases. We can try to
> > > > prevent hard-coding schema names where possible but not sure we can
> > > > guarantee that nothing related to pg_conflict schema won't break as
> > > > shown by you in the following similar case for pg_conflict.
> > > >
> > > > > One such example  of scripts breaking is present event in Postgres. I
> > > > > did the following, and most of psql commands started failing after
> > > > > that due to hard-coded pg_catalog name in them.
> > > > >
> > > > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > > > ALTER SCHEMA
> > > > >
> > > > > postgres=# \d catalog_new.*
> > > > > ERROR:  relation "pg_catalog.pg_class" does not exist
> > > > > LINE 5: FROM pg_catalog.pg_class c
> > > > >
> > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
> > >
> > > I can see pg_toast and pg_catalog schema also hard coded in couple of
> > > places e.g.
> > >
> > > listPartitionedTables()
> > > {
> > > if (!pattern)
> > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> > > " AND n.nspname !~ '^pg_toast'\n"
> > > " AND n.nspname <> 'information_schema'\n");
> > > }
> > >
> > > I will analyze which all places we are hardcoding, I think on server
> > > side code we can easily avoid but from client side e.g. describe we
> > > might need to invent a way to identify the schema name, or we might
> > > have to store it somewhere in pg_subscription etc, I don't think we
> > > should go that route.
> >
> > Here is updated patch set
> >
> > Open comments:
> > 1. Analyze and avoid hardcoding the 'pg_conflict' schema name wherever possible
> > 2. change the way we display clt in \dRs+
> > 3. Transfer the clt ownership when subscription ownership has change
> > (Note: I have coded a poc for this but still checking whether it works
> > in all cases)
> >
> > I will send the revised version by end of this week after fixing these
> > open comments as well.
>
> So for the ownership change, this simple change[1] is working fine,
> but there is another issue that currently we can assign subscription
> nownership to any user even that doesn't have pg_create_subscription
> maybe that should be fine as it is not creating the subscription but
> now question is how to manage the permission on the conflict log table
> see below test[2]
>
>
> [1[]
> diff --git a/src/backend/commands/subscriptioncmds.c
> b/src/backend/commands/subscriptioncmds.c
> index a2de57e17b4..c9fac56714e 100644
> --- a/src/backend/commands/subscriptioncmds.c
> +++ b/src/backend/commands/subscriptioncmds.c
> @@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel,
> HeapTuple tup, Oid newOwnerId)
>         form->subowner = newOwnerId;
>         CatalogTupleUpdate(rel, &tup->t_self, tup);
> +       /* Update owner of the conflict log table if it exists */
> +       if (OidIsValid(form->subconflictlogrelid))
> +               ATExecChangeOwner(form->subconflictlogrelid,
> newOwnerId, true, AccessExclusiveLock);
> +
>         /* Update owner dependency reference */
>         changeDependencyOnOwner(SubscriptionRelationId,
>                                                         form->oid,
>
> [2]
> -- test to show the ownership is getting changed for the table, but
> now this user will have access issue on the pg_conflict_log table as
> this user do not have pg_create_subscription role, I haven't yet
> checked whether the problems are only related to clt access or there
> would be issue for other subcription management as well.
>
> postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> relname = 'pg_conflict_log_16406';
>         relname        | relowner
> -----------------------+----------
>  pg_conflict_log_16406 |       10
> (1 row)
>
> postgres[557253]=# CREATE USER test;
> CREATE ROLE
> postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
> ALTER SUBSCRIPTION
> postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> relname = 'pg_conflict_log_16406';
>         relname        | relowner
> -----------------------+----------
>  pg_conflict_log_16406 |    16410
> (1 row)
>

During my testing, I initially found it strange that subscription
ownership can be assigned via ALTER SUBSCRIPTION to a user without
pg_create_subscription. But that is the existing behaviour on HEAD.
Now coming to our use case around it.

postgres=# create user user1;
CREATE ROLE
postgres=#  ALTER SUBSCRIPTION sub1 OWNER TO user1;
ALTER SUBSCRIPTION
postgres=# SELECT relowner::regrole FROM pg_class WHERE relname =
'pg_conflict_log_16392';
 relowner
----------
user1

As Dilip stated, user1 owns the table but cannot access or truncate it.

postgres=> select * from pg_conflict.pg_conflict_log_16392;
ERROR:  permission denied for schema pg_conflict

postgres=> truncate pg_conflict.pg_conflict_log_16392;
ERROR:  permission denied for schema pg_conflict

It looks odd at first, but I think we have exactly the same behaviour
for TOAST tables:

--as superuser:
postgres=# CREATE TABLE user_data (id int, big_text text);
CREATE TABLE

postgres=# SELECT reltoastrelid::regclass FROM pg_class WHERE relname
= 'user_data';
      reltoastrelid
-------------------------
 pg_toast.pg_toast_16399

postgres=# SELECT * FROM pg_toast.pg_toast_16399;
 chunk_id | chunk_seq | chunk_data
----------+-----------+------------
(0 rows)


postgres=# alter table user_data owner to user1;
ALTER TABLE

--toast table ownership got changed:
postgres=# \dt+ pg_toast.pg_toast_16399
  Schema  |      Name      |    Type     | Owner |
----------+----------------+-------------+-------+-
 pg_toast | pg_toast_16399 | TOAST table | user1 |

As user1:
postgres=> SELECT * FROM pg_toast.pg_toast_16399;
ERROR:  permission denied for schema pg_toast

So the behaviour is similar to our case. IMO, the best we can do is
document it well, with something like:

Note: Conflict log tables reside in the restricted pg_conflict schema.
To query or truncate these logs, a user must be a superuser or have
the pg_create_subscription privilege. A subscription owner lacking
these privileges will not be able to access or purge conflict log
tables.

Thoughts?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-07T06:46:34Z

On Thu, May 7, 2026 at 10:01 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, May 7, 2026 at 8:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> >
> > So for the ownership change, this simple change[1] is working fine,
> > but there is another issue that currently we can assign subscription
> > nownership to any user even that doesn't have pg_create_subscription
> > maybe that should be fine as it is not creating the subscription but
> > now question is how to manage the permission on the conflict log table
> > see below test[2]
> >
> >
> > [1[]
> > diff --git a/src/backend/commands/subscriptioncmds.c
> > b/src/backend/commands/subscriptioncmds.c
> > index a2de57e17b4..c9fac56714e 100644
> > --- a/src/backend/commands/subscriptioncmds.c
> > +++ b/src/backend/commands/subscriptioncmds.c
> > @@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel,
> > HeapTuple tup, Oid newOwnerId)
> >         form->subowner = newOwnerId;
> >         CatalogTupleUpdate(rel, &tup->t_self, tup);
> > +       /* Update owner of the conflict log table if it exists */
> > +       if (OidIsValid(form->subconflictlogrelid))
> > +               ATExecChangeOwner(form->subconflictlogrelid,
> > newOwnerId, true, AccessExclusiveLock);
> > +
> >         /* Update owner dependency reference */
> >         changeDependencyOnOwner(SubscriptionRelationId,
> >                                                         form->oid,
> >
> > [2]
> > -- test to show the ownership is getting changed for the table, but
> > now this user will have access issue on the pg_conflict_log table as
> > this user do not have pg_create_subscription role, I haven't yet
> > checked whether the problems are only related to clt access or there
> > would be issue for other subcription management as well.
> >
> > postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> > relname = 'pg_conflict_log_16406';
> >         relname        | relowner
> > -----------------------+----------
> >  pg_conflict_log_16406 |       10
> > (1 row)
> >
> > postgres[557253]=# CREATE USER test;
> > CREATE ROLE
> > postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
> > ALTER SUBSCRIPTION
> > postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> > relname = 'pg_conflict_log_16406';
> >         relname        | relowner
> > -----------------------+----------
> >  pg_conflict_log_16406 |    16410
> > (1 row)
> >
>
> During my testing, I initally found it strange that user without
> pg_create_subscription is allowed to perform ALTER Sub. But that is
> base/head behaviour. Now coming to our use-case around it.
>
> postgres=# create user user1;
> CREATE ROLE
> postgres=#  ALTER SUBSCRIPTION sub1 OWNER TO user1;
> ALTER SUBSCRIPTION
> postgres=# SELECT relowner::regrole FROM pg_class WHERE relname =
> 'pg_conflict_log_16392';
>  relowner
> ----------
> user1
>
> As Dilip stated, user1 owns the table but cannot access or truncate it.
>
> postgres=> select * from pg_conflict.pg_conflict_log_16392;
> ERROR:  permission denied for schema pg_conflict
>
> postgres=> truncate pg_conflict.pg_conflict_log_16392;
> ERROR:  permission denied for schema pg_conflict
>
> It looks weird at first, but I think we have exact same beahviour for
> toast table:
>
> --as superuser:
> postgres=# CREATE TABLE user_data (id int, big_text text);
> CREATE TABLE
>
> postgres=# SELECT reltoastrelid::regclass FROM pg_class WHERE relname
> = 'user_data';
>       reltoastrelid
> -------------------------
>  pg_toast.pg_toast_16399
>
> postgres=# SELECT * FROM pg_toast.pg_toast_16399;
>  chunk_id | chunk_seq | chunk_data
> ----------+-----------+------------
> (0 rows)
>
>
> postgres=# alter table user_data owner to user1;
> ALTER TABLE
>
> --toast table ownership got changed:
> postgres=# \dt+ pg_toast.pg_toast_16399
>   Schema  |      Name      |    Type     | Owner |
> ----------+----------------+-------------+-------+-
>  pg_toast | pg_toast_16399 | TOAST table | user1 |
>
> As user1:
> postgres=> SELECT * FROM pg_toast.pg_toast_16399;
> ERROR:  permission denied for schema pg_toast
>
> So behaviour is similar to our case.
>

I am not sure the case is the same for CLT tables. Won't it be risky to
allow changing the owner of a subscription to a user that doesn't have
the pg_create_subscription privilege? Because now the background worker
will be able to insert into the CLT table, whereas for regular tables it
will still use the table owner's privilege (who originally created the
table) as run_as_owner is false. So, shouldn't we disallow changing to
an owner who doesn't have the pg_create_subscription privilege when a
CLT table is associated with a subscription, similar to what we do for
the SERVER case? (See comment: * If the subscription uses a server,
check that the new owner has USAGE... in
AlterSubscriptionOwner_internal())
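
The restriction suggested above can be modelled in a few lines of
standalone C. This is only a sketch under stated assumptions: RoleModel,
SubscriptionModel and owner_change_allowed() are hypothetical names
invented for illustration, not PostgreSQL structures or functions, and
treating a superuser as always acceptable is an assumption as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a role; not a PostgreSQL struct. */
typedef struct
{
    bool is_superuser;
    bool has_pg_create_subscription;
} RoleModel;

/* Hypothetical, simplified view of a subscription. */
typedef struct
{
    bool has_conflict_log_table;
} SubscriptionModel;

/*
 * Decide whether ownership may move to new_owner.
 *
 * Without a conflict log table nothing changes compared to the existing
 * behaviour, so the transfer is allowed. With one, the new owner must be
 * able to reach the table, which this model approximates as "superuser
 * or holder of pg_create_subscription" (an assumption).
 */
static bool
owner_change_allowed(const SubscriptionModel *sub, const RoleModel *new_owner)
{
    if (!sub->has_conflict_log_table)
        return true;

    return new_owner->is_superuser || new_owner->has_pg_create_subscription;
}
```

In this model, a role such as user1 from the test above (neither
superuser nor a member of pg_create_subscription) would be rejected as
the new owner only when the subscription has a conflict log table.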

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-07T08:59:46Z

On Wed, 6 May 2026 at 19:35, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > Few comments:
> > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > > > ALTER SCHEMA
> > > > >
> > > >
> > > > I agree that we allow renaming other schemas including pg_toast, but I
> > > > am not sure if this is consciously made decision, see BUG #18281 ast
> > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> > > >
> > > > 1) Because Postgres explicitly blocks renaming schemas to a name
> > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > > > something else, they are permanently locked out from renaming it back.
> > > >
> > > > 2) While the core worker might survive a rename via OID lookups;
> > > > external scripts, extensions, and monitoring tools will likely
> > > > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > > > tools will fail.
> > > >
> > >
> > > I think we shouldn't go out of our way to disallow superusers to
> > > rename pg_conflict schema similar to other cases. We can try to
> > > prevent hard-coding schema names where possible but not sure we can
> > > guarantee that nothing related to pg_conflict schema won't break as
> > > shown by you in the following similar case for pg_conflict.
> > >
> > > > One such example  of scripts breaking is present event in Postgres. I
> > > > did the following, and most of psql commands started failing after
> > > > that due to hard-coded pg_catalog name in them.
> > > >
> > > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > > ALTER SCHEMA
> > > >
> > > > postgres=# \d catalog_new.*
> > > > ERROR:  relation "pg_catalog.pg_class" does not exist
> > > > LINE 5: FROM pg_catalog.pg_class c
> > > >
> > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
> >
> > I can see pg_toast and pg_catalog schema also hard coded in couple of
> > places e.g.
> >
> > listPartitionedTables()
> > {
> > if (!pattern)
> > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> > " AND n.nspname !~ '^pg_toast'\n"
> > " AND n.nspname <> 'information_schema'\n");
> > }
> >
> > I will analyze which all places we are hardcoding, I think on server
> > side code we can easily avoid but from client side e.g. describe we
> > might need to invent a way to identify the schema name, or we might
> > have to store it somewhere in pg_subscription etc, I don't think we
> > should go that route.
>
> Here is updated patch set

Thanks for the updated patches. The posted v30 patch has a few issues:
There is an assertion failure at [1]:
TRAP: failed Assert("conflictlogrel != NULL"), File:
"../src/backend/replication/logical/conflict.c", Line: 195, PID: 59658
0xb3a472 <ExceptionalCondition+0x82> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x9433b8 <ReportApplyConflict+0x13c8> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x7d91fb <CheckAndReportConflict+0x2cb> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x7d8b4b <ExecSimpleRelationInsert+0x10b> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x96525b <apply_dispatch+0x23eb> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x966150 <start_apply+0x310> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x967010 <run_apply_worker+0x290> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x966d6d <ApplyWorkerMain+0x1d> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x90ff0c <BackgroundWorkerMain+0x1cc> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x914a25 <postmaster_child_launch+0x145> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x917b77 <maybe_start_bgworkers+0x1d7> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x9198f5 <ServerLoop+0x1c65> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x917156 <PostmasterMain+0x1116> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x83c16d <main+0x48d> at
/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres

There is the following compiler warning (an error under -Werror) at [2]:
[14:55:07.472] conflict.c:187:6: error: variable 'log_dest_clt' is
used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
[14:55:07.472]   187 |         if (dest == CONFLICT_LOG_DEST_TABLE ||
dest == CONFLICT_LOG_DEST_ALL)
[14:55:07.472]       |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[14:55:07.472] conflict.c:193:6: note: uninitialized use occurs here
[14:55:07.472]   193 |         if (log_dest_clt)
[14:55:07.472]       |             ^~~~~~~~~~~~
[14:55:07.472] conflict.c:187:2: note: remove the 'if' if its
condition is always true
[14:55:07.472]   187 |         if (dest == CONFLICT_LOG_DEST_TABLE ||
dest == CONFLICT_LOG_DEST_ALL)
[14:55:07.472]       |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[14:55:07.472]   188 |                 log_dest_clt = true;
[14:55:07.472] conflict.c:176:21: note: initialize the variable
'log_dest_clt' to silence this warning
[14:55:07.472]   176 |         bool                    log_dest_clt;
[14:55:07.472]       |                                             ^
[14:55:07.472]       |                                              = false
[14:55:07.472] 1 error generated.

[1] - https://api.cirrus-ci.com/v1/artifact/task/5630092254117888/testrun/build/testrun/subscription/026_stats/log/026_stats_subscriber.log
[2] - https://cirrus-ci.com/task/5770829742473216

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-07T10:45:38Z

On Thu, 7 May 2026 at 14:29, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 6 May 2026 at 19:35, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > Few comments:
> > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok
> > > > > > as we allow other sysem schema like pg_catalog and pg_toast also.
> > > > > > postgres=# alter schema pg_conflict rename to test_conflict;
> > > > > > ALTER SCHEMA
> > > > > >
> > > > >
> > > > > I agree that we allow renaming other schemas including pg_toast, but I
> > > > > am not sure if this is consciously made decision, see BUG #18281 ast
> > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons:
> > > > >
> > > > > 1) Because Postgres explicitly blocks renaming schemas to a name
> > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to
> > > > > something else, they are permanently locked out from renaming it back.
> > > > >
> > > > > 2) While the core worker might survive a rename via OID lookups;
> > > > > external scripts, extensions, and monitoring tools will likely
> > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these
> > > > > tools will fail.
> > > > >
> > > >
> > > > I think we shouldn't go out of our way to disallow superusers to
> > > > rename pg_conflict schema similar to other cases. We can try to
> > > > prevent hard-coding schema names where possible but not sure we can
> > > > guarantee that nothing related to pg_conflict schema won't break as
> > > > shown by you in the following similar case for pg_conflict.
> > > >
> > > > > One such example  of scripts breaking is present event in Postgres. I
> > > > > did the following, and most of psql commands started failing after
> > > > > that due to hard-coded pg_catalog name in them.
> > > > >
> > > > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > > > ALTER SCHEMA
> > > > >
> > > > > postgres=# \d catalog_new.*
> > > > > ERROR:  relation "pg_catalog.pg_class" does not exist
> > > > > LINE 5: FROM pg_catalog.pg_class c
> > > > >
> > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
> > >
> > > I can see pg_toast and pg_catalog schema also hard coded in couple of
> > > places e.g.
> > >
> > > listPartitionedTables()
> > > {
> > > if (!pattern)
> > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> > > " AND n.nspname !~ '^pg_toast'\n"
> > > " AND n.nspname <> 'information_schema'\n");
> > > }
> > >
> > > I will analyze which all places we are hardcoding, I think on server
> > > side code we can easily avoid but from client side e.g. describe we
> > > might need to invent a way to identify the schema name, or we might
> > > have to store it somewhere in pg_subscription etc, I don't think we
> > > should go that route.
> >
> > Here is updated patch set
>
> Thanks for the updated patches, the v30 version patch posted has few issues:
> There is an assert at [1]:
> TRAP: failed Assert("conflictlogrel != NULL"), File:
> "../src/backend/replication/logical/conflict.c", Line: 195, PID: 59658
> 0xb3a472 <ExceptionalCondition+0x82> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x9433b8 <ReportApplyConflict+0x13c8> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x7d91fb <CheckAndReportConflict+0x2cb> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x7d8b4b <ExecSimpleRelationInsert+0x10b> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x96525b <apply_dispatch+0x23eb> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x966150 <start_apply+0x310> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x967010 <run_apply_worker+0x290> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x966d6d <ApplyWorkerMain+0x1d> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x90ff0c <BackgroundWorkerMain+0x1cc> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x914a25 <postmaster_child_launch+0x145> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x917b77 <maybe_start_bgworkers+0x1d7> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x9198f5 <ServerLoop+0x1c65> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x917156 <PostmasterMain+0x1116> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
> 0x83c16d <main+0x48d> at
> /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
>
> There are the following warnings at [2]:
> [14:55:07.472] conflict.c:187:6: error: variable 'log_dest_clt' is
> used uninitialized whenever 'if' condition is false
> [-Werror,-Wsometimes-uninitialized]
> [14:55:07.472]   187 |         if (dest == CONFLICT_LOG_DEST_TABLE ||
> dest == CONFLICT_LOG_DEST_ALL)
> [14:55:07.472]       |
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [14:55:07.472] conflict.c:193:6: note: uninitialized use occurs here
> [14:55:07.472]   193 |         if (log_dest_clt)
> [14:55:07.472]       |             ^~~~~~~~~~~~
> [14:55:07.472] conflict.c:187:2: note: remove the 'if' if its
> condition is always true
> [14:55:07.472]   187 |         if (dest == CONFLICT_LOG_DEST_TABLE ||
> dest == CONFLICT_LOG_DEST_ALL)
> [14:55:07.472]       |
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [14:55:07.472]   188 |                 log_dest_clt = true;
> [14:55:07.472] conflict.c:176:21: note: initialize the variable
> 'log_dest_clt' to silence this warning
> [14:55:07.472]   176 |         bool                    log_dest_clt;
> [14:55:07.472]       |                                             ^
> [14:55:07.472]       |                                              = false
> [14:55:07.472] 1 error generated.
>
> [1] - https://api.cirrus-ci.com/v1/artifact/task/5630092254117888/testrun/build/testrun/subscription/026_stats/log/026_stats_subscriber.log
> [2] - https://cirrus-ci.com/task/5770829742473216

In the code below, log_dest_clt is declared without initialization and
is later assigned only for specific dest values. This leaves a bug when
dest is set to CONFLICT_LOG_DEST_LOG: in that case, log_dest_clt
retains an indeterminate stack value and may evaluate to true depending
on the garbage present on the stack. That can incorrectly enter the CLT
insertion path and trigger the assertion failure.

...
@@ -131,30 +170,92 @@ ReportApplyConflict(EState *estate,
ResultRelInfo *relinfo, int elevel,
                                        ConflictType type,
TupleTableSlot *searchslot,
                                        TupleTableSlot *remoteslot,
List *conflicttuples)
 {
-       Relation        localrel = relinfo->ri_RelationDesc;
-       StringInfoData err_detail;
+       Relation                localrel = relinfo->ri_RelationDesc;
+       ConflictLogDest dest;
+       Relation                conflictlogrel;
+       bool                    log_dest_clt;
+       bool                    log_dest_logfile;
...

...
-       pgstat_report_subscription_conflict(MySubscription->oid, type);
+       if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
+               log_dest_clt = true;
+       if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
+               log_dest_logfile = true;

-       ereport(elevel,
-                       errcode_apply_conflict(type),
-                       errmsg("conflict detected on relation
\"%s.%s\": conflict=%s",
-
get_namespace_name(RelationGetNamespace(localrel)),
-                                  RelationGetRelationName(localrel),
-                                  ConflictTypeNames[type]),
-                       errdetail_internal("%s", err_detail.data));
+       /* Insert to table if requested. */
+       if (log_dest_clt)
+       {
+               Assert(conflictlogrel != NULL);
...

The attached v31 version fixes this issue by initializing the variable.
It is also rebased, and it includes the rebased version of the
'Preserve conflict log destination and subscription OID for
subscriptions' patch, which is the 0005 patch.
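
For illustration, the pattern behind the fix can be reduced to a
standalone C sketch. The enum below is a hypothetical stand-in that only
borrows the value names from the diff, and resolve_log_dest() is an
invented helper; the actual v31 change may be shaped differently. The
point is that both flags start from a known value, so no dest value can
leave either one indeterminate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's ConflictLogDest enum. */
typedef enum
{
    CONFLICT_LOG_DEST_LOG,
    CONFLICT_LOG_DEST_TABLE,
    CONFLICT_LOG_DEST_ALL
} ConflictLogDest;

/* Hypothetical helper: derive both destination flags from dest. */
static void
resolve_log_dest(ConflictLogDest dest,
                 bool *log_dest_clt, bool *log_dest_logfile)
{
    /* Initialize first, so every path through the function is defined. */
    *log_dest_clt = false;
    *log_dest_logfile = false;

    if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
        *log_dest_clt = true;
    if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
        *log_dest_logfile = true;
}
```

With this shape, CONFLICT_LOG_DEST_LOG always yields log_dest_clt ==
false, so the CLT insertion path and its assertion are not reached for
log-only subscriptions.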

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-07T11:53:58Z

> The attached v31 version has the changes to fix this issue by
> initializing the variable.
> This also has the rebased version along with the rebased version of
> the 'Preserve conflict log destination and subscription OID for
> subscriptions' patch which is present in the 0005 patch.

Thanks for the patches, please find a few comments on the patches 002 to 004:

1) I noticed that if a non-superuser creates the subscription, but a
superuser later runs:
  ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
then the conflict log table ends up being owned by the superuser instead
of the subscription owner. The apply worker would still be able to
insert into the CLT, but the subscription owner cannot access its
associated conflict log table.

I think this happens because the heap_create_with_catalog() call uses
GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
SUBSCRIPTION, it causes the table to be created under the ALTER
command executor’s ownership instead of the subscription owner.

Since only the subscription owner or a superuser can run ALTER
SUBSCRIPTION, should we always create the table with the subscription
owner as the owner?

2) In GetConflictLogDestAndTable():
+ conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
+ if (conflictlogrel == NULL)
+ elog(ERROR, "could not open conflict log table (OID %u)",
+ conflictlogrelid);
+
+ return conflictlogrel;

I think the "if (conflictlogrel == NULL)" check is unreachable. The
table_open()->relation_open() will error-out if it fails to open the
relation.

3) Minor typo in create_conflict_log_table() comments:
+ /*
+ * Check for an existing table with the sname name in the pg_conflict
namespace.
+ * A collision  should not occur under normal operation, but we must
handle cases
+ * where a table has been created manually.
+ */
==> double space in "A collision  should not"

4) The document patch-0004 is still referring to the old name
"pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-08T02:58:17Z

On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> > The attached v31 version has the changes to fix this issue by
> > initializing the variable.
> > This also has the rebased version along with the rebased version of
> > the 'Preserve conflict log destination and subscription OID for
> > subscriptions' patch which is present in the 0005 patch.
>
> Thanks for the patches, please find a few comments on the patches 002 to 004:
>
> 1) I noticed that if a non-superuser creates the subscription, but a
> superuser later runs:
>   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> then the conflict table ends up being owned by the superuser instead
> of the subscription owner. Though, apply_worker would be able to
> insert into the CLT, but the subscription owner cannot access its
> associated conflict log table,
>
> I think this happens because the heap_create_with_catalog() call uses
> GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> SUBSCRIPTION, it causes the table to be created under the ALTER
> command executor’s ownership instead of the subscription owner.
>
> Since only the subscription owner or a superuser can run ALTER
> SUBSCRIPTION, should we always create the table with the subscription
> owner as the owner?

Yeah that makes sense.

> 2) In GetConflictLogDestAndTable():
> + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> + if (conflictlogrel == NULL)
> + elog(ERROR, "could not open conflict log table (OID %u)",
> + conflictlogrelid);
> +
> + return conflictlogrel;
>
> I think the "if (conflictlogrel == NULL)" check is unreachable. The
> table_open()->relation_open() will error-out if it fails to open the
> relation.

Yeah, that's a valid point.

> 3) Minor typo in create_conflict_log_table() comments:
> + /*
> + * Check for an existing table with the sname name in the pg_conflict
> namespace.
> + * A collision  should not occur under normal operation, but we must
> handle cases
> + * where a table has been created manually.
> + */
> ==> double space in "A collision  should not"
>
> 4) The document patch-0004 is still referring to the old name
> "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".

I will fix these in the next version.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-08T12:09:53Z

On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > > The attached v31 version has the changes to fix this issue by
> > > initializing the variable.
> > > This also has the rebased version along with the rebased version of
> > > the 'Preserve conflict log destination and subscription OID for
> > > subscriptions' patch which is present in the 0005 patch.
> >
> > Thanks for the patches, please find a few comments on the patches 002 to 004:
> >
> > 1) I noticed that if a non-superuser creates the subscription, but a
> > superuser later runs:
> >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > then the conflict table ends up being owned by the superuser instead
> > of the subscription owner. Though, apply_worker would be able to
> > insert into the CLT, but the subscription owner cannot access its
> > associated conflict log table,
> >
> > I think this happens because the heap_create_with_catalog() call uses
> > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > SUBSCRIPTION, it causes the table to be created under the ALTER
> > command executor’s ownership instead of the subscription owner.
> >
> > Since only the subscription owner or a superuser can run ALTER
> > SUBSCRIPTION, should we always create the table with the subscription
> > owner as the owner?
>
> Yeah that makes sense.
>
> > 2) In GetConflictLogDestAndTable():
> > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > + if (conflictlogrel == NULL)
> > + elog(ERROR, "could not open conflict log table (OID %u)",
> > + conflictlogrelid);
> > +
> > + return conflictlogrel;
> >
> > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > table_open()->relation_open() will error-out if it fails to open the
> > relation.
>
> Yeah, that's a valid point.
>
> > 3) Minor typo in create_conflict_log_table() comments:
> > + /*
> > + * Check for an existing table with the sname name in the pg_conflict
> > namespace.
> > + * A collision  should not occur under normal operation, but we must
> > handle cases
> > + * where a table has been created manually.
> > + */
> > ==> double space in "A collision  should not"
> >
> > 4) The document patch-0004 is still referring to the old name
> > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
>
> I will fix these in next version.
>

This version fixes all 4 comments Nisha reported, and 0002 is an add-on
patch to allow ownership transfer. I haven't yet changed the CLT display
with \dRs+ reported by Shveta. I have a work-in-progress patch, but
I couldn't get it to work. I will try to debug that tomorrow or next
week, whenever I get time.

Open Items:
- Add comments explaining the reasoning for the ownership change
- Change the CLT display
- Test cases for ownership change, truncation, deletion, and select
  from a non-superuser owner of the subscription

@vignesh C Your patch needs to be rebased.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-11T06:21:19Z

On Fri, May 8, 2026 at 5:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > > The attached v31 version has the changes to fix this issue by
> > > > initializing the variable.
> > > > This also has the rebased version along with the rebased version of
> > > > the 'Preserve conflict log destination and subscription OID for
> > > > subscriptions' patch which is present in the 0005 patch.
> > >
> > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > >
> > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > superuser later runs:
> > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > then the conflict table ends up being owned by the superuser instead
> > > of the subscription owner. Though, apply_worker would be able to
> > > insert into the CLT, but the subscription owner cannot access its
> > > associated conflict log table,
> > >
> > > I think this happens because the heap_create_with_catalog() call uses
> > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > command executor’s ownership instead of the subscription owner.
> > >
> > > Since only the subscription owner or a superuser can run ALTER
> > > SUBSCRIPTION, should we always create the table with the subscription
> > > owner as the owner?
> >
> > Yeah that makes sense.
> >
> > > 2) In GetConflictLogDestAndTable():
> > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > + if (conflictlogrel == NULL)
> > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > + conflictlogrelid);
> > > +
> > > + return conflictlogrel;
> > >
> > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict
> > > namespace.
> > > + * A collision  should not occur under normal operation, but we must
> > > handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision  should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> to allow ownership transfer.  I haven't yet changed the clt display
> witjh \dRs+ reported by shveta.  I have a work-in-progress patch, but
> I couldn't get it to work.  I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> -  Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.
>
> @vignesh C Your patch needs to be rebased.
>

Few comments on 001:


1)
+ /*
+ * Check for an existing table with the sname name in the pg_conflict
namespace.
+ * A collision  should not occur under normal operation, but we must
handle cases
+ * where a table has been created manually.
+ */

We can extend the comment to mention 'allow_system_table_mods';
otherwise it may be difficult to understand how a table could be
created in pg_conflict.

Suggestion: ...has been created manually when allow_system_table_mods is ON.

2)
+ /* Create conflict log table. */
+ relid = heap_create_with_catalog(relname,
+ PG_CONFLICT_NAMESPACE,

After this, it would be good to have a sanity check on relid before we
start using it:
Assert(relid != InvalidOid);

3)
Currently the structure of CLT is:

+const ConflictLogColumnDef ConflictLogSchema[] = {
+ { .attname = "relid",            .atttypid = OIDOID },
+ { .attname = "schemaname",       .atttypid = TEXTOID },
+ { .attname = "relname",          .atttypid = TEXTOID },
+ { .attname = "conflict_type",    .atttypid = TEXTOID },
+ { .attname = "remote_xid",       .atttypid = XIDOID },
+ { .attname = "remote_commit_lsn",.atttypid = LSNOID },
+ { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "remote_origin",    .atttypid = TEXTOID },
+ { .attname = "replica_identity", .atttypid = JSONOID },
+ { .attname = "remote_tuple",     .atttypid = JSONOID },
+ { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
+};

So if a user has to delete a conflict from the CLT after resolving it,
what is the user-friendly way to do it? IMO, it will be cumbersome
(and perhaps error-prone) to write a query with remote_commit_lsn,
remote_commit_ts, remote_xid etc. in the WHERE clause. Do you (or others)
think we should add a log_id column (perhaps a bigint GENERATED ALWAYS
AS IDENTITY)? This would provide a simple, unique identifier so the user
can easily target a single row (WHERE log_id = 105) or purge a batch
of old conflicts (WHERE log_id < 1000).

4)
When querying pg_subscription, I noticed that the two CLT-related
fields (subconflictlogrelid and subconflictlogdest) are positioned far
apart, making them difficult to track and relate. Do you think we
should place them next to each other? If we do that, it will mean
'subconflictlogdest' coming before 'subconninfo', but it should be
fine (IMO), as it will be right next to 'subconflictlogrelid'.

postgres=# select * from pg_subscription;

  oid  | subdbid | subskiplsn | subname | subowner | subenabled |
subbinary | substream | subtwophasestate | subdisableonerr |
subpasswordrequired | subrunasowner | subfailover |
subretaindeadtuples | submaxretent
ion | subretentionactive | subserver | subconflictlogrelid |
                 subconninfo                            | subslotname
| subsynccommit | subwalrcvtimeout | subpublications |
subconflictlogdes
t | suborigin
-------+---------+------------+---------+----------+------------+-----------+-----------+------------------+-----------------+---------------------+---------------+-------------+---------------------+-------------
----+--------------------+-----------+---------------------+-------------------------------------------------------------------+-------------+---------------+------------------+-----------------+------------------
--+-----------
 16387 |       5 | 0/00000000 | sub1    |       10 | t          | f
     | p         | d                | f               | t
     | f             | f           | f                   |
  0 | f                  |         0 |               16388 |
dbname=postgres host=localhost user=shveta port=5433              |
sub1        | off           | -1               | {pub1}          | all


5)
+-- verify subconflictlogdest is 'log' and relid is 0 (InvalidOid) for
default case

We can mention 'subconflictlogrelid' instead of 'relid'
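
To illustrate point 2, here is a minimal standalone sketch (not the
patch code) of checking the returned OID before it is used. The Oid
typedef and InvalidOid mirror PostgreSQL's definitions;
create_table_stub() is a hypothetical stand-in for
heap_create_with_catalog(), and plain assert() stands in for the
backend's Assert().

```c
#include <assert.h>

typedef unsigned int Oid;           /* same underlying type PostgreSQL uses */
#define InvalidOid ((Oid) 0)

/* Hypothetical stand-in for heap_create_with_catalog(); always succeeds here. */
static Oid
create_table_stub(const char *relname)
{
    (void) relname;                 /* unused in this sketch */
    return (Oid) 16388;
}

/* Create the table and verify the OID before any caller uses it. */
static Oid
create_conflict_log_table_checked(const char *relname)
{
    Oid relid = create_table_stub(relname);

    /* Sanity check, analogous to Assert(relid != InvalidOid) in the backend. */
    assert(relid != InvalidOid);
    return relid;
}
```

The check costs nothing in non-assert builds and documents the
expectation that the creation call never returns an invalid OID.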

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-11T09:07:46Z

Please see the test below:

CREATE USER user1 LOGIN ;
ALTER subscription sub1 owner to user1;

--Now as expected, user1 is able to access, delete or truncate:
postgres=> select count(*) from pg_conflict.pg_conflict_log_16387;
     0

postgres=> delete from pg_conflict.pg_conflict_log_16387;
DELETE 0

--When user1 tries to do insert, it gets error:
postgres=> insert into pg_conflict.pg_conflict_log_16387 values (0);
ERROR:  permission denied for table pg_conflict_log_16387

While the superuser gets:
postgres=# insert into pg_conflict.pg_conflict_log_16387 values (0);
ERROR:  cannot modify or insert data into conflict log table
"pg_conflict_log_16387"
DETAIL:  Conflict log tables are system-managed and only support
cleanup via DELETE or TRUNCATE.
-----

The error for user1 seems less intuitive, since user1 owns
pg_conflict_log_16387. Shouldn't a non-superuser who owns the CLT see
the same error as the superuser? I think the difference is due to the
recent changes made in pg_class_aclmask_ext().
What do others think here?
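
One way to picture the difference, as a toy model only: this is not
PostgreSQL code, and the ordering of the two checks is my assumption.
If the ACL check runs first and superusers bypass it, the owner is
stopped with "permission denied" before the CLT-specific check is ever
reached. The names Cmd, Outcome and clt_dml_outcome() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CMD_INSERT, CMD_DELETE } Cmd;

typedef enum
{
    OUTCOME_OK,                 /* statement is allowed */
    OUTCOME_PERMISSION_DENIED,  /* "permission denied for table ..." */
    OUTCOME_CLT_READ_ONLY       /* "cannot modify or insert data into conflict log table" */
} Outcome;

/* Toy model of which error a session sees for DML on a conflict log table. */
static Outcome
clt_dml_outcome(bool is_superuser, Cmd cmd)
{
    assert(cmd == CMD_INSERT || cmd == CMD_DELETE);

    /*
     * Step 1 (assumed to run first): ACL check.  Superusers bypass it; for
     * everyone else the assumed ACL mask on a CLT does not include INSERT.
     */
    if (!is_superuser && cmd == CMD_INSERT)
        return OUTCOME_PERMISSION_DENIED;

    /* Step 2: CLT-specific check, reached only when step 1 passed. */
    if (cmd == CMD_INSERT)
        return OUTCOME_CLT_READ_ONLY;

    return OUTCOME_OK;          /* DELETE is permitted for cleanup */
}
```

Under this model, making both users see the same message would mean
either running the CLT-specific check before the ACL check or leaving
INSERT in the owner's ACL mask so the later check is the one that fires.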

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-11T09:29:22Z

On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > > The attached v31 version has the changes to fix this issue by
> > > > initializing the variable.
> > > > This also has the rebased version along with the rebased version of
> > > > the 'Preserve conflict log destination and subscription OID for
> > > > subscriptions' patch which is present in the 0005 patch.
> > >
> > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > >
> > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > superuser later runs:
> > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > then the conflict table ends up being owned by the superuser instead
> > > of the subscription owner. Though, apply_worker would be able to
> > > insert into the CLT, but the subscription owner cannot access its
> > > associated conflict log table,
> > >
> > > I think this happens because the heap_create_with_catalog() call uses
> > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > command executor’s ownership instead of the subscription owner.
> > >
> > > Since only the subscription owner or a superuser can run ALTER
> > > SUBSCRIPTION, should we always create the table with the subscription
> > > owner as the owner?
> >
> > Yeah that makes sense.
> >
> > > 2) In GetConflictLogDestAndTable():
> > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > + if (conflictlogrel == NULL)
> > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > + conflictlogrelid);
> > > +
> > > + return conflictlogrel;
> > >
> > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict
> > > namespace.
> > > + * A collision  should not occur under normal operation, but we must
> > > handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision  should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> to allow ownership transfer.  I haven't yet changed the clt display
> witjh \dRs+ reported by shveta.  I have a work-in-progress patch, but
> I couldn't get it to work.  I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> -  Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.
>
> @vignesh C Your patch needs to be rebased.
>
Hi Dilip,

I started reviewing the patches.
Here are minor comments for 0001 patch:

1. If allow_system_table_mods=on, we can add/drop columns of conflict log
tables. But the same is prohibited for pg_toast and other catalog tables,
where we get the following errors:

postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
ERROR:  ALTER action DROP COLUMN cannot be performed on relation
"pg_toast_16413"
DETAIL:  This operation is not supported for TOAST tables.

postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
ERROR:  cannot drop column pubname of table pg_publication because it
is required by the database system
postgres=# ALTER TABLE pg_description DROP COLUMN description;
ERROR:  cannot drop column description of table pg_description because
it is required by the database system

postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
ALTER TABLE

Should we prohibit it for conflict log tables as well?

2. Should we also have a 'dropped conflict log table' NOTICE, when the
subscription is dropped?
postgres=# CREATE SUBSCRIPTION sub1 connection 'dbname=postgres
host=localhost port=5432' publication pub1 WITH
(conflict_log_destination = 'TABLE');
NOTICE:  created conflict log table
"pg_conflict.pg_conflict_log_16394" for subscription "sub1"
NOTICE:  created replication slot "sub1" on publisher
CREATE SUBSCRIPTION
postgres=# drop subscription sub1;
NOTICE:  dropped replication slot "sub1" on publisher
DROP SUBSCRIPTION

3. Typo:
+   /*
+    * Check for an existing table with the sname name in the
pg_conflict namespace.

sname -> same

Thanks,
Shlok Kyal

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-11T10:43:52Z

On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > > The attached v31 version has the changes to fix this issue by
> > > > initializing the variable.
> > > > This also has the rebased version along with the rebased version of
> > > > the 'Preserve conflict log destination and subscription OID for
> > > > subscriptions' patch which is present in the 0005 patch.
> > >
> > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > >
> > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > superuser later runs:
> > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > then the conflict table ends up being owned by the superuser instead
> > > of the subscription owner. Though, apply_worker would be able to
> > > insert into the CLT, but the subscription owner cannot access its
> > > associated conflict log table,
> > >
> > > I think this happens because the heap_create_with_catalog() call uses
> > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > command executor’s ownership instead of the subscription owner.
> > >
> > > Since only the subscription owner or a superuser can run ALTER
> > > SUBSCRIPTION, should we always create the table with the subscription
> > > owner as the owner?
> >
> > Yeah that makes sense.
> >
> > > 2) In GetConflictLogDestAndTable():
> > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > + if (conflictlogrel == NULL)
> > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > + conflictlogrelid);
> > > +
> > > + return conflictlogrel;
> > >
> > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict
> > > namespace.
> > > + * A collision  should not occur under normal operation, but we must
> > > handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision  should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> to allow ownership transfer.  I haven't yet changed the clt display
> witjh \dRs+ reported by shveta.  I have a work-in-progress patch, but
> I couldn't get it to work.  I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> -  Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.

The attached patch addresses the remaining open items and is provided
separately as patch 0005.  @Dilip Kumar, if the changes look good to
you, please merge them into the corresponding patch.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-12T06:00:54Z

On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote:
>
> Few comments on 001:
> 3)
> Currently the structure of CLT is:
>
> +const ConflictLogColumnDef ConflictLogSchema[] = {
> + { .attname = "relid",            .atttypid = OIDOID },
> + { .attname = "schemaname",       .atttypid = TEXTOID },
> + { .attname = "relname",          .atttypid = TEXTOID },
> + { .attname = "conflict_type",    .atttypid = TEXTOID },
> + { .attname = "remote_xid",       .atttypid = XIDOID },
> + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> + { .attname = "remote_origin",    .atttypid = TEXTOID },
> + { .attname = "replica_identity", .atttypid = JSONOID },
> + { .attname = "remote_tuple",     .atttypid = JSONOID },
> + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> +};
>
> So if user has to delete a conflict from CLT after resolving it, then
> what is the user-friendly way to do it? IMO, it will be cumbersome
> (and perhaps error-prone) to write a query with remote_commit_lsn,
> remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others)
> think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS
> AS IDENTITY). This provides a simple, unique identifier so the user
> can easily target a single row (WHERE log_id = 105) or purge a batch
> of old conflicts (WHERE log_id < 1000).

I agree with this. I could think of a few other possible approaches as well.
The following options seem possible to make row identification/deletion easier:
a) Use existing remote_commit_ts
ex:
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
'2026-05-12 10:25:46.483899+05:30';
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
now() - interval '100 minutes';
b) Use existing system column ctid
ex:
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
c) Add a dedicated identifier conflict_id column as Shveta said
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
d) Add a local conflict_logged_at timestamp
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
= '2026-05-12 10:25:46.483899+05:30';
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
< now() - interval '100 minutes';

I'm not sure which approach would be best here.
Thoughts?

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-12T06:29:58Z

On Tue, May 12, 2026 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Few comments on 001:
> > 3)
> > Currently the structure of CLT is:
> >
> > +const ConflictLogColumnDef ConflictLogSchema[] = {
> > + { .attname = "relid",            .atttypid = OIDOID },
> > + { .attname = "schemaname",       .atttypid = TEXTOID },
> > + { .attname = "relname",          .atttypid = TEXTOID },
> > + { .attname = "conflict_type",    .atttypid = TEXTOID },
> > + { .attname = "remote_xid",       .atttypid = XIDOID },
> > + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> > + { .attname = "remote_origin",    .atttypid = TEXTOID },
> > + { .attname = "replica_identity", .atttypid = JSONOID },
> > + { .attname = "remote_tuple",     .atttypid = JSONOID },
> > + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> > +};
> >
> > So if user has to delete a conflict from CLT after resolving it, then
> > what is the user-friendly way to do it? IMO, it will be cumbersome
> > (and perhaps error-prone) to write a query with remote_commit_lsn,
> > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others)
> > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS
> > AS IDENTITY). This provides a simple, unique identifier so the user
> > can easily target a single row (WHERE log_id = 105) or purge a batch
> > of old conflicts (WHERE log_id < 1000).
>
> I agree with this. I could think of a few other possible approaches as well.
> The following options seem possible to make row identification/deletion easier:
> a) Use existing remote_commit_ts
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
> '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
> now() - interval '100 minutes';
> b) Use existing system column ctid
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
> c) Add a dedicated identifier conflict_id column as Shveta said
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
> d) Add a local conflict_logged_at timestamp
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> = '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> < now() - interval '100 minutes';
>

I like (c) and (d). IMO, approach (a) is cumbersome to write queries with.
Approach (b) may not be known to all users.

I had earlier suggested a timestamp column (pt 3 at [1]) to record
conflict-occurrence time (mainly a 'conflict_logged_at' column) in the CLT,
but the idea was kept on hold awaiting more feedback. Now we can
revisit this.

I feel 'conflict_logged_at' could be more beneficial because, going
forward (based on feedback), we may range-partition this table on that
field, which could form the basis of historical data purging. I also
suggested this in [2] (see 'That said, irrespective of what we
decide'). Such a field could be the basis of a purging mechanism.

[1]: https://www.postgresql.org/message-id/CAJpy0uCMDqcWGepcTwFPH%2BhTDjD8b72KnbL-S%2Bd-qd7ChomOyQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uAfRZa4axLV_e4gvVdmunb8BOVx%2BYr%3DXecECAVD0KnD%3DA%40mail.gmail.com

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-12T06:52:50Z

On Mon, May 11, 2026 at 2:59 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > > The attached v31 version has the changes to fix this issue by
> > > > > initializing the variable.
> > > > > This also has the rebased version along with the rebased version of
> > > > > the 'Preserve conflict log destination and subscription OID for
> > > > > subscriptions' patch which is present in the 0005 patch.
> > > >
> > > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > > >
> > > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > > superuser later runs:
> > > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > > then the conflict table ends up being owned by the superuser instead
> > > > of the subscription owner. Though, apply_worker would be able to
> > > > insert into the CLT, but the subscription owner cannot access its
> > > > associated conflict log table,
> > > >
> > > > I think this happens because the heap_create_with_catalog() call uses
> > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > > command executor’s ownership instead of the subscription owner.
> > > >
> > > > Since only the subscription owner or a superuser can run ALTER
> > > > SUBSCRIPTION, should we always create the table with the subscription
> > > > owner as the owner?
> > >
> > > Yeah that makes sense.
> > >
> > > > 2) In GetConflictLogDestAndTable():
> > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > > + if (conflictlogrel == NULL)
> > > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > > + conflictlogrelid);
> > > > +
> > > > + return conflictlogrel;
> > > >
> > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > > table_open()->relation_open() will error-out if it fails to open the
> > > > relation.
> > >
> > > Yeah, that's a valid point.
> > >
> > > > 3) Minor typo in create_conflict_log_table() comments:
> > > > + /*
> > > > + * Check for an existing table with the sname name in the pg_conflict
> > > > namespace.
> > > > + * A collision  should not occur under normal operation, but we must
> > > > handle cases
> > > > + * where a table has been created manually.
> > > > + */
> > > > ==> double space in "A collision  should not"
> > > >
> > > > 4) The document patch-0004 is still referring to the old name
> > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> > >
> > > I will fix these in next version.
> > >
> >
> > This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> > to allow ownership transfer.  I haven't yet changed the clt display
> > witjh \dRs+ reported by shveta.  I have a work-in-progress patch, but
> > I couldn't get it to work.  I will try to debug that tomorrow or next
> > week whenever I get time.
> >
> > Open Items:
> > -  Add comments explaining the reasoning for the ownership change
> > - change clt display
> > - Test cases for ownership change, truncation, deletion, and select
> > from a non-superuser owner of subscriber.
> >
> > @vignesh C Your patch needs to be rebased.
> >
> Hi Dilip,
>
> I started reviewing the patches.
> Here are minor comments for 0001 patch:
>
> 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables
> But the same for pg_toast or other catalog tables are prohibited. Also
> for other system tables we are getting following error.
>
> postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
> ERROR:  ALTER action DROP COLUMN cannot be performed on relation
> "pg_toast_16413"
>
> DETAIL:  This operation is not supported for TOAST tables.
> postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
> ERROR:  cannot drop column pubname of table pg_publication because it
> is required by the database system
> postgres=# ALTER TABLE pg_description DROP COLUMN description;
> ERROR:  cannot drop column description of table pg_description because
> it is required by the database system
>
> postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
> ALTER TABLE
>
> Should we prohibit it for conflict log tables as well?
>

Good catch Shlok, yes it should be restricted IMO.

Another thing I found is that we can attach a CLT as a partition of
another table, and thereby add it indirectly to a publication.

Test:
-------------------------
CREATE TABLE public.conflict_parent (LIKE
pg_conflict.pg_conflict_log_16387 INCLUDING ALL) PARTITION BY LIST
(conflict_type);

ALTER TABLE public.conflict_parent ATTACH PARTITION
pg_conflict.pg_conflict_log_16387 FOR VALUES IN ('insert_exists');

CREATE publication pub1 FOR TABLE public.conflict_parent
WITH(PUBLISH_VIA_PARTITION_ROOT =false);

postgres=# select * from pg_publication_tables;
 pubname | schemaname  |       tablename
---------+-------------+-----------------------+------------
 pub1    | pg_conflict | pg_conflict_log_16387

---------------------------

For a TOAST table, by contrast, the 'LIKE' operation fails:

postgres=# CREATE TABLE public.fake_toast_parent ( LIKE
pg_toast.pg_toast_16459 INCLUDING ALL) PARTITION BY LIST (chunk_seq);
ERROR:  relation "pg_toast_16459" is invalid in LIKE clause
LINE 1: CREATE TABLE public.fake_toast_parent ( LIKE pg_toast.pg_toa...

                   ^
DETAIL:  This operation is not supported for TOAST tables.

~~

Trying it differently, attaching it as a partition also fails.

postgres=# CREATE TABLE public.fake_toast_parent (    chunk_id oid,
chunk_seq int4,    chunk_data bytea) PARTITION BY LIST (chunk_seq);
CREATE TABLE
postgres=# ALTER TABLE public.fake_toast_parent ATTACH PARTITION
pg_toast.pg_toast_16459 FOR VALUES IN (1);
ERROR:  ALTER action ATTACH PARTITION cannot be performed on relation
"pg_toast_16459"
DETAIL:  This operation is not supported for TOAST tables.

~~

I tried the above tests with allow_system_table_mods=on.

So a TOAST table does not support 'LIKE'.
It also does not support being attached as a partition of another table.

IMO, we need the same restrictions for the CLT. Thoughts?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-12T09:19:15Z

On Tue, May 12, 2026 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Few comments on 001:
> > 3)
> > Currently the structure of CLT is:
> >
> > +const ConflictLogColumnDef ConflictLogSchema[] = {
> > + { .attname = "relid",            .atttypid = OIDOID },
> > + { .attname = "schemaname",       .atttypid = TEXTOID },
> > + { .attname = "relname",          .atttypid = TEXTOID },
> > + { .attname = "conflict_type",    .atttypid = TEXTOID },
> > + { .attname = "remote_xid",       .atttypid = XIDOID },
> > + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> > + { .attname = "remote_origin",    .atttypid = TEXTOID },
> > + { .attname = "replica_identity", .atttypid = JSONOID },
> > + { .attname = "remote_tuple",     .atttypid = JSONOID },
> > + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> > +};
> >
> > So if user has to delete a conflict from CLT after resolving it, then
> > what is the user-friendly way to do it? IMO, it will be cumbersome
> > (and perhaps error-prone) to write a query with remote_commit_lsn,
> > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others)
> > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS
> > AS IDENTITY). This provides a simple, unique identifier so the user
> > can easily target a single row (WHERE log_id = 105) or purge a batch
> > of old conflicts (WHERE log_id < 1000).
>
> I agree with this. I could think of a few other possible approaches as well.
> The following options seem possible to make row identification/deletion easier:
> a) Use existing remote_commit_ts
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
> '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
> now() - interval '100 minutes';
> b) Use existing system column ctid
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
> c) Add a dedicated identifier conflict_id column as Shveta said
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
> d) Add a local conflict_logged_at timestamp
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> = '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> < now() - interval '100 minutes';
>

We can use approach (c), as that sounds easier for manual conflict
resolution. Though I feel that in practice different fields could be
used for removal; say, when transactions are interleaved, one may
prefer to remove based on remote_xid or remote_lsn.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-12T09:21:15Z

On Tue, May 12, 2026 at 12:00 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I had earlier suggested a timestamp column (pt 3 at [1]) to record
> conflict-occurence time (mainly 'conflict_logged_at' column) in CLT
> but the idea was kept on hold awaiting more feedback. Now we can
> revisit this.
>
> I feel 'conflict_logged_at' could be more beneficial because, going
> forward (based on feedback), we may range-partition this table on that
> field which may form as basis of historical data purge. I also
> suggested this in [2] (see 'That said, irrespective of what we
> decide') . Such a field could be basis of purging mechanism.
>

Fair enough. We can extend the table with this field after more
discussion, so it will be better to pick up this discussion once the
base feature is committed.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-12T09:26:25Z

On Tue, May 12, 2026 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 12, 2026 at 12:00 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > I had earlier suggested a timestamp column (pt 3 at [1]) to record
> > conflict-occurence time (mainly 'conflict_logged_at' column) in CLT
> > but the idea was kept on hold awaiting more feedback. Now we can
> > revisit this.
> >
> > I feel 'conflict_logged_at' could be more beneficial because, going
> > forward (based on feedback), we may range-partition this table on that
> > field which may form as basis of historical data purge. I also
> > suggested this in [2] (see 'That said, irrespective of what we
> > decide') . Such a field could be basis of purging mechanism.
> >
>
> Fair enough. We can extend the table with this field after more
> discussion, so it will be better to pick up this discussion once the
> base feature is committed.
>

Okay. Works for me.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-13T06:07:19Z

On Fri, May 1, 2026 at 11:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > > 2.
> > > > > > +typedef enum ConflictLogDest
> > > > > > +{
> > > > > > + /* Log conflicts to the server logs */
> > > > > > + CONFLICT_LOG_DEST_LOG   = 1 << 0,   /* 0x01 */
> > > > > > +
> > > > > > + /* Log conflicts to an internally managed conflict log table */
> > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
> > > > > > +
> > > > > > + /* Convenience bitmask for all supported destinations */
> > > > > > + CONFLICT_LOG_DEST_ALL   = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
> > > > > > +} ConflictLogDest;
> > > > > > +
> > > > > > +/*
> > > > > > + * Array mapping for converting internal enum to string.
> > > > > > + */
> > > > > > +static const char *const ConflictLogDestNames[] = {
> > > > > > + [CONFLICT_LOG_DEST_LOG] = "log",
> > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table",
> > > > > > + [CONFLICT_LOG_DEST_ALL] = "all"
> > > > > > +};
> > > > > >
> > > > > > Defining an array this way could be an Array size issue. Actually the
> > > > > > array has just three elements so the last element should be at
> > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will
> > > > > > be ConflictLogDestNames[3]. Can we define by referring the following
> > > > > > existing way:
> > > >
> > > > I was analyzing this because I remember we were initially using the
> > > > format you suggested and switched to the bit format to enable direct
> > > > bitwise operations elsewhere.  I think Peter suggested that [1], and
> > > > the argument was that the bitwise operation is easy if we represent
> > > > them as a bit. Also, since we would not have too many options, the
> > > > array size shouldn't be an issue.  But I understand your point: adding
> > > > more elements will cause the array size to grow very fast as this is
> > > > using sparse array.  Let's see what others think about this, and then
> > > > we can decide whether to change it back?
> > > >
> > >
> > > The benefit of the current approach is that checking whether the
> > > destination is TABLE becomes straightforward:
> > >
> > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)
> > >
> > > if we go by regular enum values (similar to XLogSource), then it will be:
> > >
> > >  if (opts.logdest == CONFLICT_LOG_DEST_TABLE ||
> > >      opts.logdest == CONFLICT_LOG_DEST_ALL)
> >
> > Right
> >
> > > For ease of extending the enum and its corresponding text mappings, my
> > > personal preference is still the regular (non-bitwise) enum approach.
> >
> > Yeah, that's my personal preference too.  But Peter had a strong stand
> > on keeping it as bitwise so that we can directly use
> > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations.
> > Since this array shouldn't have many options, a sparse array is not an
> > issue.  So let's see what @Peter Smith has to say here and then we can
> > build a consensus on this.
> >
> > > But if we anticipate adding more destination options in the future
> > > that would be covered by ALL, checking for those in code could lead to
> > > growing chains of OR conditions, whereas the bitwise approach scales
> > > more cleanly in that respect. So I think the choice depends on what
> > > kinds of future extensions we expect.
> > >
> > > Do we have plans to add more options that would naturally fall under
> > > ALL? Or do we instead expect additions that are mutually exclusive;
> > > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > > not make sense to group under ALL in the same way?
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.
> 1. I'm changing the ConflictLogDest enum from bitmap to integer. I can
> revert this in the next version but I want to see Peter's opinion
> first, as he suggested using a bitmap to easily apply bitwise
> operators.
>

Sorry for the delay in responding. I have been away.

Yes, I recall thinking bitmaps were a tidy way of checking if a CLT
was required, just by:
IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)

IMO, "all" is not really a discrete target value... it meant more like
"a combination of all the other ones". That is why bitmaps felt like a
better fit to me.

Of course, then you will have the (not very) sparse
designated-initializer array of names that some people objected to:
+static const char *const ConflictLogDestNames[] = {
+ [CONFLICT_LOG_DEST_LOG] = "log",
+ [CONFLICT_LOG_DEST_TABLE] = "table",
+ [CONFLICT_LOG_DEST_ALL] = "all"
+};

TBH, I did not think the sparse array posed any real problem because
even if there were 5 target values (which is way more than I could
imagine it growing to) that would still only be a sparse array of 2^5
elements which seemed hardly worth worrying about.

Anyway, it is fine by me if you want to revert to a plain enum. The
code of CreateSubscription/AlterSubscription becomes a bit clunkier
now having to check CONFLICT_LOG_DEST_ALL, but it's OK.
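
For illustration only, here is a minimal standalone sketch (not code from
the patch) of the bitmask layout discussed above. The enum and the name
array are copied from the quoted hunk; IsSet() and lengthof() are written
out so the sketch compiles outside the PostgreSQL tree, and
dest_names_len(), sparse_len_for() and needs_conflict_log_table() are
hypothetical helpers that exist only in this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Written out here so the sketch compiles outside the PostgreSQL tree. */
#define IsSet(val, bits)  (((val) & (bits)) == (bits))
#define lengthof(array)   (sizeof(array) / sizeof((array)[0]))

typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 1 << 0,
    CONFLICT_LOG_DEST_TABLE = 1 << 1,
    CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Indexed by flag value, so slot 0 is an unused NULL entry. */
static const char *const ConflictLogDestNames[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

/* Hypothetical helper: number of slots the name array occupies. */
size_t
dest_names_len(void)
{
    return lengthof(ConflictLogDestNames);
}

/* Hypothetical helper: slots needed when "all" ORs together nflags bits. */
size_t
sparse_len_for(unsigned nflags)
{
    return (size_t) 1 << nflags;
}

/* A single bit test covers both "table" and "all". */
bool
needs_conflict_log_table(ConflictLogDest dest)
{
    return IsSet(dest, CONFLICT_LOG_DEST_TABLE);
}
```

With two flags the name array occupies four slots (index 0 is unused);
with five flags it would occupy 32, which is the 2^5 figure mentioned
above.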

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-13T06:12:54Z

Hi Dilip/Vignesh.

Some review comments for v33-0001.

======
src/backend/catalog/aclchk.c

pg_class_aclmask_ext:

1.
  if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
ACL_USAGE)) &&
- IsSystemClass(table_oid, classForm) &&
- classForm->relkind != RELKIND_VIEW &&
+ IsConflictClass(classForm) &&
  !superuser_arg(roleid))
- mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
+ else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
ACL_TRUNCATE | ACL_USAGE)) &&
+ IsSystemClass(table_oid, classForm) &&
+ classForm->relkind != RELKIND_VIEW &&
+ !superuser_arg(roleid))
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);

The new patched code seems a bit repetitive.

How about refactoring like below and putting the comments where they belong.

if (!superuser_arg(roleid))
{
  if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
  {
    if (IsSystemClass(table_oid, classForm) &&
      classForm->relkind != RELKIND_VIEW)
    {
      /*
       * Deny anyone permission to update a system catalog unless
       * pg_authid.rolsuper is set.
       *
       * As of 7.4 we have some updatable system views; those shouldn't be
       * protected in this way.  Assume the view rules can take care of
       * themselves.  ACL_USAGE is if we ever have system sequences.
       */
      mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
                ACL_USAGE);
    }
    else if (IsConflictClass(classForm))
    {
      /*
       * For conflict log tables, we allow non-superusers to perform DELETE
       * and TRUNCATE for maintenance, while still restricting INSERT,
       * UPDATE, and USAGE.
       */
      mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
    }
  }
}
else
{
  /* Superusers bypass all permission-checking. */

  ReleaseSysCache(tuple);
  return mask;
}
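
For illustration only, the masking logic of the suggested refactoring can
be modelled as a small standalone function. This is a sketch under stated
assumptions, not the real pg_class_aclmask_ext(): the catalog lookups are
replaced by boolean parameters, restrict_mask() and ACL_WRITE_BITS are
hypothetical names, and the ACL_* bit positions are included only so the
example compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t AclMode;

/* Bit positions assumed to match parsenodes.h; treat them as illustrative. */
#define ACL_INSERT    (1 << 0)
#define ACL_SELECT    (1 << 1)
#define ACL_UPDATE    (1 << 2)
#define ACL_DELETE    (1 << 3)
#define ACL_TRUNCATE  (1 << 4)
#define ACL_USAGE     (1 << 8)

/* Hypothetical shorthand for the write-type bits tested in the hunk. */
#define ACL_WRITE_BITS \
    (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)

/*
 * Hypothetical model of the suggested structure: superusers keep the full
 * mask, system catalogs lose every write bit, and conflict log tables lose
 * only INSERT, UPDATE and USAGE so DELETE and TRUNCATE stay possible.
 */
AclMode
restrict_mask(AclMode mask, bool is_superuser, bool is_system_class,
              bool is_view, bool is_conflict_class)
{
    if (is_superuser)
        return mask;

    if (mask & ACL_WRITE_BITS)
    {
        if (is_system_class && !is_view)
            mask &= ~(AclMode) ACL_WRITE_BITS;
        else if (is_conflict_class)
            mask &= ~(AclMode) (ACL_INSERT | ACL_UPDATE | ACL_USAGE);
    }
    return mask;
}
```

The sketch assumes the two predicates are mutually exclusive. If
IsSystemClass() also returned true for conflict log tables, the order of
the two branches would matter, because the patch tests IsConflictClass()
first while the refactoring tests IsSystemClass() first.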

======
src/backend/catalog/catalog.c

IsConflictClass:

2.
+/*
+ * IsConflictClass - Check if the given pg_class tuple belongs to the conflict
+ * namespace.
+ */

This function comment looks different from all the nearby ones where
the function name appears on a line by itself.

======
src/backend/catalog/heap.c

heap_create:

3.
  if (!allow_system_table_mods &&
  ((IsCatalogNamespace(relnamespace) && relkind != RELKIND_INDEX) ||
- IsToastNamespace(relnamespace)) &&
+ IsToastNamespace(relnamespace) ||
+ IsConflictNamespace(relnamespace)) &&

Is this code correct? It seems like it is conveniently re-using a
similar error, which is not quite appropriate.

The comment refers to creating relations in pg_catalog.
The errdetail refers to "System catalog modifications"

But, the CLT is neither in pg_catalog schema, nor is it a system catalog.

======
src/backend/catalog/namespace.c

CheckSetNamespace:

4.
- * We complain if either the old or new namespaces is a temporary schema
- * (or temporary toast schema), or if either the old or new namespaces is the
- * TOAST schema.
+ * We complain if either the old or new namespaces is a temporary schema,
+ * temporary toast schema, the TOAST schema, or the CONFLICT schema.

TOAST is uppercase because it is an acronym, but I see no reason why
"CONFLICT" is uppercase. Maybe replace that with pg_conflict.

~~~

5.
+
+ /* similarly for CONFLICT schema */
+ if (nspOid == PG_CONFLICT_NAMESPACE || oldNspOid == PG_CONFLICT_NAMESPACE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot move objects into or out of CONFLICT schema")));

Ditto for the uppercase "CONFLICT" in the comment and in the errmsg.
Say pg_conflict.

======
src/backend/catalog/pg_publication.c

6.
+
+ /* Can't be conflict log table */
+ if (IsConflictNamespace(RelationGetNamespace(targetrel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg(errormsg, relname),
+ errdetail("This operation is not supported for conflict log tables.")));

I felt this code is quite similar to the "Can't be system table"
check, so it might be better to move it to be adjacent to that.

======
src/backend/commands/subscriptioncmds.c

CreateSubscription:

7.
+ /* Always set the destination, default will be 'log'. */
+ values[Anum_pg_subscription_subconflictlogdest - 1] =
+ CStringGetTextDatum(ConflictLogDestNames[opts.conflictlogdest]);

None of the other values[] assignments here have comments talking
about defaults etc, so why is this one different?

~~~

8.
Despite some of these just being static, I am beginning to think that
the "conflict" specific CLT code might be more appropriate to be put
in conflict.c, along with the CLT schema etc.

e.g. functions like:
- create_conflict_log_table_tupdesc
- create_conflict_log_table
- GetLogDestination

~~~

create_conflict_log_table:

9.
+ snprintf(relname, NAMEDATALEN, "pg_conflict_log_%u", subid);

Would it be more helpful if the generated table name describes what
that %u means?

e.g. "pg_conflict_log_for_subid_%u"
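
For illustration only, the two name formats can be compared with a small
standalone sketch. NAMEDATALEN is set to PostgreSQL's default of 64, and
the three function names are hypothetical; only the format strings come
from the text above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 64          /* PostgreSQL's default identifier buffer size */

typedef unsigned int Oid;

/* Name format used by the patch hunk quoted above. */
int
conflict_log_relname(char *relname, Oid subid)
{
    return snprintf(relname, NAMEDATALEN, "pg_conflict_log_%u", subid);
}

/* Alternative format suggested in this review comment. */
int
conflict_log_relname_verbose(char *relname, Oid subid)
{
    return snprintf(relname, NAMEDATALEN, "pg_conflict_log_for_subid_%u", subid);
}

/* Hypothetical check that a generated name was not truncated. */
bool
relname_fits(const char *relname, int wanted_len)
{
    return wanted_len >= 0 && wanted_len < NAMEDATALEN &&
        strlen(relname) == (size_t) wanted_len;
}
```

Even for the largest 32-bit subid the longer format needs 26 + 10 = 36
characters, so neither variant comes close to the identifier length limit.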

~~~

10.
+ /*
+ * Check for an existing table with the sname name in the pg_conflict
namespace.
+ * A collision should not occur under normal operation, but we must
handle cases
+ * where a table has been created manually.
+ */
+ if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_TABLE),
+ errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
+ errhint("A table with the same name already exists. "
+ "To proceed, drop the existing table and retry.")));
+

10a.
Typo /sname name/same name/

~

10b.
That 1st sentence of the errhint seems unnecessary because it is
saying the same as the errmsg.

======
src/backend/executor/execMain.c

11.
+
+ /*
+ * Conflict log tables are managed by the system to record logical
+ * replication conflicts.  We allow DELETE and TRUNCATE to permit users to
+ * manually prune these logs, but manual data insertion or modification
+ * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the
+ * system-generated logs.
+ *
+ * Since TRUNCATE is handled as a separate utility command, we only need
+ * to explicitly permit CMD_DELETE here.
+ */
+ if (IsConflictNamespace(RelationGetNamespace(resultRel)) &&
+ operation != CMD_DELETE)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("cannot modify or insert data into conflict log table \"%s\"",
+ RelationGetRelationName(resultRel)),
+ errdetail("Conflict log tables are system-managed and only support
cleanup via DELETE or TRUNCATE.")));

It somehow feels backwards to check "operation != CMD_DELETE", with
the obscure comment that TRUNCATE is handled elsewhere.

How about just check if "(operation == CMD_INSERT || operation ==
CMD_UPDATE || operation == CMD_MERGE)".
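
For illustration only, the two ways of writing the check can be put side
by side in a standalone sketch. The CmdType enum here is a cut-down
stand-in for the real one, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Cut-down stand-in for PostgreSQL's CmdType; only the members used here. */
typedef enum CmdType
{
    CMD_SELECT,
    CMD_UPDATE,
    CMD_INSERT,
    CMD_DELETE,
    CMD_MERGE
} CmdType;

/* Check as written in the patch: everything except DELETE is rejected. */
bool
rejected_by_exclusion(CmdType operation)
{
    return operation != CMD_DELETE;
}

/* Check as suggested in the review: name the rejected commands. */
bool
rejected_by_listing(CmdType operation)
{
    return operation == CMD_INSERT ||
        operation == CMD_UPDATE ||
        operation == CMD_MERGE;
}
```

Both forms agree for INSERT, UPDATE, DELETE and MERGE. They differ only
for other command types, which this sketch assumes never reach the check;
the explicit listing simply makes that assumption unnecessary.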

~~~

12.
+
+ /*
+ * Conflict log tables are managed by the system to record logical
+ * replication conflicts.  We do not allow locking rows in CONFLICT
+ * relations.
+ */
+ if (IsConflictNamespace(RelationGetNamespace(rel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot lock rows in conflict log table \"%s\"",
+ RelationGetRelationName(rel))));

I was not sure what was meant by "CONFLICT relations.".

Does it mean "... relations in the pg_conflict schema"? Anyway, is
there any value in that 2nd sentence, given that it is much the same
text as the errmsg?

======
src/backend/replication/logical/conflict.c

13.
+const char *const ConflictLogDestNames[] = {
+ [CONFLICT_LOG_DEST_LOG] = "log",
+ [CONFLICT_LOG_DEST_TABLE] = "table",
+ [CONFLICT_LOG_DEST_ALL] = "all"
+};
+
+const ConflictLogColumnDef ConflictLogSchema[] = {
+ { .attname = "relid",            .atttypid = OIDOID },
+ { .attname = "schemaname",       .atttypid = TEXTOID },
+ { .attname = "relname",          .atttypid = TEXTOID },
+ { .attname = "conflict_type",    .atttypid = TEXTOID },
+ { .attname = "remote_xid",       .atttypid = XIDOID },
+ { .attname = "remote_commit_lsn",.atttypid = LSNOID },
+ { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "remote_origin",    .atttypid = TEXTOID },
+ { .attname = "replica_identity", .atttypid = JSONOID },
+ { .attname = "remote_tuple",     .atttypid = JSONOID },
+ { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
+};

13a.
Both these arrays could benefit from some comments.

~

13b.
In the ConflictLogSchema, would it be better to keep all those
"remote_" columns grouped together, instead of being broken by
"replica_identity".

~

13c.
TBH, I preferred code how it used to be -- where all the CLT constants
and structs and enums and schemas were kept together. Now they are
split across conflict.h and conflict.c making it harder to read as
well as introducing need for static asserts that were not needed
before.

(Keeping everything together might become easier if the CLT stuff is
all colocated in the conflict.c per comment #8)

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-13T07:11:18Z

Hi Dilip/Vignesh,

I was looking at patch v33-0002.

Shouldn't there be some accompanying tests in this patch to verify
that altering ownership works as expected when the subscription has a
CLT?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-14T01:54:19Z

Hi Dilip/Vignesh.

Some review comments for v33-0004 (docs).

======
doc/src/sgml/logical-replication.sgml

(29.2. Subscription)

1.
Perhaps the "conflict log table" should be using <firstterm> SGML
markup the first time it gets mentioned?

~~~

(29.8. Conflicts)

2.
   <para>
-   The log format for logical replication conflicts is as follows:
+   The <link linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link>
+   parameter automatically creates a dedicated conflict log table.
This table is created in the dedicated
+   <literal>pg_conflict</literal> namespace. The name of the conflict log table
+   is <literal>pg_conflict_log_&lt;subid&gt;</literal>. The
predefined schema of this table is
+   detailed in
+   <xref linkend="logical-replication-conflict-log-schema"/>.
+  </para>

2a.
It's not really correct to say that it "automatically creates a
dedicated conflict log table.", because that sounds like it will
always happen.

SUGGESTION
The conflict_log_destination parameter can be set to automatically
create a dedicated conflict log table.

~

2b.
Also it seems overkill to say the word "dedicated" multiple times.
Maybe remove the 2nd one.

~~~

3.
+  <para>
+   The conflicting row data, including the incoming remote row
(<literal>remote_tuple</literal>)
+   and the associated local conflict details
(<literal>local_conflicts</literal>), is stored in
+   <type>JSON</type> formats, for flexible querying and analysis.
+  </para>
+

Comma typo: /formats, for/formats for/

~~~

(29.9. Restrictions)

4.
+
+   <listitem>
+    <para>
+     Conflict log tables (see <link
linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link>
parameter)
+     are never published, even when using FOR ALL TABLES in a publication.
+    </para>
+   </listitem>

The "FOR ALL TABLES" should have SGML <literal> markup.

======
doc/src/sgml/ref/create_subscription.sgml

(conflict_log_destination (enum))

5.
+             <para>
+              If post-mortem analysis may be needed, back up the
conflict log table before
+              removing the subscription.
+             </para>

5a.
My AI tool says that the "post-mortem analysis" wording is a bit
overkill for online documentation:

SUGGESTION
If conflict history may be needed later, back up...

~

5b.
That note only talks about "removing the subscription", but AFAIK the
user will also need to do backup if changing from "table/all" to
"log". Should that also be mentioned? It might make this caution a bit
repetitive -- Maybe it is simply easier to reword this sentence like:

SUGGESTION
If conflict history may be needed later, be sure to back up the
conflict log table before it gets removed.

======
GENERAL -- add new subsections

6.
Apart from those minor review comments above, I felt that the current
single "29.8. Conflicts" section should be broken into subsections for
readability and for easier referencing.

I propose that it should look like this:

29.8. Conflicts
29.8.1. Conflict logging
29.8.2. Table-based logging
29.8.3. File-based logging
29.8.4. Notes

PSA a POC patch where I've done this restructuring. It looks much
better to me. See what you think.

Most of the original patch wording is unchanged.
Some xrefs are added on the CREATE SUBSCRIPTION page.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-14T07:15:08Z

On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, May 8, 2026 at 5:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > > The attached v31 version has the changes to fix this issue by
> > > > > initializing the variable.
> > > > > This also has the rebased version along with the rebased version of
> > > > > the 'Preserve conflict log destination and subscription OID for
> > > > > subscriptions' patch which is present in the 0005 patch.
> > > >
> > > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > > >
> > > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > > superuser later runs:
> > > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > > then the conflict table ends up being owned by the superuser instead
> > > > of the subscription owner. Though, apply_worker would be able to
> > > > insert into the CLT, but the subscription owner cannot access its
> > > > associated conflict log table,
> > > >
> > > > I think this happens because the heap_create_with_catalog() call uses
> > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > > command executor’s ownership instead of the subscription owner.
> > > >
> > > > Since only the subscription owner or a superuser can run ALTER
> > > > SUBSCRIPTION, should we always create the table with the subscription
> > > > owner as the owner?
> > >
> > > Yeah that makes sense.
> > >
> > > > 2) In GetConflictLogDestAndTable():
> > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > > + if (conflictlogrel == NULL)
> > > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > > + conflictlogrelid);
> > > > +
> > > > + return conflictlogrel;
> > > >
> > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > > table_open()->relation_open() will error-out if it fails to open the
> > > > relation.
> > >
> > > Yeah, that's a valid point.
> > >
> > > > 3) Minor typo in create_conflict_log_table() comments:
> > > > + /*
> > > > + * Check for an existing table with the sname name in the pg_conflict
> > > > namespace.
> > > > + * A collision  should not occur under normal operation, but we must
> > > > handle cases
> > > > + * where a table has been created manually.
> > > > + */
> > > > ==> double space in "A collision  should not"
> > > >
> > > > 4) The document patch-0004 is still referring to the old name
> > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> > >
> > > I will fix these in next version.
> > >
> >
> > This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> > to allow ownership transfer.  I haven't yet changed the clt display
> > > with \dRs+ reported by shveta.  I have a work-in-progress patch, but
> > I couldn't get it to work.  I will try to debug that tomorrow or next
> > week whenever I get time.
> >
> > Open Items:
> > -  Add comments explaining the reasoning for the ownership change
> > - change clt display
> > - Test cases for ownership change, truncation, deletion, and select
> > from a non-superuser owner of subscriber.
> >
> > @vignesh C Your patch needs to be rebased.
> >
>
> Few comments on 001:
>
> 3)
> Currently the structure of CLT is:
>
> +const ConflictLogColumnDef ConflictLogSchema[] = {
> + { .attname = "relid",            .atttypid = OIDOID },
> + { .attname = "schemaname",       .atttypid = TEXTOID },
> + { .attname = "relname",          .atttypid = TEXTOID },
> + { .attname = "conflict_type",    .atttypid = TEXTOID },
> + { .attname = "remote_xid",       .atttypid = XIDOID },
> + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> + { .attname = "remote_origin",    .atttypid = TEXTOID },
> + { .attname = "replica_identity", .atttypid = JSONOID },
> + { .attname = "remote_tuple",     .atttypid = JSONOID },
> + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> +};
>
> So if user has to delete a conflict from CLT after resolving it, then
> what is the user-friendly way to do it? IMO, it will be cumbersome
> (and perhaps error-prone) to write a query with remote_commit_lsn,
> remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others)
> think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS
> AS IDENTITY). This provides a simple, unique identifier so the user
> can easily target a single row (WHERE log_id = 105) or purge a batch
> of old conflicts (WHERE log_id < 1000).

I have fixed the other comments except this one. I will think more
about this and reply separately. The attached patch has the changes
for the rest of the comments. The patch also addresses comments from
[1].

[1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0%3DRtaJCGg%3DVYcwLGGCpqax%3DzKJgNbB0Hw%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-14T07:20:49Z

Hi Dilip/Vignesh.

Some review comments for patch v33-0003.

======
Commit Message

1.
SELECT remote_xid, relname, remote_origin, local_conflicts[1] ->>
'xid' AS local_xid,
       local_conflicts[1] ->> 'tuple' AS local_tuple
FROM myschema.conflict_log_history2;

~

Shouldn't this example be querying the pg_conflict schema (not
myschema), for a CLT name like pg_conflict_log_1234?

======
src/backend/replication/logical/conflict.c

2.
+/* Schema for the elements within the 'local_conflicts' JSON array */
+static const ConflictLogColumnDef LocalConflictSchema[] =
+{
+ { .attname = "xid",       .atttypid = XIDOID },
+ { .attname = "commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "origin",    .atttypid = TEXTOID },
+ { .attname = "key",       .atttypid = JSONOID },
+ { .attname = "tuple",     .atttypid = JSONOID }
+};

I think this all belongs directly beneath the ConflictLogSchema[]
where 'local_conflicts' was defined.

~~~

3.
+#define MAX_LOCAL_CONFLICT_INFO_ATTRS lengthof(LocalConflictSchema)

"MAX_" doesn't really seem appropriate as a prefix because this is not
some upper limit; it is just a number.

A better name is "NUM_LOCAL_CONFLICT_ATTRS".

~~

Ditto for the other "MAX_CONFLICT_ATTR_NUM" of patch 0001.

How about "NUM_CONFLICT_ATTRS".
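
For illustration only, the renamed macro can be shown in a standalone
sketch. The array contents are copied from the hunk quoted in comment #2;
the layout of ConflictLogColumnDef is an assumption inferred from the
designated initializers, the type OIDs are included only so the example
compiles on its own, and num_local_conflict_attrs() is a hypothetical
helper.

```c
#include <stddef.h>

#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

typedef unsigned int Oid;

/* Type OIDs assumed to match pg_type; only the ones this sketch needs. */
#define TEXTOID         25
#define XIDOID          28
#define JSONOID         114
#define TIMESTAMPTZOID  1184

/* Assumed layout: only the two fields visible in the quoted initializers. */
typedef struct ConflictLogColumnDef
{
    const char *attname;
    Oid         atttypid;
} ConflictLogColumnDef;

/* Schema for the elements within the 'local_conflicts' JSON array */
static const ConflictLogColumnDef LocalConflictSchema[] =
{
    { .attname = "xid",       .atttypid = XIDOID },
    { .attname = "commit_ts", .atttypid = TIMESTAMPTZOID },
    { .attname = "origin",    .atttypid = TEXTOID },
    { .attname = "key",       .atttypid = JSONOID },
    { .attname = "tuple",     .atttypid = JSONOID }
};

/* A count derived from the array, not an upper limit. */
#define NUM_LOCAL_CONFLICT_ATTRS lengthof(LocalConflictSchema)

/* Hypothetical helper exposing the count for callers. */
size_t
num_local_conflict_attrs(void)
{
    return NUM_LOCAL_CONFLICT_ATTRS;
}
```

Because the value is computed from the array itself, it always equals the
number of entries, which is why a "NUM_" prefix describes it better than
"MAX_".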

~~~

RequestApplyConflict:

4.
+ if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
+ log_dest_clt = true;
+ if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
+ log_dest_logfile = true;

This code could be improved by introducing some macros to hide all the
checking. There was also similar code in patch 0001 where such macros
would have been helpful.

SUGGESTION
log_dest_clt = CONFLICTS_LOGGED_TO_TABLE(dest);
log_dest_logfile = CONFLICTS_LOGGED_TO_FILE(dest);
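
For illustration only, one possible shape for such macros with a plain
(non-bitwise) enum is sketched below. Only the macro names come from the
suggestion above; their definitions, the order of the enum members, and
the two wrapper functions are assumptions made for this example.

```c
#include <stdbool.h>

/* Plain enum; the member order is an assumption. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG,
    CONFLICT_LOG_DEST_TABLE,
    CONFLICT_LOG_DEST_ALL
} ConflictLogDest;

/* Assumed definitions: "all" implies every individual destination. */
#define CONFLICTS_LOGGED_TO_TABLE(dest) \
    ((dest) == CONFLICT_LOG_DEST_TABLE || (dest) == CONFLICT_LOG_DEST_ALL)
#define CONFLICTS_LOGGED_TO_FILE(dest) \
    ((dest) == CONFLICT_LOG_DEST_LOG || (dest) == CONFLICT_LOG_DEST_ALL)

/* Hypothetical wrappers so the macros can be exercised as functions. */
bool
logs_to_table(ConflictLogDest dest)
{
    return CONFLICTS_LOGGED_TO_TABLE(dest);
}

bool
logs_to_file(ConflictLogDest dest)
{
    return CONFLICTS_LOGGED_TO_FILE(dest);
}
```

If another destination were added later, only the macro bodies would need
to change and the call sites would stay as they are.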

~~~

5.
+ ereport(elevel,
+ errcode_apply_conflict(type),
+ errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
+ get_namespace_name(RelationGetNamespace(localrel)),
+ RelationGetRelationName(localrel),
+ ConflictTypeNames[type]),
+ errdetail("Conflict details are logged to the conflict log table: %s",
+   RelationGetRelationName(conflictlogrel)));

I think there is some recently committed function for getting
fully-qualified relation names that this error can make use of.

~~~

6.
+ /* Standard reporting with full internal details. */
+ ereport(elevel,
+ errcode_apply_conflict(type),
+ errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
+    get_namespace_name(RelationGetNamespace(localrel)),
+    RelationGetRelationName(localrel),
+    ConflictTypeNames[type]),
+ errdetail_internal("%s", err_detail.data));

Ditto. I think there is some recently committed function for getting
fully-qualified relation names that this error can make use of.

~~~

GetConflictLogDestAndTable:

7.
+ /* Quick exit if a conflict log table was not requested. */
+ if (*log_dest == CONFLICT_LOG_DEST_LOG)
+ return NULL;

It would be more intuitive to use that new macro here that I suggested
in a previous review comment.

SUGGESTION
if (!CONFLICTS_LOGGED_TO_TABLE(*log_dest))
  return NULL;

~~~

InsertConflictLogTuple:

8.
+ int options = HEAP_INSERT_NO_LOGICAL;

This variable seems unnecessary. Easier to just pass
HEAP_INSERT_NO_LOGICAL as a function parameter.

======
src/backend/replication/logical/worker.c

start_apply:

9.
+ /* Open conflict log table and insert the tuple. */
+ conflictlogrel = GetConflictLogDestAndTable(&dest);
+ Assert(dest != CONFLICT_LOG_DEST_LOG);
+ InsertConflictLogTuple(conflictlogrel);
+ table_close(conflictlogrel, RowExclusiveLock);

Another place where using the suggested new macro would be more intuitive.

SUGGESTION
Assert(CONFLICTS_LOGGED_TO_TABLE(dest));

======
src/test/subscription/t/035_conflicts.pl

10.
+# Verify the contents of the Conflict Log Table (CLT)
+# This section ensures that the clt contains the expected
+# type and specific key data.

/clt/CLT/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-14T08:52:11Z

On Mon, May 11, 2026 at 4:14 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > > The attached v31 version has the changes to fix this issue by
> > > > > initializing the variable.
> > > > > This also has the rebased version along with the rebased version of
> > > > > the 'Preserve conflict log destination and subscription OID for
> > > > > subscriptions' patch which is present in the 0005 patch.
> > > >
> > > > Thanks for the patches, please find a few comments on the patches 002 to 004:
> > > >
> > > > 1) I noticed that if a non-superuser creates the subscription, but a
> > > > superuser later runs:
> > > >   ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > > > then the conflict table ends up being owned by the superuser instead
> > > > of the subscription owner. Though, apply_worker would be able to
> > > > insert into the CLT, but the subscription owner cannot access its
> > > > associated conflict log table,
> > > >
> > > > I think this happens because the heap_create_with_catalog() call uses
> > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > > > SUBSCRIPTION, it causes the table to be created under the ALTER
> > > > command executor’s ownership instead of the subscription owner.
> > > >
> > > > Since only the subscription owner or a superuser can run ALTER
> > > > SUBSCRIPTION, should we always create the table with the subscription
> > > > owner as the owner?
> > >
> > > Yeah that makes sense.
> > >
> > > > 2) In GetConflictLogDestAndTable():
> > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > > > + if (conflictlogrel == NULL)
> > > > + elog(ERROR, "could not open conflict log table (OID %u)",
> > > > + conflictlogrelid);
> > > > +
> > > > + return conflictlogrel;
> > > >
> > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > > > table_open()->relation_open() will error-out if it fails to open the
> > > > relation.
> > >
> > > Yeah, that's a valid point.
> > >
> > > > 3) Minor typo in create_conflict_log_table() comments:
> > > > + /*
> > > > + * Check for an existing table with the sname name in the pg_conflict
> > > > namespace.
> > > > + * A collision  should not occur under normal operation, but we must
> > > > handle cases
> > > > + * where a table has been created manually.
> > > > + */
> > > > ==> double space in "A collision  should not"
> > > >
> > > > 4) The document patch-0004 is still referring to the old name
> > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> > >
> > > I will fix these in next version.
> > >
> >
> > This fixes all 4 comments Nisha reported.  And 0002 is an add-on patch
> > to allow ownership transfer.  I haven't yet changed the clt display
> > > with \dRs+ reported by shveta.  I have a work-in-progress patch, but
> > I couldn't get it to work.  I will try to debug that tomorrow or next
> > week whenever I get time.
> >
> > Open Items:
> > -  Add comments explaining the reasoning for the ownership change
> > - change clt display
> > - Test cases for ownership change, truncation, deletion, and select
> > from a non-superuser owner of subscriber.
>
> The attached patch addresses the remaining open items and is provided
> separately as patch 0005.  @Dilip Kumar, if the changes look good to
> you, please merge them into the corresponding patch.
>

Thanks Vignesh, Please find a few comments on 0005:


1)
listSubscriptions has:

+ pg_log_error("The server (version %s) does not support publications.",

publications --> subscriptions

2)
printfPQExpBuffer(&buf, "/* %s */\n", _("Get matching subscriptions"));
appendPQExpBuffer(&buf,
  "SELECT subname AS \"%s\"\n"
  ",  pg_catalog.pg_get_userbyid(subowner) AS \"%s\"\n"
  ",  subenabled AS \"%s\"\n"
  ",  subpublications AS \"%s\"\n",
  gettext_noop("Name"),
  gettext_noop("Owner"),
  gettext_noop("Enabled"),
  gettext_noop("Publication"));

/* Only display subscriptions in current database. */
appendPQExpBufferStr(&buf,
"FROM pg_catalog.pg_subscription\n"
"WHERE subdbid = (SELECT oid\n"
"                 FROM pg_catalog.pg_database\n"
"                 WHERE datname = pg_catalog.current_database())");


Why have we split the query? Can we have it in one go itself?

3)
+ appendPQExpBuffer(&buf,
+   "SELECT oid, subname AS \"%s\"\n"
+   ",  pg_catalog.pg_get_userbyid(subowner) AS \"%s\"\n"
+   ",  subenabled AS \"%s\"\n"
+   ",  subpublications AS \"%s\"\n",
+   gettext_noop("Name"),
+   gettext_noop("Owner"),
+   gettext_noop("Enabled"),
+   gettext_noop("Publication"));
+ ncols = 3;

The query has 5 columns and we have set ncols as 3. A comment will help here.

4)
+ snprintf(conflictlogtable,
+ sizeof(conflictlogtable),
+ "pg_conflict.pg_conflict_log_%s",
+ subid);

Should we avoid hard-coding the namespace name like above?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-14T08:53:14Z

On Mon, 11 May 2026 at 14:59, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> I started reviewing the patches.
> Here are minor comments for 0001 patch:
>
> 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables
> But the same for pg_toast or other catalog tables are prohibited. Also
> for other system tables we are getting following error.
>
> postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
> ERROR:  ALTER action DROP COLUMN cannot be performed on relation
> "pg_toast_16413"
>
> DETAIL:  This operation is not supported for TOAST tables.
> postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
> ERROR:  cannot drop column pubname of table pg_publication because it
> is required by the database system
> postgres=# ALTER TABLE pg_description DROP COLUMN description;
> ERROR:  cannot drop column description of table pg_description because
> it is required by the database system
>
> postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
> ALTER TABLE
>
> Should we prohibit it for conflict log tables as well?

The reason it fails for regular system catalogs is that
IsPinnedObject() returns true for them. Objects with OIDs less than
FirstUnpinnedObjectId(12000) are considered pinned, which includes the
core catalogs created during initdb. In such cases, PostgreSQL
immediately throws the following error:
/*
 * If the target object is pinned, we can just error out immediately; it
 * won't have any objects recorded as depending on it.
 */
if (IsPinnedObject(object->classId, object->objectId))
    ereport(ERROR,
            (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
             errmsg("cannot drop %s because it is required by the database system",
                    getObjectDescription(object, false))));
The call chain is:
ATExecDropColumn -> performMultipleDeletions  -> findDependentObjects
-> IsPinnedObject

However, the conflict log tables are not created during initdb; they
are created later during subscription creation. Therefore, they are
not considered pinned objects, IsPinnedObject() returns false, and the
DROP COLUMN operation is allowed.
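
To make the distinction concrete, here is a minimal, self-contained sketch of the OID-threshold idea. It is not the real IsPinnedObject(), which has additional class-specific rules; the name sketch_is_pinned and the SketchOid typedef are assumptions made for illustration. Only the threshold value 12000 is taken from the explanation above.

```c
#include <stdbool.h>

/* Stand-in for PostgreSQL's Oid type; an assumption for this sketch. */
typedef unsigned int SketchOid;

/* Threshold quoted above: FirstUnpinnedObjectId is 12000. */
#define SKETCH_FIRST_UNPINNED_OBJECT_ID 12000

/*
 * Hypothetical, simplified model of the pinned-object test: objects whose
 * OIDs are below the threshold were created during initdb and are treated
 * as pinned.  The real IsPinnedObject() applies further class-specific
 * rules that are deliberately omitted here.
 */
static bool
sketch_is_pinned(SketchOid objectId)
{
    return objectId < SKETCH_FIRST_UNPINNED_OBJECT_ID;
}
```

Under this model a core catalog such as pg_class (OID 1259) is pinned, while a conflict log table created at subscription time, for example one with OID 16408, is not, so the early error is never reached for it.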

I also noticed that ADD COLUMN is currently allowed on system tables
when allow_system_table_mods is enabled:
postgres=# SET allow_system_table_mods = on;
SET
postgres=# ALTER TABLE pg_description ADD COLUMN fake text;
ALTER TABLE

There are also cases where such operations lead to assertion failures.
For example:
postgres=# SET allow_system_table_mods = on;
SET
postgres=# alter table pg_type add column fake int;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

TRAP: failed Assert("i >= 0 && i < tupdesc->natts"), File: "../../../src/include/access/tupdesc.h", Line: 182, PID: 11443
postgres: vignesh postgres [local] ALTER TABLE(ExceptionalCondition+0xba) [0x616a67fc753c]
postgres: vignesh postgres [local] ALTER TABLE(+0x7057fa) [0x616a67d067fa]
postgres: vignesh postgres [local] ALTER TABLE(build_column_default+0x34) [0x616a67d08961]
postgres: vignesh postgres [local] ALTER TABLE(+0x3e8875) [0x616a679e9875]
postgres: vignesh postgres [local] ALTER TABLE(+0x3e34e8) [0x616a679e44e8]
postgres: vignesh postgres [local] ALTER TABLE(+0x3e2e24) [0x616a679e3e24]

The documentation also explicitly warns about this behavior at [1]:
Allows modification of the structure of system tables as well as
certain other risky actions on system tables. This is otherwise not
allowed even for superusers. Ill-advised use of this setting can cause
irretrievable data loss or seriously corrupt the database system.

Given this, I am not sure whether we should specifically prevent
dropping columns from conflict log tables when allow_system_table_mods
is enabled.

The rest of the comments are addressed in the v34 patch posted at [2].

[1] - https://www.postgresql.org/docs/current/runtime-config-developer.html
[2] - https://www.postgresql.org/message-id/CALDaNm1ZOWAbv5WCsORPBqo7tjHn4f7E%2BB5LZhExfnPMs-zo9A%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-14T09:25:27Z

On Thu, May 14, 2026 at 2:23 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 11 May 2026 at 14:59, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > I started reviewing the patches.
> > Here are minor comments for 0001 patch:
> >
> > 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables
> > But the same for pg_toast or other catalog tables are prohibited. Also
> > for other system tables we are getting following error.
> >
> > postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
> > ERROR:  ALTER action DROP COLUMN cannot be performed on relation
> > "pg_toast_16413"
> >
> > DETAIL:  This operation is not supported for TOAST tables.
> > postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
> > ERROR:  cannot drop column pubname of table pg_publication because it
> > is required by the database system
> > postgres=# ALTER TABLE pg_description DROP COLUMN description;
> > ERROR:  cannot drop column description of table pg_description because
> > it is required by the database system
> >
> > postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
> > ALTER TABLE
> >
> > Should we prohibit it for conflict log tables as well?
>
> The reason it fails for regular system catalogs is that
> IsPinnedObject() returns true for them. Objects with OIDs less than
> FirstUnpinnedObjectId(12000) are considered pinned, which includes the
> core catalogs created during initdb. In such cases, PostgreSQL
> immediately throws the following error:
> /*
>  * If the target object is pinned, we can just error out immediately; it
>  * won't have any objects recorded as depending on it.
>  */
> if (IsPinnedObject(object->classId, object->objectId))
>     ereport(ERROR,
>             (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST),
>              errmsg("cannot drop %s because it is required by the
> database system",
>                     getObjectDescription(object, false))));
> The call chain is:
> ATExecDropColumn -> performMultipleDeletions  -> findDependentObjects
> -> IsPinnedObject
>
> However, the conflict log tables are not created during initdb; they
> are created later during subscription creation. Therefore, they are
> not considered pinned objects, IsPinnedObject() returns false, and the
> DROP COLUMN operation is allowed.
>
> I also noticed that ADD COLUMN is currently allowed on system tables
> when allow_system_table_mods is enabled:
> postgres=# SET allow_system_table_mods = on;
> SET
> postgres=# ALTER TABLE pg_description ADD COLUMN fake text;
> ALTER TABLE
>
> There are also cases where such operations lead to assertion failures.
> For example:
> postgres=# SET allow_system_table_mods = on;
> SET
> postgres=# alter table pg_type add column fake int;
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.
>
> TRAP: failed Assert("i >= 0 && i < tupdesc->natts"), File:
> "../../../src/include/access/tupdesc.h", Line: 182, PID: 11443
> postgres: vignesh postgres [local] ALTER
> TABLE(ExceptionalCondition+0xba) [0x616a67fc753c]
> postgres: vignesh postgres [local] ALTER TABLE(+0x7057fa) [0x616a67d067fa]
> postgres: vignesh postgres [local] ALTER
> TABLE(build_column_default+0x34) [0x616a67d08961]
> postgres: vignesh postgres [local] ALTER TABLE(+0x3e8875) [0x616a679e9875]
> postgres: vignesh postgres [local] ALTER TABLE(+0x3e34e8) [0x616a679e44e8]
> postgres: vignesh postgres [local] ALTER TABLE(+0x3e2e24) [0x616a679e3e24]
>
> The documentation also explicitly warns about this behavior at [1]:
> Allows modification of the structure of system tables as well as
> certain other risky actions on system tables. This is otherwise not
> allowed even for superusers. Ill-advised use of this setting can cause
> irretrievable data loss or seriously corrupt the database system.
>
> Given this, I am not sure whether we should specifically prevent
> dropping columns from conflict log tables when allow_system_table_mods
> is enabled.
>

+1. We can keep the current behavior as-is since it only applies when
allow_system_table_mods is enabled. The documentation already clearly
warns about the associated risks, so this should be fine.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-14T10:48:33Z

On Wed, May 13, 2026 at 11:43 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Some review comments for v33-0001.
>
> ======
> src/backend/catalog/aclchk.c
>
> pg_class_aclmask_ext:
>
> 1.
>   if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
> ACL_USAGE)) &&
> - IsSystemClass(table_oid, classForm) &&
> - classForm->relkind != RELKIND_VIEW &&
> + IsConflictClass(classForm) &&
>   !superuser_arg(roleid))
> - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
> + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
> + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
> ACL_TRUNCATE | ACL_USAGE)) &&
> + IsSystemClass(table_oid, classForm) &&
> + classForm->relkind != RELKIND_VIEW &&
> + !superuser_arg(roleid))
> + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
>
> The new patched code seems a bit repetitive.
>
> How about refactoring like below and putting the comments where they belong.
>
> if (!superuser_arg(roleid))
> {
>   if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
>   {
>     if (IsSystemClass(table_oid, classForm) &&
>       classForm->relkind != RELKIND_VIEW)
>     {
>       /*
>        * Deny anyone permission to update a system catalog unless
>        * pg_authid.rolsuper is set.
>        *
>        * As of 7.4 we have some updatable system views; those shouldn't be
>        * protected in this way.  Assume the view rules can take care of
>        * themselves.  ACL_USAGE is if we ever have system sequences.
>        */
>       mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
> ACL_USAGE);
> }
>     else if (IsConflictClass(classForm))
>     {
>       /*
>        * For conflict log tables, we allow non-superusers to perform DELETE
>        * and TRUNCATE for maintenance, while still restricting INSERT,
>        * UPDATE, and USAGE.
>        */
>       mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
>     }
>   }
> }
> else
> {
>   /* Superusers bypass all permission-checking. */
>
>   ReleaseSysCache(tuple);
>   return mask;
> }
>

It is okay to reduce the duplication here, but the check for
IsConflictClass should come first because IsSystemClass also contains a
similar check, though for a different reason.
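
A minimal sketch of why the order matters, assuming (as the remark above suggests) that a conflict log table can satisfy both tests. The SK_ACL_* bit values, the boolean parameters, and the function name are hypothetical stand-ins, not the real aclchk.c code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real ACL_* bits live in PostgreSQL headers. */
#define SK_ACL_INSERT   (1u << 0)
#define SK_ACL_UPDATE   (1u << 1)
#define SK_ACL_DELETE   (1u << 2)
#define SK_ACL_TRUNCATE (1u << 3)
#define SK_ACL_USAGE    (1u << 4)

#define SK_ACL_ALL_WRITE \
    (SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_DELETE | SK_ACL_TRUNCATE | SK_ACL_USAGE)

/*
 * Simplified model of the mask filtering.  The booleans stand in for
 * superuser_arg(), IsConflictClass(), and the IsSystemClass()/non-view test.
 */
static uint32_t
sketch_filter_mask(uint32_t mask, bool is_superuser,
                   bool is_conflict_class, bool is_system_class_non_view)
{
    /* Superusers, and requests without write bits, pass through unchanged. */
    if (is_superuser || (mask & SK_ACL_ALL_WRITE) == 0)
        return mask;

    if (is_conflict_class)
    {
        /* Checked first: keep DELETE and TRUNCATE for maintenance. */
        mask &= ~(SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_USAGE);
    }
    else if (is_system_class_non_view)
    {
        /* Ordinary system catalogs lose every write privilege. */
        mask &= ~SK_ACL_ALL_WRITE;
    }

    return mask;
}
```

If the two branches were swapped, a relation matching both tests would fall into the system-catalog branch and lose DELETE and TRUNCATE as well.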

>
> 8.
> Despite some of these just being static, I am beginning to think that
> the "conflict" specific CLT code might be more appropriate to be put
> in conflict.c, along with the CLT schema etc.
>
> e.g. functions like:
> - create_conflict_log_table_tupdesc
> - create_conflict_log_table
> - GetLogDestination
>

+1.

>
> ======
> src/backend/replication/logical/conflict.c
>
> 13.
> +const char *const ConflictLogDestNames[] = {
> + [CONFLICT_LOG_DEST_LOG] = "log",
> + [CONFLICT_LOG_DEST_TABLE] = "table",
> + [CONFLICT_LOG_DEST_ALL] = "all"
> +};
> +
> +const ConflictLogColumnDef v[] = {
> + { .attname = "relid",            .atttypid = OIDOID },
> + { .attname = "schemaname",       .atttypid = TEXTOID },
> + { .attname = "relname",          .atttypid = TEXTOID },
> + { .attname = "conflict_type",    .atttypid = TEXTOID },
> + { .attname = "remote_xid",       .atttypid = XIDOID },
> + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> + { .attname = "remote_origin",    .atttypid = TEXTOID },
> + { .attname = "replica_identity", .atttypid = JSONOID },
> + { .attname = "remote_tuple",     .atttypid = JSONOID },
> + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> +};
>
...
>
> 13c.
> TBH, I preferred code how it used to be -- where all the CLT constants
> and structs and enums and schemas were kept together. Now they are
> split across conflict.h and conflict.c making it harder to read as
> well as introducing need for static asserts that were not needed
> before.
>

That would lead to unnecessary inclusions at multiple places where it
is not required. See my 4th comment in email [1].

[1]: https://www.postgresql.org/message-id/CAA4eK1LhOHa_TEznw%2BgFoq%2Bw0vMvvsDG2g9Xq8Mwa8xZMY73og%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-15T10:29:23Z

On Thu, May 14, 2026 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote:
>
> I have fixed the other comments except this one. I will think more
> about this and reply separately. The attached patch has the changes
> for the rest of the comments. The patch also addresses comments from
> [1].
>

Thanks for the patches. Please find below comments for v34 patch-set.

1) Bug report:
When disable_on_error = true for a subscription, and an ERROR-level
conflict such as insert_exists occurs, the subscription gets disabled
without logging the conflict into the CLT.

patch-001:
2)  execMain.c:
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("cannot modify or insert data into conflict log table \"%s\"",
+ RelationGetRelationName(resultRel)),

Is ERRCODE_INSUFFICIENT_PRIVILEGE the right error code here? It gives
the impression that the operation might succeed with higher
privileges. Should we instead use ERRCODE_WRONG_OBJECT_TYPE, similar
to nearby restrictions?

3) No notice is shown when the conflict log table is removed after
changing conflict_log_destination from table/all to log.
Example:
postgres=# alter subscription sub1 set (conflict_log_destination = table);
NOTICE:  created conflict log table
"pg_conflict.pg_conflict_log_16400" for subscription "sub1"
ALTER SUBSCRIPTION

postgres=# alter subscription sub1 set (conflict_log_destination = log);
ALTER SUBSCRIPTION

We already show a notice when changing from log to table/all. Should
we add a similar notice as in DROP SUBSCRIPTION for above case?

patch-003:
4) conflict.c: ReportApplyConflict()
+ bool log_dest_clt = false;
+ bool log_dest_logfile;

log_dest_logfile should also be initialized to false, since for dest
== CONFLICT_LOG_DEST_TABLE, it is never assigned.

5) worker_internal.h
 extern PGDLLIMPORT List *table_states_not_ready;

+extern XLogRecPtr remote_final_lsn;
+extern TimestampTz remote_commit_ts;
+extern TransactionId remote_xid;

Should these new declarations also use PGDLLIMPORT?

6) worker.c: apply_handle_stream_start()
+ remote_xid = stream_xid;
+ remote_final_lsn = InvalidXLogRecPtr;
+ remote_commit_ts = 0;
+
  if (!TransactionIdIsValid(stream_xid))
  ereport(ERROR,
  (errcode(ERRCODE_PROTOCOL_VIOLATION),

Should the remote_xid assignment be moved after the validity check? We
could move all three assignments below the check.

patch-005:
7) subscriptioncmds.c: DropSubscription()
+ if (OidIsValid(form->subconflictlogrelid))
+ {
+ char *conflictrelname = get_rel_name(form->subconflictlogrelid);
....
"form" is being used here after the tuple it points to has already been deleted:

  /* Remove the tuple from catalog. */
  CatalogTupleDelete(rel, &tup->t_self);

  ReleaseSysCache(tup);

I think form->subconflictlogrelid should be saved beforehand and then
used later, similar to subid.
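
A minimal sketch of the "save first, then release" pattern suggested in comment 7. It is a self-contained model, not the DropSubscription() code: the struct, the helper names, and the use of malloc()/free() in place of the catalog tuple and its release are all assumptions for illustration.

```c
#include <stdlib.h>

/* Stand-in for PostgreSQL's Oid type; an assumption for this sketch. */
typedef unsigned int SketchOid;

/* Hypothetical, cut-down model of a pg_subscription tuple. */
typedef struct SketchSubscriptionForm
{
    SketchOid   subid;
    SketchOid   subconflictlogrelid;
} SketchSubscriptionForm;

/* Allocate a stand-in tuple; returns NULL if allocation fails. */
static SketchSubscriptionForm *
sketch_make_form(SketchOid subid, SketchOid conflictlogrelid)
{
    SketchSubscriptionForm *form = malloc(sizeof(*form));

    if (form != NULL)
    {
        form->subid = subid;
        form->subconflictlogrelid = conflictlogrelid;
    }
    return form;
}

/*
 * Copy the field we still need, then release the tuple.  free() stands in
 * for the CatalogTupleDelete()/ReleaseSysCache() pair quoted above; after
 * it, "form" must not be dereferenced.  Returns 0 (treated as an invalid
 * OID in this sketch) when given NULL.
 */
static SketchOid
sketch_release_form(SketchSubscriptionForm *form)
{
    SketchOid   saved_relid;

    if (form == NULL)
        return 0;

    saved_relid = form->subconflictlogrelid;    /* save first */
    free(form);                                 /* then release */
    return saved_relid;                         /* use only the saved copy */
}
```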

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-18T09:12:14Z

While testing with all patches (v34) applied, I noticed an unexpected
behavior change in \dRs+ output.

I see that we changed the \dRs+ output format to display "Conflict log
table:" separately instead of as a column, but the output ordering
also seems to have changed.

Without the patch, both \dRs and \dRs+ display subscriptions in
alphabetical order by name. With this patch, \dRs still shows the
expected ordering, but \dRs+ now appears ordered by subscription
creation order (likely subid) instead of subscription name.

This is not a major issue, but it seems to break consistency. For
example, \dRp+ has a similar display pattern, but its output is
ordered by pub-name.

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-18T12:35:40Z

On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Dilip/Vignesh.
>
> Some review comments for v33-0001.
>
> ======
> src/backend/executor/execMain.c
>
> 11.
> +
> + /*
> + * Conflict log tables are managed by the system to record logical
> + * replication conflicts.  We allow DELETE and TRUNCATE to permit users to
> + * manually prune these logs, but manual data insertion or modification
> + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the
> + * system-generated logs.
> + *
> + * Since TRUNCATE is handled as a separate utility command, we only need
> + * to explicitly permit CMD_DELETE here.
> + */
> + if (IsConflictNamespace(RelationGetNamespace(resultRel)) &&
> + operation != CMD_DELETE)
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + errmsg("cannot modify or insert data into conflict log table \"%s\"",
> + RelationGetRelationName(resultRel)),
> + errdetail("Conflict log tables are system-managed and only support
> cleanup via DELETE or TRUNCATE.")));
>
> It somehow feels backwards to check "operation != CMD_DELETE", with
> the obscure comment that TRUNCATE is handled elsewhere.
>
> How about just check if "(operation == CMD_INSERT || operation ==
> CMD_UPDATE || operation == CMD_MERGE)".

I felt the existing check is OK here, as the comment says "we only need
to explicitly permit CMD_DELETE". Are you seeing any commands other than
INSERT, UPDATE & MERGE possible here?

> ~~~
>
> 12.
> +
> + /*
> + * Conflict log tables are managed by the system to record logical
> + * replication conflicts.  We do not allow locking rows in CONFLICT
> + * relations.
> + */
> + if (IsConflictNamespace(RelationGetNamespace(rel)))
> + ereport(ERROR,
> + (errcode(ERRCODE_WRONG_OBJECT_TYPE),
> + errmsg("cannot lock rows in conflict log table \"%s\"",
> + RelationGetRelationName(rel))));
>
> I was not sure what was meant by "CONFLICT relations.".
>
> Does it mean "... relations in the pg_conflict schema.". Anyway, is
> there any value to that 2nd sentence because it is much the same text
> as the errmsg.

Yes, it means the relations in the pg_conflict schema. I removed the second sentence.

> ======
> src/backend/replication/logical/conflict.c
>
> 13.
> +const char *const ConflictLogDestNames[] = {
> + [CONFLICT_LOG_DEST_LOG] = "log",
> + [CONFLICT_LOG_DEST_TABLE] = "table",
> + [CONFLICT_LOG_DEST_ALL] = "all"
> +};
> +
> +const ConflictLogColumnDef v[] = {
> + { .attname = "relid",            .atttypid = OIDOID },
> + { .attname = "schemaname",       .atttypid = TEXTOID },
> + { .attname = "relname",          .atttypid = TEXTOID },
> + { .attname = "conflict_type",    .atttypid = TEXTOID },
> + { .attname = "remote_xid",       .atttypid = XIDOID },
> + { .attname = "remote_commit_lsn",.atttypid = LSNOID },
> + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
> + { .attname = "remote_origin",    .atttypid = TEXTOID },
> + { .attname = "replica_identity", .atttypid = JSONOID },
> + { .attname = "remote_tuple",     .atttypid = JSONOID },
> + { .attname = "local_conflicts",  .atttypid = JSONARRAYOID }
> +};
>
> 13a.
> Both these arrays could benefit with some comments.

Added comments

> ~
>
> 13b.
> In the ConflictLogSchema, would it be better to keep all those
> "remote_" columns grouped together, instead of being broken by
> "replica_identity".

Modified

> ~
>
> 13c.
> TBH, I preferred code how it used to be -- where all the CLT constants
> and structs and enums and schemas were kept together. Now they are
> split across conflict.h and conflict.c making it harder to read as
> well as introducing need for static asserts that were not needed
> before.

No change done, as this change is required. Amit has given the
explanation at [1].

Rest of the comments were addressed. The attached v35 version patch
has the changes for the same.

I have kept the review comment fixes as separate patches so that Dilip
can merge them when convenient. Due to the additional review-fix
patches, Dilip's original patches 0001, 0002, 0003, and 0004 are now
renumbered as 0001, 0003, 0005, and 0007 respectively.  The
intermediate patches contain the review comment fixes:
a) 0002 contains fixes for 0001 b) 0004 contains fixes for 0003 c)
0006 contains fixes for 0005 d) 0008 contains fixes for 0007

Also comments from [2] and [3] are addressed in this.

[1] - https://www.postgresql.org/message-id/CAA4eK1Ki5mBgAubBkUPcBjN%3DO1jeT3AUh7vLQBm8w%3DgQiHO5Jw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPv%2BBK7iM3KZNcrXzPMYagrL2O4%3D46Hi3stT2XT-RmsjRQ%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAJpy0uARoVZkTA_PV4PB1MtUXZMyxkun1Cg5o1YOxaKsCbWxCA%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-19T06:31:51Z

On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote:

Hi Vignesh.

Thanks for addressing lots of my previous v33-0001 review comments.

Here are some more review comments for the combined v35-0001/0002 patches.

======
Commit message.

1.
If the user chooses to enable logging to a table (by selecting 'table'
or 'all'),
an internal logging table named pg_conflict_log_<subid> is automatically
created within a dedicated, system-managed 'pg_conflict' namespace to prevent
users from manually dropping or altering it. This also prevents accidental
name collisions with user-created tables. This table is linked to the
subscription via an internal dependency, ensuring it is automatically dropped
when the subscription is removed

~

The internal name of the CLT table has changed slightly, so the commit
message needs updating.

======
src/backend/catalog/heap.c

2.
+ * Don't allow creating relations in pg_catalog/pg_conflict directly, even
+ * though it is allowed to move user defined relations there. Semantics
+ * with search paths including pg_catalog are too confusing for now.

I think "pg_catalog/pg_conflict" could be misinterpreted. Better to
say "pg_catalog or pg_conflict".

~~~

3.
+ if (!allow_system_table_mods && IsNormalProcessingMode())
+ {
+ if ((IsCatalogNamespace(relnamespace) && relkind != RELKIND_INDEX) ||
+ IsToastNamespace(relnamespace))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to create \"%s.%s\"",
+ get_namespace_name(relnamespace), relname),
+ errdetail("System catalog modifications are currently disallowed.")));
+ }
+
+ if (IsConflictNamespace(relnamespace))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to create \"%s.%s\"",
+ get_namespace_name(relnamespace), relname),
+ errdetail("Conflict schema modifications are currently disallowed.")));
+ }
+ }

The curly braces are unnecessary for those nested if-blocks.

======
src/backend/catalog/namespace.c

CheckSetNamespace:

4.
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot move objects into or out of pg_conflict schema")));

Is it better to say "the pg_conflict schema"?

======
src/backend/commands/subscriptioncmds.c

5.
-

Looks like this was some unintended whitespace removal just after the
static function forward declarations.

~~~

AlterSubscription:

6.
+ bool want_table = (opts.conflictlogdest == CONFLICT_LOG_DEST_TABLE ||
+    opts.conflictlogdest == CONFLICT_LOG_DEST_ALL);
+ bool has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE ||
+ old_dest == CONFLICT_LOG_DEST_ALL);

These should be simplified using the new macro: CONFLICTS_LOGGED_TO_TABLE.

======
src/backend/commands/tablecmds.c

DropSubscription:

7.
+ ObjectAddress object;

This can be declared at the lower scope closer to where it is actually used.

~~~

8.
+ if (OidIsValid(form->subconflictlogrelid))
+ {
+ char *conflictrelname = get_rel_name(form->subconflictlogrelid);
+ /*

There should be a blank line before that block comment.

> > ======
> > src/backend/executor/execMain.c
> >
> > 11.
> > +
> > + /*
> > + * Conflict log tables are managed by the system to record logical
> > + * replication conflicts.  We allow DELETE and TRUNCATE to permit users to
> > + * manually prune these logs, but manual data insertion or modification
> > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the
> > + * system-generated logs.
> > + *
> > + * Since TRUNCATE is handled as a separate utility command, we only need
> > + * to explicitly permit CMD_DELETE here.
> > + */
> > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) &&
> > + operation != CMD_DELETE)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > + errmsg("cannot modify or insert data into conflict log table \"%s\"",
> > + RelationGetRelationName(resultRel)),
> > + errdetail("Conflict log tables are system-managed and only support
> > cleanup via DELETE or TRUNCATE.")));
> >
> > It somehow feels backwards to check "operation != CMD_DELETE", with
> > the obscure comment that TRUNCATE is handled elsewhere.
> >
> > How about just check if "(operation == CMD_INSERT || operation ==
> > CMD_UPDATE || operation == CMD_MERGE)".
>
> I felt the existing is ok here, as it is mentioned "we only need to
> explicitly permit CMD_DELETE" . Are you seeing any commands other than
> INSERT, UPDATE & MERGE possible here?

9.
YMMV.

No, I'm not seeing other commands. I guess the current code works.

My previous review comment was because:
1. IMO, conditions that are positive instead of negative are easier to
comprehend
2. It would make the checking code consistent with the comment
“(INSERT, UPDATE, MERGE) is prohibited”, and with the error message
“cannot modify or insert”.
3. Doing it the suggested way eliminates any need to mention that
strange comment “Since TRUNCATE…”

CheckValidRowMarkRel:

10.
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot lock rows in conflict log table \"%s\"",

Should that say "in the"?

======
src/backend/replication/logical/conflict.c

> > 13c.
> > TBH, I preferred code how it used to be -- where all the CLT constants
> > and structs and enums and schemas were kept together. Now they are
> > split across conflict.h and conflict.c making it harder to read as
> > well as introducing need for static asserts that were not needed
> > before.
>
> No change done, as this change is required. Amit has given the
> explanation at [1].
>

By refactoring the conflict functions into conflict.c, it means nearly
everything is now kept together anyhow, just in the .c file instead of
the .h file :-)

~~~

11.
+StaticAssertDecl(lengthof(ConflictLogSchema) == NUM_CONFLICT_ATTRS,
+ "ConflictLogSchema length mismatch");
+
+

11a.
In fact, NUM_CONFLICT_ATTRS is not used outside this file, so now it
can be defined right here. It means the assertion is unnecessary.

Instead, the code here should look like:
#define NUM_CONFLICT_ATTRS lengthof(ConflictLogSchema)
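
A minimal, self-contained sketch of that idea. The sketch_lengthof macro mirrors the usual sizeof-based element count, and the struct is cut down to the column name only; all "Sketch"/"SKETCH" names are hypothetical, and the column names are copied from the array quoted in the earlier review.

```c
#include <stddef.h>

/* Element count of a true array (not a pointer), computed at compile time. */
#define sketch_lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* Cut-down, hypothetical version of ConflictLogColumnDef (name only). */
typedef struct SketchColumnDef
{
    const char *attname;        /* Column name */
} SketchColumnDef;

/* Column names as quoted earlier in the thread. */
static const SketchColumnDef SketchConflictLogSchema[] = {
    {"relid"},
    {"schemaname"},
    {"relname"},
    {"conflict_type"},
    {"remote_xid"},
    {"remote_commit_lsn"},
    {"remote_commit_ts"},
    {"remote_origin"},
    {"replica_identity"},
    {"remote_tuple"},
    {"local_conflicts"}
};

/* Derived from the array, so it cannot drift out of sync with it. */
#define SKETCH_NUM_CONFLICT_ATTRS sketch_lengthof(SketchConflictLogSchema)
```

With the count derived this way, adding or removing a column changes the constant automatically, which is why a separate length assertion would no longer be needed.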

~

11b.
Unnecessary extra whitespace here.

~~~

create_conflict_log_table:

12.
+ Assert(relid != InvalidOid);

Favour using the macro OidIsValid(relid).

======
src/include/catalog/pg_subscription.h

13.
 #include "catalog/objectaddress.h"
 #include "parser/parse_node.h"
+#include "replication/conflict.h"

I am guessing that this #include is probably no longer needed, because
you removed the extern function that was using ConflictLogDest.

======
src/include/replication/conflict.h

14.
+/* Structure to hold metadata for one column of the conflict log table */
+typedef struct ConflictLogColumnDef
+{
+ const char *attname;    /* Column name */
+ Oid         atttypid;   /* Data type OID */
+} ConflictLogColumnDef;
+

AFAIK, you can move this into conflict.c now because it is only used
in that file.

~~~

15.
+/* The single source of truth for the conflict log table schema */
+extern PGDLLIMPORT const ConflictLogColumnDef ConflictLogSchema[];
+

AFAIK, you can remove this because all usages are now within conflict.c.

~~~

16.
+#define NUM_CONFLICT_ATTRS 11
+

Move this into conflict.c -- see an earlier review comment.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-19T14:00:43Z

On Fri, 15 May 2026 at 15:59, Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Thanks for the patches. Please find below comments for v34 patch-set.
>
> patch-003:
> 4) conflict.c: ReportApplyConflict()
> + bool log_dest_clt = false;
> + bool log_dest_logfile;
>
> log_dest_logfile should also be initialized to false, since for dest
> == CONFLICT_LOG_DEST_TABLE, it is never assigned.

It no longer needs to be initialized, as it is now assigned before it
is used in this function.

> 5) worker_internal.h
>  extern PGDLLIMPORT List *table_states_not_ready;
>
> +extern XLogRecPtr remote_final_lsn;
> +extern TimestampTz remote_commit_ts;
> +extern TransactionId remote_xid;
>
> Should these new declarations also use PGDLLIMPORT?

I think these don't require PGDLLIMPORT, as they will be used only
within the same apply worker backend process.

The rest of the comments are handled; the attached v36 patches have
the changes.
The comment from [1] has also been fixed in this version.

[1] - https://www.postgresql.org/message-id/CABdArM5XgHE4-HCryi54BxENgNqLDn81cMCUyqBdCeF9d3dbvA%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-20T06:32:11Z

On Tue, May 19, 2026 at 7:30 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Rest of the comments are handled, the attached v36 version patches
> have the changes for the same.
> Also the comment from [1] has been fixed in this version.
>

Thanks Vignesh.

A few comments for 0001 and 002 combined (I merged them for ease of
review):

1)

+ * IsConflictLogTableClass
+ * True iff namespace is pg_conflict.
+ *
+ * Does not perform any catalog accesses.
  */
 bool
-IsConflictClass(Form_pg_class reltuple)
+IsConflictLogTableClass(Form_pg_class reltuple)

I think this function checks whether the reltuple is a CLT, rather
than whether the namespace is pg_conflict.
We should change this comment. See IsToastRelation, IsToastClass.

Suggestion:
True iff Form_pg_class tuple represents a subscription-specific
Conflict Log Table.

2)

Both DropSubscription and AlterSubscription have the code below to drop the CLT:

+ if (OidIsValid(subconflictlogrelid))
+ {
+ char *conflictrelname = get_rel_name(subconflictlogrelid);
+
+ /*
+ * Conflict log tables are recorded as internal dependencies of the
+ * subscription.  We must drop the dependent objects before the
+ * subscription itself is removed.  By using
+ * PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the conflict log
+ * table is reaped while the subscription remains for the final
+ * deletion step.
+ */
+ ObjectAddressSet(object, SubscriptionRelationId, subid);
+ performDeletion(&object, DROP_CASCADE,
+ PERFORM_DELETION_INTERNAL |
+ PERFORM_DELETION_SKIP_ORIGINAL);
+
+ ereport(NOTICE,
+ errmsg("dropped conflict log table \"%s\" for subscription \"%s\"",
+    get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname),
+    subname));
+ }

Why don't we create a function
drop_conflict_log_table(subconflictlogrelid) and use it in both places?

3)
+++ b/src/backend/commands/subscriptioncmds.c

+#include "catalog/heap.h"
+#include "catalog/pg_am_d.h"

It compiles now without these inclusions. 002 should remove these as well.

4)
AlterSubscription:
+ bool want_table = (opts.conflictlogdest == CONFLICT_LOG_DEST_TABLE ||
+    opts.conflictlogdest == CONFLICT_LOG_DEST_ALL);
+ bool has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE ||
+ old_dest == CONFLICT_LOG_DEST_ALL);


Shall we replace the checks at both places with CONFLICTS_LOGGED_TO_TABLE?
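
To illustrate the suggestion, here is a minimal, self-contained sketch. The
enum values and the macro body are assumptions made for illustration only
(the patch's real definitions are not shown in this mail), and
needs_new_conflict_log_table() is a hypothetical helper, not a function from
the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the patch's enum; the real values may differ. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG,
    CONFLICT_LOG_DEST_TABLE,
    CONFLICT_LOG_DEST_ALL
} ConflictLogDest;

/* Assumed definition of the macro named in the review comment. */
#define CONFLICTS_LOGGED_TO_TABLE(dest) \
    ((dest) == CONFLICT_LOG_DEST_TABLE || (dest) == CONFLICT_LOG_DEST_ALL)

/* Hypothetical helper: does ALTER SUBSCRIPTION have to create a new CLT? */
static bool
needs_new_conflict_log_table(ConflictLogDest old_dest, ConflictLogDest new_dest)
{
    bool        want_table = CONFLICTS_LOGGED_TO_TABLE(new_dest);
    bool        has_oldtable = CONFLICTS_LOGGED_TO_TABLE(old_dest);

    return want_table && !has_oldtable;
}
```

With the macro, both booleans are computed the same way, so a future change
to what counts as "logged to a table" would only need to touch one place.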

~~

003,004: No comments

~~

Reviewing further.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-20T09:35:13Z

On Tue, 19 May 2026 at 12:02, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Vignesh.
>
> Thanks for addressing lots of my previous v33-0001 review comments.
>
> Here are some more review comments for the combined v35-0001/0002 patches.
>
> ======
> Commit message.
>
> 1.
> If the user chooses to enable logging to a table (by selecting 'table'
> or 'all'),
> an internal logging table named pg_conflict_log_<subid> is automatically
> created within a dedicated, system-managed 'pg_conflict' namespace to prevent
> users from manually dropping or altering it. This also prevents accidental
> name collisions with user-created tables. This table is linked to the
> subscription via an internal dependency, ensuring it is automatically dropped
> when the subscription is removed
>
> ~
>
> The internal name of the CLT table has changed slightly, so the commit
> message needs updating.

This change is done as part of 0002 review comment fixes patch. I will
let Dilip do this change when he merges the review comment fixes patch
to 0001 patch.

> > > ======
> > > src/backend/executor/execMain.c
> > >
> > > 11.
> > > +
> > > + /*
> > > + * Conflict log tables are managed by the system to record logical
> > > + * replication conflicts.  We allow DELETE and TRUNCATE to permit users to
> > > + * manually prune these logs, but manual data insertion or modification
> > > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the
> > > + * system-generated logs.
> > > + *
> > > + * Since TRUNCATE is handled as a separate utility command, we only need
> > > + * to explicitly permit CMD_DELETE here.
> > > + */
> > > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) &&
> > > + operation != CMD_DELETE)
> > > + ereport(ERROR,
> > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> > > + errmsg("cannot modify or insert data into conflict log table \"%s\"",
> > > + RelationGetRelationName(resultRel)),
> > > + errdetail("Conflict log tables are system-managed and only support
> > > cleanup via DELETE or TRUNCATE.")));
> > >
> > > It somehow feels backwards to check "operation != CMD_DELETE", with
> > > the obscure comment that TRUNCATE is handled elsewhere.
> > >
> > > How about just check if "(operation == CMD_INSERT || operation ==
> > > CMD_UPDATE || operation == CMD_MERGE)".
> >
> > I felt the existing is ok here, as it is mentioned "we only need to
> > explicitly permit CMD_DELETE" . Are you seeing any commands other than
> > INSERT, UPDATE & MERGE possible here?
>
> 9.
> YMMV.
>
> No, I'm not seeing other commands. I guess the current code works.

I preferred the current way in this case.
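
For readers following the two options, a small sketch of the difference. The
enum below is a hypothetical local stand-in (its names only loosely mirror
the CMD_* values quoted above), and both functions are illustrative, not code
from the patch.

```c
#include <stdbool.h>

/* Hypothetical local enum; names only loosely mirror the CMD_* values. */
typedef enum SketchCmd
{
    SKETCH_CMD_SELECT,
    SKETCH_CMD_INSERT,
    SKETCH_CMD_UPDATE,
    SKETCH_CMD_DELETE,
    SKETCH_CMD_MERGE
} SketchCmd;

/* Style used in the patch: everything except DELETE is rejected. */
static bool
rejected_by_exclusion(SketchCmd op)
{
    return op != SKETCH_CMD_DELETE;
}

/* Reviewer's alternative: reject only the commands named explicitly. */
static bool
rejected_by_listing(SketchCmd op)
{
    return op == SKETCH_CMD_INSERT ||
           op == SKETCH_CMD_UPDATE ||
           op == SKETCH_CMD_MERGE;
}
```

The two predicates agree for INSERT, UPDATE, DELETE and MERGE. They differ
only for a command outside that set, which the exclusion form rejects by
default and the listing form lets through.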

> ======
> src/backend/replication/logical/conflict.c
>
> > > 13c.
> > > TBH, I preferred code how it used to be -- where all the CLT constants
> > > and structs and enums and schemas were kept together. Now they are
> > > split across conflict.h and conflict.c making it harder to read as
> > > well as introducing need for static asserts that were not needed
> > > before.
> >
> > No change done, as this change is required. Amit has given the
> > explanation at [1].
> >
>
> By refactoring the conflict functions into conflict.c, it means nearly
> everything is now kept together anyhow, just in the .c file instead of
> the .h file :-)

No change done here because of the reason stated in the earlier mail.

Rest of the comments were fixed.
The attached v37 version patch has the changes for the same. Also
Peter's comments on the documentation patch from [1] and Shveta's
comments from [2] are addressed in the attached patch.

[1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-20T10:42:02Z

On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Rest of the comments were fixed.
> The attached v37 version patch has the changes for the same. Also
> Peter's comments on the documentation patch from [1] and Shveta's
> comments from [2] are addressed in the attached patch.
>
> [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com
>

I have not yet looked at v37. But here are a few comments on v36-005,
006. I have merged them and reviewed together.

1)
+#include "utils/fmgroids.h"
+#include "utils/json.h"

conflict.c compiles without the above inclusions.

2)
+ bool log_dest_clt = false;
+ bool log_dest_logfile;

A clearer name would be log_dest_table instead of
log_dest_clt here.

3)
@@ -6069,6 +6049,8 @@ DisableSubscriptionAndExit(void)
  */
  pgstat_report_subscription_error(MyLogicalRepWorker->subid);

+ ProcessPendingConflictLogTuple();

It is not obvious why we are trying to process the conflict tuple
while disabling the subscription. A comment would help here.

4)
tuple_table_slot_to_indextup_json():

+ indexDesc = index_open(indexid, NoLock);
+
+ build_index_datums_from_slot(estate, localrel, slot, indexDesc, values,
+ isnull);
+ tupdesc = RelationGetDescr(indexDesc);
+
+ /* Bless the tupdesc so it can be looked up by row_to_json. */
+ BlessTupleDesc(tupdesc);

We take the tuple descriptor from the index's relcache entry and pass it
directly to BlessTupleDesc, which internally changes it by assigning
tdtypmod. Is this intentional, i.e. do we want to change the index's
relcache entry directly? Shouldn't we copy it (CreateTupleDescCopy) and
then bless the copy?

5)
build_conflict_tupledesc() calls 'CreateTemplateTupleDesc' and blesses
the result each time a conflict is raised. Since the tuple descriptor
here is not going to change, IMO it would be better to create and bless
it once and reuse it every time. We can have a 'static' TupleDesc here.
Thoughts?
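
A rough sketch of the build-once-and-reuse idea, outside of PostgreSQL. The
struct, the field values and get_conflict_tupledesc() are all hypothetical
stand-ins; in the backend the cached descriptor would presumably also have
to be allocated in a long-lived memory context, which this sketch does not
model.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple descriptor. */
typedef struct SketchTupleDesc
{
    int         natts;
    bool        blessed;
} SketchTupleDesc;

/* Counts how often the descriptor is actually built (for illustration). */
static int  build_count = 0;

/* Build and "bless" the descriptor on first use, then hand back the cache. */
static SketchTupleDesc *
get_conflict_tupledesc(void)
{
    static SketchTupleDesc *cached = NULL;

    if (cached == NULL)
    {
        cached = malloc(sizeof(SketchTupleDesc));
        if (cached == NULL)
            abort();
        cached->natts = 5;      /* arbitrary placeholder attribute count */
        cached->blessed = true; /* stands in for the blessing step */
        build_count++;
    }

    return cached;
}
```

Every call after the first returns the same pointer, so the construction
and blessing cost is paid once per process rather than once per conflict.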

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-20T10:50:34Z

On Wed, 20 May 2026 at 15:05, vignesh C <vignesh21@gmail.com> wrote:
>
> Rest of the comments were fixed.
> The attached v37 version patch has the changes for the same. Also
> Peter's comments on the documentation patch from [1] and Shveta's
> comments from [2] are addressed in the attached patch.
>
> [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com
>
Hi Vignesh,
Here are some minor comments:

Comment for all patches.
1. In multiple places (code comments and test cases) we use the phrase
'internal conflict log table'.
Do we need the word 'internal'? I think 'conflict log table' is
sufficient.

Comments for 0002:
2. We can rename the schema pg_conflict to a different schema name.
Is it ok to hardcode the schema name to 'pg_conflict'?
-                errmsg("cannot move objects into or out of CONFLICT schema")));
+                errmsg("cannot move objects into or out of
pg_conflict schema")));

Example:
postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1;
ALTER SCHEMA
postgres=# ALTER TABLE t2 SET SCHEMA sc1;
ERROR:  cannot move objects into or out of pg_conflict schema

Comment for 0005/0006:
3.
static const char *const ConflictTypeNames[] = {
    [CT_INSERT_EXISTS] = "insert_exists",
    [CT_UPDATE_ORIGIN_DIFFERS] = "update_origin_differs",
    [CT_UPDATE_EXISTS] = "update_exists",
    [CT_UPDATE_MISSING] = "update_missing",
    [CT_DELETE_ORIGIN_DIFFERS] = "delete_origin_differs",
    [CT_UPDATE_DELETED] = "update_deleted",
    [CT_DELETE_MISSING] = "delete_missing",
    [CT_MULTIPLE_UNIQUE_CONFLICTS] = "multiple_unique_conflicts"
};
There are a few extra blank lines after the declaration of ConflictTypeNames.

Thanks,
Shlok Kyal

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-21T00:02:25Z

On Wed, May 20, 2026 at 8:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
..
> Comments for 0002:
> 2. We can rename the schema pg_conflict to a different schema name.
> Is it ok to hardcode the schema name to 'pg_conflict'?
> -                errmsg("cannot move objects into or out of CONFLICT schema")));
> +                errmsg("cannot move objects into or out of
> pg_conflict schema")));
>
> Example:
> postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1;
> ALTER SCHEMA
> postgres=# ALTER TABLE t2 SET SCHEMA sc1;
> ERROR:  cannot move objects into or out of pg_conflict schema
>

Yikes!

I am not sure that the error message is the problem here. There are
worse things that are similar to this. e.g. I found that you can do
the same trick of renaming the 'pg_catalog' schema, and it breaks
anything that refers to that schema by name -- all the internal SQL!!

test_pub=# ALTER SCHEMA pg_catalog RENAME TO mycatalog;
ALTER SCHEMA
test_pub=# \dRp+
ERROR:  relation "pg_catalog.pg_publication" does not exist
LINE 9: FROM pg_catalog.pg_publication
             ^

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-21T01:19:18Z

Hi Vignesh.

I checked the latest v37-0001/0002 patches combined.

My only comment is below.

======

1.
+/*
+ * drop_conflict_log_table
+ *      Drop the conflict log table associated with a subscription.
+ *
+ * The conflict log table is registered as an internal dependency of the
+ * subscription. This function removes the dependency by performing a
+ * cascading deletion on the subscription object, which in turn drops the
+ * associated conflict log table.
+ *
+ * This is used to clean up conflict log tables that are no longer required,
+ * preventing accumulation of stale or orphaned relations.
+ *
+ * NOTE:
+ * Only conflict log tables are currently managed via this internal dependency
+ * mechanism. If additional internal dependencies are introduced in future,
+ * this function may require refinement to avoid unintended deletions.
+ */
+void
+drop_conflict_log_table(Oid subid, char *subname, Oid subconflictlogrelid)
+{
+ ObjectAddress object;
+ char *conflictrelname;
+
+ conflictrelname = get_rel_name(subconflictlogrelid);
+
+ ObjectAddressSet(object, SubscriptionRelationId, subid);
+ performDeletion(&object, DROP_CASCADE,
+ PERFORM_DELETION_INTERNAL |
+ PERFORM_DELETION_SKIP_ORIGINAL);
+
+ ereport(NOTICE,
+ errmsg("dropped conflict log table \"%s\" for subscription \"%s\"",
+ get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname),
+ subname));
+}
+

IIUC, this is a function that drops the subscription dependencies via
cascade. Since the CLT happens to be the only such dependency, it gets
dropped.

The current implementation feels backwards to me. IMO, this is really
a subscription function, so it should be refactored to be called
something like 'drop_subscription_dependencies', and not be in the
conflicts.c file. Refactoring/renaming to what it *really* does means
you won't need to give the other warnings like "may require refinement
to avoid unintended deletions". Maybe the callers do not need to be
guarded anymore -- this code can check internally so that it only does
anything when there is a known CLT associated with the subscription.

Also, the function comment should make it clearer that
PERFORM_DELETION_SKIP_ORIGINAL means the parent subscription object is
not deleted.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-21T03:59:52Z

Hi Vignesh.

Thanks for addressing my review comments for the documentation.

Here is one more comment for the v37-0008/0009 (combined) docs patches

======
doc/src/sgml/logical-replication.sgml

1.
+      <row>
+       <entry><literal>replica_identity</literal></entry>
+       <entry><type>json</type></entry>
+       <entry>The JSON representation of the replica identity.</entry>
+      </row>
+      <row>

I think patch 0002 modified the CLT column order. This doc's table row
order should match the order of the CLT columns, so please compare
again with the schema defined by the latest conflict.c.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-21T04:57:10Z

Hi Vignesh

Some trivial review comments for the combined v37-0003/0004 (transfer
ownership) patches.

======
src/test/regress/sql/subscription.sql

1.
+ALTER SUBSCRIPTION regress_conflict_test1 owner to regress_subscription_user2;

/owner to/OWNER TO/

~~~

2.
+-- Restore the original subscription owner.
+ALTER SUBSCRIPTION regress_conflict_test1 owner to regress_subscription_user;

/owner to/OWNER TO/

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-21T05:51:02Z

Hi Vignesh,

Some minor review comments for patches v37-0005/0006 combined.

======
src/backend/replication/logical/conflict.c

1.
+/* Schema for the elements within the 'local_conflicts' JSON array */
+static const ConflictLogColumnDef LocalConflictSchema[] =
+{
+ { .attname = "xid",       .atttypid = XIDOID },
+ { .attname = "commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "origin",    .atttypid = TEXTOID },
+ { .attname = "key",       .atttypid = JSONOID },
+ { .attname = "tuple",     .atttypid = JSONOID }
+};
+
+#define NUM_LOCAL_CONFLICT_ATTRS lengthof(LocalConflictSchema)
+

IMO this belongs *below* the ConflictLogSchema[], which is where
'local_conflicts' attribute was introduced, instead of above it.

~~~

2.
+
+
 static int errcode_apply_conflict(ConflictType type);

~

There are some spurious blank lines here that should not be in the patch.

~~~

ProcessPendingConflictLogTuple:

3.
+ /* Open conflict log table and insert the tuple */
+ conflictlogrel = GetConflictLogDestAndTable(&dest);
+ Assert(CONFLICTS_LOGGED_TO_TABLE(dest));

Maybe here it's better to say Assert(conflictlogrel);

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-21T06:01:29Z

Amit, Vignesh,

A part of the 007 patch is about preserving the subscription OID. Another
thread (origin migration) also needs the same logic, as per the discussion
at [1]. There was also an old thread which already attempted preserving
the subscription OID at [2], but the idea was rejected at that time. Why
don't we resume that thread ([2]) and implement preserving the
subscription OID separately, as we now have multiple dependencies on it?
Thoughts?

[1]:  https://www.postgresql.org/message-id/CALDaNm2-uwpbJ8fnrssp%2BhORvOutsqRoZAsa05xVVzXe5Bt3bw%40mail.gmail.com
[2]:  https://www.postgresql.org/message-id/flat/CALDaNm2Wj63VcbB0SY2NECHr1mKM1YSaV1ZydrdQVxyox2O2hg%40mail.gmail.com

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-21T07:08:28Z

A few comments on v36-007:

1)

AlterSubscriptionConflictLogDestination
+ want_table = (logdest == CONFLICT_LOG_DEST_TABLE ||
+   logdest == CONFLICT_LOG_DEST_ALL);
+ has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE ||
+ old_dest == CONFLICT_LOG_DEST_ALL);

Shall we replace checks at both places with CONFLICTS_LOGGED_TO_TABLE?

2)
I think we can move 'AlterSubscriptionConflictLogDestination' into the
configuration patch itself (if needed). It is not directly used
anywhere in upgrade flow as such. IIUC, even if upgrade flow uses it,
it will only be used through AlterSubscription.

3)
AlterSubscriptionConflictLogDestination:

+ if (want_table && !has_oldtable)
+ {
+ char relname[NAMEDATALEN];
+
+ snprintf(relname, NAMEDATALEN, "pg_conflict_log_for_subid_%u", sub->oid);
+
+ /*
+ * In upgrade scenarios, the conflict log table already exists. Update
+ * the catalog to record the association.
+ */
+ relid = get_relname_relid(relname, PG_CONFLICT_NAMESPACE);
+ if (!OidIsValid(relid))
+ relid = create_conflict_log_table(sub->oid, sub->name, sub->owner);

So this function will now be used during upgrade, where the destination
is TABLE/ALL, as well as by a regular ALTER SUBSCRIPTION that changes the
destination from LOG to TABLE/ALL. In the upgrade case, we expect the
relid (CLT) to be present already, while in the regular case we don't
expect any CLT to be present.

The above code does not perform any such sanity checks. It should
distinguish the two cases and Assert/Error if the condition is the
opposite of what we expect.
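
One possible shape for that distinction, as a self-contained sketch. The
enum, the function and the binary_upgrade flag are hypothetical names used
only for illustration; how the real code would detect the upgrade case is
not settled in this mail.

```c
#include <stdbool.h>

/* Hypothetical outcome of looking up the CLT by name. */
typedef enum CltAction
{
    CLT_ACTION_CREATE,          /* regular path: no table yet, create it */
    CLT_ACTION_REUSE,           /* upgrade path: table exists, link it */
    CLT_ACTION_ERROR            /* state contradicts what the path expects */
} CltAction;

/* Decide what to do based on the calling context and the lookup result. */
static CltAction
resolve_conflict_log_table(bool binary_upgrade, bool table_exists)
{
    if (binary_upgrade)
        return table_exists ? CLT_ACTION_REUSE : CLT_ACTION_ERROR;

    return table_exists ? CLT_ACTION_ERROR : CLT_ACTION_CREATE;
}
```

Here a missing table during upgrade and a pre-existing table during a
regular ALTER SUBSCRIPTION are both treated as errors instead of being
silently papered over.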

4)
Also, I do not understand how the upgrade can ever pass this check:

+ if (want_table && !has_oldtable)

It is not obvious how the upgrade flow will pass this check because
theoretically both the old and new setup should have the exact same
configuration; i.e. if  'want_table'  is true, 'has_oldtable' will be
true. We can add a comment to clarify the situation here.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-21T07:11:03Z

On Thu, 21 May 2026 at 05:32, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 8:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ..
> > Comments for 0002:
> > 2. We can rename the schema pg_conflict to a different schema name.
> > Is it ok to hardcode the schema name to 'pg_conflict'?
> > -                errmsg("cannot move objects into or out of CONFLICT schema")));
> > +                errmsg("cannot move objects into or out of
> > pg_conflict schema")));
> >
> > Example:
> > postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1;
> > ALTER SCHEMA
> > postgres=# ALTER TABLE t2 SET SCHEMA sc1;
> > ERROR:  cannot move objects into or out of pg_conflict schema
> >
>
> Yikes!
>
> I am not sure that the error message is the problem here. There are
> worse things that are similar to this. e.g. I found that you can do
> the same trick of renaming the 'pg_catalog' schema, and it breaks
> anything that refers to that schema by name -- all the internal SQL!!
>
> test_pub=# ALTER SCHEMA pg_catalog RENAME TO mycatalog;
> ALTER SCHEMA
> test_pub=# \dRp+
> ERROR:  relation "pg_catalog.pg_publication" does not exist
> LINE 9: FROM pg_catalog.pg_publication
>              ^

I noticed this behavior with several other commands as well. For example:
postgres=# ALTER SCHEMA pg_catalog RENAME TO test;
ALTER SCHEMA
postgres=# \d
ERROR:  relation "pg_catalog.pg_class" does not exist
LINE 6: FROM pg_catalog.pg_class c
             ^
postgres=# \dn
ERROR:  relation "pg_catalog.pg_namespace" does not exist
LINE 4: FROM pg_catalog.pg_namespace n
             ^

I observed similar behavior when creating a table in the renamed schema:
postgres=# CREATE TABLE test.t1(c1 int);
ERROR:  schema "pg_catalog" does not exist
LINE 1: CREATE TABLE test.t1(c1 int);
                     ^

Given that this appears to be a broader issue related to renaming
pg_catalog, I think we can skip handling this case for now. If we
decide to address it, it would be better to handle it together with
the general pg_catalog rename behavior.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-21T07:13:15Z

On Wed, 20 May 2026 at 15:05, vignesh C <vignesh21@gmail.com> wrote:
>
> Rest of the comments were fixed.
> The attached v37 version patch has the changes for the same. Also
> Peter's comments on the documentation patch from [1] and Shveta's
> comments from [2] are addressed in the attached patch.
>
> [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com
>
Hi Vignesh,

I reviewed the v37-0007 patch. Here are some review comments:

1. subinfo[i].subconflictlogdest is assigned multiple times:

+       if (PQgetisnull(res, i, i_sublogdestination))
+           subinfo[i].subconflictlogdest = NULL;
+       else
+           subinfo[i].subconflictlogdest =
+               pg_strdup(PQgetvalue(res, i, i_sublogdestination));
+
+       if (PQgetisnull(res, i, i_sublogdestination))
+           subinfo[i].subconflictlogdest = NULL;
+       else
+           subinfo[i].subconflictlogdest =
+               pg_strdup(PQgetvalue(res, i, i_sublogdestination));

2. I think we should add a version check before:
+   appendPQExpBuffer(query,
+                     "\n\nALTER SUBSCRIPTION %s SET
(conflict_log_destination = %s);\n",
+                     qsubname,
+                     subinfo->subconflictlogdest);

When we run pg_dump against a Postgres 18 server, we get the following output:
ALTER SUBSCRIPTION sub2 SET (conflict_log_destination = (null));
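
A minimal sketch of the kind of guard meant here, written as standalone C
rather than pg_dump code. emit_conflict_log_destination() is a hypothetical
helper; it guards on the value being NULL, which is assumed to be what an
older server yields, and a server version check could serve the same
purpose.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not pg_dump code: write the ALTER SUBSCRIPTION
 * statement into buf only when a destination value is available.
 * Returns true if a statement was written, false if it was skipped.
 */
static bool
emit_conflict_log_destination(char *buf, size_t buflen,
                              const char *qsubname,
                              const char *subconflictlogdest)
{
    if (buflen == 0)
        return false;

    if (subconflictlogdest == NULL)
    {
        buf[0] = '\0';          /* nothing to emit for this subscription */
        return false;
    }

    snprintf(buf, buflen,
             "ALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n",
             qsubname, subconflictlogdest);
    return true;
}
```

With such a guard, a subscription dumped from a server that does not report
the option produces no ALTER SUBSCRIPTION line at all, instead of one
carrying a "(null)" value.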

Thanks,
Shlok Kyal

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-21T10:39:23Z

A few comments on the doc patches v36-008 and 009 combined:

1)
+       An array of JSON objects representing the local state for each
conflict attempt.

'each conflict attempt' looks misleading. We do not attempt to cause
conflicts; we attempt to apply, but it may result in conflicts.

Shall we rephrase to:
'An array of JSON objects representing the state of existing local
row(s) that caused the conflict.'

There could be multiple rows as well for multiple_unique_conflicts,
thus the 'row(s)'

2)
+   The <link linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link>
+   parameter automatically creates a dedicated conflict log table.

The 'conflict_log_destination' parameter does not create the table
automatically unless it is set to 'table' or 'all'. We should clarify
that. Suggestion:

When conflict_log_destination is set to table or all, a dedicated
conflict log table is created automatically.

3)
+   Conflicts that occur during replication are, by default, logged as
plain text

When we say 'Conflicts' here, we should make it a link to the '29.8.
Conflicts' section. That will make it clearer.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-22T04:51:21Z

On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Rest of the comments were fixed.
> The attached v37 version patch has the changes for the same. Also
> Peter's comments on the documentation patch from [1] and Shveta's
> comments from [2] are addressed in the attached patch.
>

Here are a few comments based on v37 testing:

1) Should we consider using TOAST tables for tuple-data columns like
remote_tuple and local_conflicts (the JSON columns)?
This may be a corner case, but if the tuple data becomes too large to
fit into an 8KB heap tuple, then the apply worker keeps failing while
inserting into the CLT with errors like:

  ERROR: row is too big: size 19496, maximum size 8160
  LOG: background worker "logical replication apply worker" (PID
41226) exited with exit code 1

I noticed that even disable_on_error=true does not disable the
subscription in this case. We can think about optimizations such as
deciding when TOAST tables should be created, or avoiding the error by
trimming/capping the data size before inserting into the CLT if we don't
want TOAST.
~~~

2) Currently, parallel apply workers do not seem to insert conflicts
into the CLT. The parallel worker logs the conflict to the logfile and
then exits with an error without handling CLT insertion.
A small test to reproduce this with a 't1' table subscription using a CLT table:
-- on publisher
ALTER SYSTEM SET logical_decoding_work_mem = '64kB';
SELECT pg_reload_conf();

-- Create a conflict scenario on subscriber: pre-insert a row that will conflict
INSERT INTO t1 VALUES (99999, 11);

-- on publisher: big transaction that hits the conflict
BEGIN;
INSERT INTO t1 SELECT i, i FROM generate_series(1, 10000) i;
INSERT INTO t1 VALUES (99999, 99); -- this conflicts
COMMIT;

logfile:
ERROR: conflict detected on relation "public.t1": conflict=insert_exists
DETAIL: Could not apply remote change: remote row (99999, 99).
Key already exists in unique index "t1_pkey", modified locally in
transaction 842 at 2026-05-21 21:10:51.497681+05:30: key (a)=(99999),
local row (99999, 42).
...
ERROR: logical replication parallel apply worker exited due to error
CONTEXT: processing remote data for replication origin "pg_16398"
during message type "INSERT" for replication target relation
"public.t1" in transaction 720
logical replication parallel apply worker
processing remote data for replication origin "pg_16398" during
message type "STREAM COMMIT" in transaction 720, finished at
0/01AC9758
LOG: subscription "sub1" has been disabled because of an error
ERROR: lost connection to the logical replication parallel apply worker
LOG: background worker "logical replication parallel worker" (PID
66271) exited with exit code 1
~~~

3) I think somewhere in patch-0005, the remote_tuple and
replica_identity columns may have been swapped.
The replica identity key seems to be written into the remote_tuple
column, while the remote slot row is written into replica_identity,
for example:

postgres=# select relname, conflict_type, remote_xid, remote_tuple,
replica_identity from pg_conflict_log_for_subid_16398;
relname | conflict_type | remote_xid | remote_tuple | replica_identity
---------+-----------------------+------------+--------------+------------------
t1 | insert_exists | 699 | | {"a":3,"b":11}
t1 | update_origin_differs | 700 | {"a":3} | {"a":3,"b":111}
(2 rows)

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-22T10:12:20Z

On Fri, May 22, 2026 at 10:21 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Rest of the comments were fixed.
> > The attached v37 version patch has the changes for the same. Also
> > Peter's comments on the documentation patch from [1] and Shveta's
> > comments from [2] are addressed in the attached patch.
> >
>
> Here are a few comments based on v37 testing:
>

Here are a few more review comments:
1) Patch-0001 + 0002:
In subscription.sql:
 -- Verify the table OID for reap check
 SELECT 'pg_conflict_log_for_subid_' || oid AS internal_tablename FROM
pg_subscription WHERE subname = 'regress_conflict_test1' \gset
 SET client_min_messages = WARNING;
 DROP SUBSCRIPTION regress_conflict_test1;
 -- should return NULL, meaning the conflict log table was reaped via dependency
 SELECT to_regclass(:'internal_tablename');

Here, internal_tablename becomes "pg_conflict_log_*" without the
pg_conflict. schema prefix. So, "SELECT
to_regclass(:'internal_tablename');" will always return NULL even if
the table still exists in the pg_conflict schema, which skips the
actual drop verification scenario.
Should we instead use:
   "SELECT 'pg_conflict.pg_conflict_log_' || oid AS internal_tablename..."
~~~

For Patch-0007:
2)
@@ -2067,9 +2095,31 @@ selectDumpableNamespace(NamespaceInfo *nsinfo,
Archive *fout)
 static void
 selectDumpableTable(TableInfo *tbinfo, Archive *fout)
....
+ if (strcmp(tbinfo->dobj.namespace->dobj.name, "pg_conflict") == 0)
...
+ * Dump pg_conflict tables only during binary upgrade. The schema
+ * is assumed to already exist.
+ */
+ tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;
....
+ else
+ tbinfo->dobj.dump = DUMP_COMPONENT_NONE;
+ }
+

For conflict log tables during binary upgrade, we set:
   tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;

but then execution falls through to the later logic:
...
  else
    tbinfo->dobj.dump = tbinfo->dobj.namespace->dobj.dump_contains;

which seems to overwrite the earlier 'dobj.dump' value. So it looks
like DUMP_COMPONENT_DEFINITION may never actually survive here.
Should we return from this block instead of continuing further?
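
To make the fall-through concern concrete, here is a minimal standalone
C sketch of the early-return shape. The struct types, the function name
and the flag values are simplified stand-ins for illustration only; they
are not pg_dump's real definitions.

```c
#include <string.h>

/* Stand-in flags, simplified for illustration. */
#define DUMP_COMPONENT_NONE        0
#define DUMP_COMPONENT_DEFINITION  (1 << 0)
#define DUMP_COMPONENT_ALL         0xFFFF

/* Hypothetical, cut-down versions of the namespace and table info. */
typedef struct
{
    const char *name;
    unsigned    dump_contains;
} NamespaceSketch;

typedef struct
{
    const NamespaceSketch *nsp;
    unsigned    dump;
} TableSketch;

static void
select_dumpable_table_sketch(TableSketch *tbl, int binary_upgrade)
{
    if (strcmp(tbl->nsp->name, "pg_conflict") == 0)
    {
        tbl->dump = binary_upgrade ? DUMP_COMPONENT_DEFINITION
                                   : DUMP_COMPONENT_NONE;
        /* Return here so the generic assignment below cannot overwrite it. */
        return;
    }

    /* Generic case: inherit the setting from the containing namespace. */
    tbl->dump = tbl->nsp->dump_contains;
}
```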

3)
@@ -5656,6 +5757,11 @@ dumpSubscription(Archive *fout, const
SubscriptionInfo *subinfo)

  appendPQExpBufferStr(query, ");\n");

+ appendPQExpBuffer(query,
+   "\n\nALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n",
+   qsubname,
+   subinfo->subconflictlogdest);
+

The above ALTER SUBSCRIPTION command seems to be dumped
unconditionally for every subscription.
Since the default value during subscription creation is already
"subconflictlogdest = 'log' ", should we emit this command only when
subconflictlogdest is non-NULL and not 'log'?
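
The condition suggested above could be sketched as a small predicate.
This is only an illustration: the helper name
needs_conflict_log_dest_clause is hypothetical, and it assumes the
destination is carried as a plain C string, as in the quoted hunk.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true only when the stored destination differs from the
 * default, i.e. it is set and is not "log". A caller would emit the
 * ALTER SUBSCRIPTION ... SET (conflict_log_destination = ...) command
 * only in that case.
 */
static bool
needs_conflict_log_dest_clause(const char *subconflictlogdest)
{
    return subconflictlogdest != NULL &&
        strcmp(subconflictlogdest, "log") != 0;
}
```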

4)
+ if (PQgetisnull(res, i, i_sublogdestination))
+ subinfo[i].subconflictlogdest = NULL;
+ else
+ subinfo[i].subconflictlogdest =
+ pg_strdup(PQgetvalue(res, i, i_sublogdestination));
+
+ if (PQgetisnull(res, i, i_sublogdestination))
+ subinfo[i].subconflictlogdest = NULL;
+ else
+ subinfo[i].subconflictlogdest =
+ pg_strdup(PQgetvalue(res, i, i_sublogdestination));
+
  /* Decide whether we want to dump it */

Looks like the same if-else block is repeated twice here.

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-23T06:10:40Z

On Wed, 20 May 2026 at 16:12, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Rest of the comments were fixed.
> > The attached v37 version patch has the changes for the same. Also
> > Peter's comments on the documentation patch from [1] and Shveta's
> > comments from [2] are addressed in the attached patch.
> >
> > [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com
> > [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com
> >
>
> I have not yet looked at v37. But here are a few comments on v36-005,
> 006. I have merged them and reviewed together.
>
> 1)
> +#include "utils/fmgroids.h"
> +#include "utils/json.h"
>
> conflict.c compiles without above inclusions.
>
> 2)
> + bool log_dest_clt = false;
> + bool log_dest_logfile;
>
> A better and more clear name would be log_dest_table instead of
> log_dest_clt here.
>
> 3)
> @@ -6069,6 +6049,8 @@ DisableSubscriptionAndExit(void)
>   */
>   pgstat_report_subscription_error(MyLogicalRepWorker->subid);
>
> + ProcessPendingConflictLogTuple();
>
> It does not look obvious as in why we are trying to process
> conflict-tuple during disable-subscription? A comment will help here.
>
>
> 4)
> tuple_table_slot_to_indextup_json():
>
> + indexDesc = index_open(indexid, NoLock);
> +
> + build_index_datums_from_slot(estate, localrel, slot, indexDesc, values,
> + isnull);
> + tupdesc = RelationGetDescr(indexDesc);
> +
> + /* Bless the tupdesc so it can be looked up by row_to_json. */
> + BlessTupleDesc(tupdesc);
>
> We get the index's relcache pointer and pass it directly to
> BlessTupleDesc which internally changes it by assigning tdtypmod. Is
> this intentional i.e. do we want to change the relcache entry of index
> directly? Shouldn't we copy it (CreateTupleDescCopy) and then Bless
> it?
>
> 5)
> build_conflict_tupledesc() does 'CreateTemplateTupleDesc' and Bless it
> each time the conflict is raised. Since the tuple-descriptor here is
> not going to change, IMO, it will be better to create and bless it
> once and reuse it every time. We can have a 'static' TupleDesc here.
> Thoughts?

Thanks for the comments; they are addressed in the attached v38
version.
Apart from this, the comments from [1], [2], [3], [4], [5], [6], [7],
and [8] are also addressed.

[1] - https://www.postgresql.org/message-id/CAJpy0uC43NTKheuLo%2BMsHG7Sfh-QWQM9QP-EVPL5LChiPfisJw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CANhcyEU8qr9%2BPMU2Kn0qqZakVptVvRsbRu3Ee2Q40YX9aivXww%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAJpy0uB19XxfF2Yj1w%3DC90iVBLMHb%3DDMBZ1h3rqzJhEbTSwtag%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/CAHut%2BPvSaJAYwNUS9GnO6MCTfuPpVLdU1r8cZBf6gjGjvnbWpQ%40mail.gmail.com
[5] - https://www.postgresql.org/message-id/CAHut%2BPtUWTnUD8QpfmNpU8iU6Pg%2BE29nDALYAfMUudad8oYezw%40mail.gmail.com
[6] - https://www.postgresql.org/message-id/CAHut%2BPvW%3DFd-OSM6oe-9D3ycAG0qLfGEnaT%3DBUB%2BPMeUFeEAyQ%40mail.gmail.com
[7] - https://www.postgresql.org/message-id/CAHut%2BPu4ErbjstY86kWbKOepHn623Zp9MNiKW4DoMG3iVdG2fA%40mail.gmail.com
[8] - https://www.postgresql.org/message-id/CANhcyEUGoaSpJKDJaQfrQR6%2B-4%2B_PgQ%3D0DmZZztPAEheMkMw7w%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-23T15:40:08Z

On Wed, May 20, 2026 at 11:01 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Amit, Vignesh,
>
> A part of 007 patch is about preserving subscription-oid. Another
> thread (origin migration) also needs the same logic as per discussion
> at [1]. And there was an old thread which already attempted preserving
> subscription-oid at [2], but the idea was rejected at that time. Why
> don't we attempt to resume the same thread ([2]) and implement
> preserving subscription-oid as a separate thread as we now have
> multiple dependencies on it? Thoughts?
>

Agreed, but I think we can move the discussion/review to a separate
thread. However, at this stage, we can make initial patches ready and
then move to it.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-25T01:36:42Z

Hi Vignesh,

Some review comments for v38-0001/0002 combined.

======
src/backend/commands/subscriptioncmds.c

AlterSubscriptionConflictLogDestination:

1.
 * Update the conflict log table associated with a subscription when its
 * conflict log destination is changed.

Somehow, that 'its' sounded awkward to me.

SUGGESTION.
When the subscription's 'conflict_log_destination' is changed, update
the conflict log table if required.

~~~

2.
+ * If the new destination requires a conflict log table and none was previously
+ * required, this function validates an existing conflict log table identified
+ * by the subscription specific naming convention or creates a new one.

What does this mean: "validates an existing conflict log table". How
is there an "existing" CLT when you already said "none was previously
required". Maybe this needs more explanation. If it is talking about
"not already associated with another subscription", then it should
just say that.

Anyway, it seems validation that the comment claims this function is
doing is not done here at all, but is really done by
'create_conflict_log_table'.

~~~

3.
+static bool
+AlterSubscriptionConflictLogDestination(Subscription *sub,
+ ConflictLogDest logdest,
+ Oid *conflicttablerelid)

3a.
There was no forward declaration of this static function, but there
was for all the others.

~

3b.
Static functions should use snake-case names.

~~~

4.
Personally, I think it is more natural to read LEFT-TO-RIGHT,
OLD-THEN-NEW, etc., so I felt that the has_oldtable check should
always come before want_table.

Also, the 'ifs' seemed tricky because it's not obvious what
has/need_table combinations are missing. e.g. The following seems
easier to me. And probably lots of comments could be moved to here in
the code as well, instead of in the function comment.

SUGGESTION
if (has_old_table)
{
  /* There is a CLT already. */

  if (!want_table)
  {
    /* Remove it. */
    drop_subscription_dependencies(sub->oid, sub->name, sub->conflictlogrelid);
    update_relid = true;
  }
}
else
{
  /* There was no previous CLT. */

  if (want_table)
  {
    /* Create one. */
    relid = create_conflict_log_table(sub->oid, sub->name, sub->owner);
    update_relid = true;
  }
}
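
One way to see that all four has/want combinations are covered is to
write the decision as a pure function. This is a standalone sketch, not
code from the patch; the CltAction enum and the clt_action name are
hypothetical.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for the four has/want combinations. */
typedef enum
{
    CLT_KEEP_NONE,              /* no table before, none wanted */
    CLT_KEEP_EXISTING,          /* table exists and is still wanted */
    CLT_DROP,                   /* table exists but is no longer wanted */
    CLT_CREATE                  /* no table before, one is now wanted */
} CltAction;

/* Old state first, new state second, mirroring the OLD-THEN-NEW order. */
static CltAction
clt_action(bool has_old_table, bool want_table)
{
    if (has_old_table)
        return want_table ? CLT_KEEP_EXISTING : CLT_DROP;

    return want_table ? CLT_CREATE : CLT_KEEP_NONE;
}
```

Only CLT_DROP and CLT_CREATE would need the stored relid updated; the
other two outcomes leave it unchanged.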

~~~

5.
+static void
+drop_subscription_dependencies(Oid subid, char *subname,
+    Oid subconflictlogrelid)
+{
+ ObjectAddress object;
+ char *conflictrelname;
+
+ conflictrelname = get_rel_name(subconflictlogrelid);
+
+ /*
+ * By using PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the
+ * conflict log table is deleted while the subscription remains.
+ */
+ ObjectAddressSet(object, SubscriptionRelationId, subid);
+ performDeletion(&object, DROP_CASCADE,
+ PERFORM_DELETION_INTERNAL |
+ PERFORM_DELETION_SKIP_ORIGINAL);
+
+ ereport(NOTICE,
+ errmsg("dropped conflict log table \"%s\" for subscription \"%s\"",
+ get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname),
+ subname));
+}
+

One day, this function might do more than just remove the CLT, so IMO
all this function body should be within a check:

if (OidIsValid(subconflictlogrelid))
{
  /* Drop any dependent CLT */
  ...
}

~~~

DropSubscription

6.
+ if (OidIsValid(subconflictlogrelid))
+ drop_subscription_dependencies(subid, subname, subconflictlogrelid);

Make it unconditional. Instead, add the condition inside the
'drop_subscription_dependencies', per the previous review comment #5.

======
src/test/regress/sql/subscription.sql

7.
+--
+-- PUBLICATION: Verify conflict log tables are not publishable
+--
+-- pg_relation_is_publishable should return false for internal conflict log
+-- tables to prevent them from being accidentally included in publications
+--

Everywhere else, you had removed the word "internal", but this one
(maybe others?) was missed.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-25T04:18:43Z

On Sat, May 23, 2026 at 9:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 11:01 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Amit, Vignesh,
> >
> > A part of 007 patch is about preserving subscription-oid. Another
> > thread (origin migration) also needs the same logic as per discussion
> > at [1]. And there was an old thread which already attempted preserving
> > subscription-oid at [2], but the idea was rejected at that time. Why
> > don't we attempt to resume the same thread ([2]) and implement
> > preserving subscription-oid as a separate thread as we now have
> > multiple dependencies on it? Thoughts?
> >
>
> Agreed, but I think we can move the discussion/review to a separate
> thread. However, at this stage, we can make initial patches ready and
> then move to it.
>

Okay, works for me.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-25T04:42:53Z

On Fri, 22 May 2026 at 15:42, Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> Here are a few more review comments:
> 1) Patch-0001 + 0002:
> In subscription.sql:
>  -- Verify the table OID for reap check
>  SELECT 'pg_conflict_log_for_subid_' || oid AS internal_tablename FROM
> pg_subscription WHERE subname = 'regress_conflict_test1' \gset
>  SET client_min_messages = WARNING;
>  DROP SUBSCRIPTION regress_conflict_test1;
>  -- should return NULL, meaning the conflict log table was reaped via dependency
>  SELECT to_regclass(:'internal_tablename');
>
> Here, internal_tablename becomes "pg_conflict_log_*" without the
> pg_conflict. schema prefix. So, "SELECT
> to_regclass(:'internal_tablename');" will always return NULL even if
> the table still exists in the pg_conflict schema, which skips the
> actual drop verification scenario.
> Should we instead use:
>    "SELECT 'pg_conflict.pg_conflict_log_' || oid AS internal_tablename..."
> ~~~
>
> For Patch-0007:
> 2)
> @@ -2067,9 +2095,31 @@ selectDumpableNamespace(NamespaceInfo *nsinfo,
> Archive *fout)
>  static void
>  selectDumpableTable(TableInfo *tbinfo, Archive *fout)
> ....
> + if (strcmp(tbinfo->dobj.namespace->dobj.name, "pg_conflict") == 0)
> ...
> + * Dump pg_conflict tables only during binary upgrade. The schema
> + * is assumed to already exist.
> + */
> + tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;
> ....
> + else
> + tbinfo->dobj.dump = DUMP_COMPONENT_NONE;
> + }
> +
>
> For conflict log tables during binary upgrade, we set:
>    tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;
>
> but then execution falls through to the later logic:
> ...
>   else
>     tbinfo->dobj.dump = tbinfo->dobj.namespace->dobj.dump_contains;
>
> which seems to overwrite the earlier 'dobj.dump' value. So it looks
> like DUMP_COMPONENT_DEFINITION may never actually survive here.
> Should we return from this block instead of continuing further?
>
> 3)
> @@ -5656,6 +5757,11 @@ dumpSubscription(Archive *fout, const
> SubscriptionInfo *subinfo)
>
>   appendPQExpBufferStr(query, ");\n");
>
> + appendPQExpBuffer(query,
> +   "\n\nALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n",
> +   qsubname,
> +   subinfo->subconflictlogdest);
> +
>
> The above ALTER SUBSCRIPTION command seems to be dumped
> unconditionally for every subscription.
> Since the default value during subscription creation is already
> "subconflictlogdest = 'log' ", should we emit this command only when
> subconflictlogdest is non-NULL and not 'log'?
>
> 4)
> + if (PQgetisnull(res, i, i_sublogdestination))
> + subinfo[i].subconflictlogdest = NULL;
> + else
> + subinfo[i].subconflictlogdest =
> + pg_strdup(PQgetvalue(res, i, i_sublogdestination));
> +
> + if (PQgetisnull(res, i, i_sublogdestination))
> + subinfo[i].subconflictlogdest = NULL;
> + else
> + subinfo[i].subconflictlogdest =
> + pg_strdup(PQgetvalue(res, i, i_sublogdestination));
> +
>   /* Decide whether we want to dump it */
>
> Looks like the same if-else block is repeated twice here.

Thanks for the comments; the attached v39 version patch has the
changes for the same.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-26T05:53:51Z

On Mon, 25 May 2026 at 07:07, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Hi Vignesh,
>
> Some review comments for v38-0001/0002 combined.
>
> ======
> src/backend/commands/subscriptioncmds.c
>
> AlterSubscriptionConflictLogDestination:
>
> 1.
>  * Update the conflict log table associated with a subscription when its
>  * conflict log destination is changed.
>
> Somehow, that 'its' sounded awkward to me.
>
> SUGGESTION.
> When the subscription's 'conflict_log_destination' is changed, update
> the conflict log table if required.
>
> ~~~
>
> 2.
> + * If the new destination requires a conflict log table and none was previously
> + * required, this function validates an existing conflict log table identified
> + * by the subscription specific naming convention or creates a new one.
>
> What does this mean: "validates an existing conflict log table". How
> is there an "existing" CLT when you already said "none was previously
> required". Maybe this needs more explanation. If it is talking about
> "not already associated with another subscription", then it should
> just say that.
>
> Anyway, it seems validation that the comment claims this function is
> doing is not done here at all, but is really done by
> 'create_conflict_log_table'.
>
> ~~~
>
> 3.
> +static bool
> +AlterSubscriptionConflictLogDestination(Subscription *sub,
> + ConflictLogDest logdest,
> + Oid *conflicttablerelid)
>
> 3a.
> There was no forward declaration of this static function, but there
> was for all the others.
>
> ~
>
> 3b.
> Static functions should use snake-case names.
>
> ~~~
>
> 4.
> Personally, I think it is more natural to read LEFT-TO-RIGHT,
> OLD-THEN-NEW, etc., so I felt that the has_oldtable check should
> always come before want_table.
>
> Also, the 'ifs' seemed tricky because it's not obvious what
> has/need_table combinations are missing. e.g. The following seems
> easier to me. And probably lots of comments could be moved to here in
> the code as well, instead of in the function comment.
>
> SUGGESTION
> if (has_old_table)
> {
>   /* There is a CLT already. */
>
>   if (!want_table)
>   {
>     /* Remove it. */
>     drop_subscription_dependencies(sub->oid, sub->name, sub->conflictlogrelid);
>     update_relid = true;
>   }
> }
> else
> {
>   /* There was no previous CLT. */
>
>   if (want_table)
>   {
>     /* Create one. */
>     relid = create_conflict_log_table(sub->oid, sub->name, sub->owner);
>     update_relid = true;
>   }
> }
>
> ~~~
>
> 5.
> +static void
> +drop_subscription_dependencies(Oid subid, char *subname,
> +    Oid subconflictlogrelid)
> +{
> + ObjectAddress object;
> + char *conflictrelname;
> +
> + conflictrelname = get_rel_name(subconflictlogrelid);
> +
> + /*
> + * By using PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the
> + * conflict log table is deleted while the subscription remains.
> + */
> + ObjectAddressSet(object, SubscriptionRelationId, subid);
> + performDeletion(&object, DROP_CASCADE,
> + PERFORM_DELETION_INTERNAL |
> + PERFORM_DELETION_SKIP_ORIGINAL);
> +
> + ereport(NOTICE,
> + errmsg("dropped conflict log table \"%s\" for subscription \"%s\"",
> + get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname),
> + subname));
> +}
> +
>
> One day, this function might do more than just remove the CLT, so IMO
> all this function body should be within a check:
>
> if (OidIsValid(subconflictlogrelid))
> {
>   /* Drop any dependent CLT */
>   ...
> }
>
> ~~~
>
> DropSubscription
>
> 6.
> + if (OidIsValid(subconflictlogrelid))
> + drop_subscription_dependencies(subid, subname, subconflictlogrelid);
>
> Make it unconditional. Instead, add the condition inside the
> 'drop_subscription_dependencies', per the previous review comment #5.
>
> ======
> src/test/regress/sql/subscription.sql
>
> 7.
> +--
> +-- PUBLICATION: Verify conflict log tables are not publishable
> +--
> +-- pg_relation_is_publishable should return false for internal conflict log
> +-- tables to prevent them from being accidentally included in publications
> +--
>
> Everywhere else, you had removed the word "internal", but this one
> (maybe others?) was missed.

Thanks for the comments; these are addressed in the attached v40 version patch.

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-26T06:28:43Z

Hi Vignesh.

I had only one trivial review comment for v40-0001/0002 combined.

======
src/backend/commands/subscriptioncmds.c

1.
+ if (OidIsValid(subconflictlogrelid))
+ {
+ ObjectAddress object;
+ char *conflictrelname;
+
+ /* Drop any dependent conflict log table */
+ conflictrelname = get_rel_name(subconflictlogrelid);

That "Drop any..." comment doesn't have anything to do with the
statement that follows it. I think this comment belongs outside the
if.

e.g.
/* Drop any dependent conflict log table */
if (OidIsValid(subconflictlogrelid))
{
  ...
}

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-26T09:38:19Z

On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Thanks for the comments, the attached v39 version patch has the
> changes for the same.
>

I have not yet looked at v40, but please find a few comments on
v39-0001 and 0002 merged together.

1)
heap_create:
+ errdetail("Conflict schema modifications are currently disallowed.")));
LookupCreationNamespace:
+ errmsg("cannot move objects into or out of the pg_conflict schema")));

Can we make it consistent throughout: either use 'Conflict schema' in
both places or 'pg_conflict schema'? Since the existing messages in these
two functions use names like 'System catalog', 'TOAST schema' etc., I
think we can use 'Conflict schema' in both places.
What do others think about this?

2)
drop_subscription_dependencies():

+ conflictrelname = get_rel_name(subconflictlogrelid);

We can actually have a sanity check that we got the CLT using the relid.
Assert(conflictrelname != NULL);

3)
+ /*
+ * Special handling for the JSON array type for proper
+ * TupleDescInitEntry call.
+ */
+ if (type_oid == JSONARRAYOID)
+ type_oid = get_array_type(JSONOID);

Why do we have this special handling? Do we expect that 'type_oid' can
be different from JSONARRAYOID if we use get_array_type? On debugging,
I found it to be same pre and post get_array_type()

4)
Do we need to have CommandCounterIncrement() after
heap_create_with_catalog() in create_conflict_log_table()? I think
even if we are not doing any table_open etc for CLT in same
transaction, we should call CommandCounterIncrement() (to be
consistent with other such calls of heap_create_with_catalog and to
make it future proof). Thoughts?

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-26T22:50:23Z

On Tue, May 26, 2026 at 7:38 PM shveta malik <shveta.malik@gmail.com> wrote:
...
> 1)
> heap_create:
> + errdetail("Conflict schema modifications are currently disallowed.")));
> LookupCreationNamespace:
> + errmsg("cannot move objects into or out of the pg_conflict schema")));
>
> Can we make it same through-out, either we use 'Conflict schema' at
> both the places or pg_conflict schema.  Since in these 2 functions, in
> previous messages, we are using names like 'System catalog', 'TOAST
> schema' etc, I think we can use Conflict schema at both the places.
> What do others think on this?
>

The suggested name of "Conflict schema" LGTM. My only concern was that
a user may not know what that refers to. OTOH, things like
"System catalog" have 100s of mentions and whole documentation
chapters dedicated to them. If we go with "Conflict schema", then the
documentation needs to also consistently use that term, describe what
it is for, and make it very easy to look up and discover that
"Conflict schema" is 'pg_conflict'.

Currently (in patches 0008/9) there is very little explanation even
about what pg_conflict is, apart from just observing in passing that
the CLT gets written to that "dedicated namespace". It seems a bit
backwards describing the parent schema by the contents: Instead of
saying when there is a CLT it gets written there, IMO it should be the
other way around, and say there is a "Conflict schema" which is where
the CLTs (if any) reside.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-27T04:09:01Z

On Wed, May 27, 2026 at 4:20 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Tue, May 26, 2026 at 7:38 PM shveta malik <shveta.malik@gmail.com> wrote:
> ...
> > 1)
> > heap_create:
> > + errdetail("Conflict schema modifications are currently disallowed.")));
> > LookupCreationNamespace:
> > + errmsg("cannot move objects into or out of the pg_conflict schema")));
> >
> > Can we make it same through-out, either we use 'Conflict schema' at
> > both the places or pg_conflict schema.  Since in these 2 functions, in
> > previous messages, we are using names like 'System catalog', 'TOAST
> > schema' etc, I think we can use Conflict schema at both the places.
> > What do others think on this?
> >
>
> The suggested name of "Conflict schema" LGTM. My only concern was that
> a user may not know where that is referring to. OTOH, things like
> "System catalog" have 100s of mentions and whole documentation
> chapters dedicated to them. If we go with "Conflict schema", then the
> documentation needs to also consistently use that term, describe what
> it is for, and make it very easy to look up and discover that
> "Conflict schema" is 'pg_conflict'.

I agree that if we use 'Conflict schema' in the error messages, we
need to refer to it the same way in the docs. Let's wait for others'
opinions on this too.

>
> Currently (in patches 0008/9) there is very little explanation even
> about what pg_conflict is, apart from just observing in passing that
> the CLT gets written to that "dedicated namespace". It seems a bit
> backwards describing the parent schema by the contents: Instead of
> saying when there is a CLT it gets written there, IMO it should be the
> other way around, and say there is a "Conflict schema" which is where
> the CLTs (if any) reside.

Yes, the suggestion makes sense. I will look at the doc patch again for this.

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-27T08:34:37Z

On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> >
> > Thanks for the comments, the attached v39 version patch has the
> > changes for the same.
> >
>
> I have not yet looked at v40, but please find a few comments on
> v39-0001 and 0002 merged together.
> 4)
> Do we need to have CommandCounterIncrement() after
> heap_create_with_catalog() in create_conflict_log_table()? I think
> even if we are not doing any table_open etc for CLT in same
> transaction, we should call CommandCounterIncrement() (to be
> consistent with other such calls of heap_create_with_catalog and to
> make it future proof). Thoughts?

I felt this is not required as we are not doing a table open on the
newly created table.

I have fixed the rest of the comments. The attached v41 version patch
has the changes for the same.  Additionally the comments from [1] have
also been fixed.

[1] - https://www.postgresql.org/message-id/CAHut%2BPvB3rUs2ccUxJ1q1YEmvtHN3HJGSEjT4Cbc%3D5pjoGO9Yg%40mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

shveta malik <shveta.malik@gmail.com> — 2026-05-27T10:38:40Z

I have not yet looked at v41. Here are the comments for v40

0003 and 0004: No comments.

0004 and 0005:


1)
In build_local_conflicts_json_array(), we have these:

+ json_datum = heap_copy_tuple_as_datum(tuple, tupdesc);
+
+ /*
+ * Build the higher level JSON datum in format described in function
+ * header.
+ */
+ json_datum = DirectFunctionCall1(row_to_json, json_datum);

We have first allocation to 'json_datum' via
heap_copy_tuple_as_datum() and then second via row_to_json() call. So
we are overwriting first allocation. Which memory context are we using
here for this allocation? IIUC, if the conflict is non-error one, we
may accumulate these memory chunks in long running worker loop which
may gradually bloat the memory. Let me know if my understanding is
wrong.

Same situation in tuple_table_slot_to_indextup_json and
tuple_table_slot_to_json_datum as well.

2)
Same in ReportApplyConflict(), if elevel is not ERROR, should we worry
about freeing 'err_detail' after error-reporting or does some
short-lived context handle it?
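
To illustrate the concern in both points, here is a toy standalone C
sketch of a per-conflict scratch area that is reset after each report,
so allocations cannot pile up across a long-running loop. It is only an
analogy: Arena, arena_alloc, arena_reset and log_conflicts_sketch are
hypothetical stand-ins, not PostgreSQL's memory context API, and whether
the patch already runs these calls in a short-lived context is exactly
the open question.

```c
#include <stddef.h>

/* Toy bump allocator standing in for a short-lived memory context. */
typedef struct
{
    char        buf[1024];
    size_t      used;
} Arena;

static void *
arena_alloc(Arena *a, size_t n)
{
    void       *p;

    n = (n + 7) & ~(size_t) 7;  /* round the request up to 8 bytes */
    if (a->used + n > sizeof(a->buf))
        return NULL;            /* out of scratch space */
    p = a->buf + a->used;
    a->used += n;
    return p;
}

static void
arena_reset(Arena *a)
{
    a->used = 0;                /* release everything at once */
}

/*
 * Simulate reporting nconflicts non-error conflicts. Each one makes two
 * allocations (a tuple copy, then its JSON form). Returns the peak
 * number of bytes in use at any point.
 */
static size_t
log_conflicts_sketch(Arena *a, int nconflicts)
{
    size_t      peak = 0;

    for (int i = 0; i < nconflicts; i++)
    {
        (void) arena_alloc(a, 100); /* stand-in for the tuple datum */
        (void) arena_alloc(a, 200); /* stand-in for the JSON datum */
        if (a->used > peak)
            peak = a->used;
        arena_reset(a);         /* without this, usage grows per conflict */
    }
    return peak;
}
```

With the reset in place the peak usage stays at the cost of a single
conflict, regardless of how many are reported.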

thanks
Shveta

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T21:26:47Z

On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > >
> > > Thanks for the comments, the attached v39 version patch has the
> > > changes for the same.
> > >
> >
> > I have not yet looked at v40, but please find a few comments on
> > v39-0001 and 0002 merged together.
> > 4)
> > Do we need to have CommandCounterIncrement() after
> > heap_create_with_catalog() in create_conflict_log_table()? I think
> > even if we are not doing any table_open etc for CLT in same
> > transaction, we should call CommandCounterIncrement() (to be
> > consistent with other such calls of heap_create_with_catalog and to
> > make it future proof). Thoughts?
>
> I felt this is not required as we are not doing a table open on the
> newly created table.
>

Okay, command counter increment would be required here if we further
access that relation in the same command.  I think I am facing a
related problem w.r.t newly created subscription. After applying first
six patches, the create subscription fails as follows:
postgres=# create subscription sub1 connection 'dbname=postgres'
publication pub1 with (conflict_log_destination='all');
ERROR:  dependent subscription was concurrently dropped

I debugged and found that we get the above ERROR when we try to look up
a subscription that has not been created yet. It seems to happen because
the dependency is recorded against a subscription that does not yet
exist. This raises the question of why we create the conflict log table
before the subscription; at the least, this needs some comments.

*
+ if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
+ {
+ if (IsConflictLogTableClass(classForm))
+ {
+ /*
+ * For conflict log tables, allow non-superusers to perform
+ * DELETE and TRUNCATE for cleanup and maintenance. Also allow
+ * INSERT and UPDATE to pass ACL checks so that later checks
+ * can raise the dedicated "cannot modify or insert data into
+ * conflict log table" error instead of a generic permission
+ * denied error. Still restrict USAGE for non-superusers.
+ */
+ mask &= ~(ACL_USAGE);

I see the point of giving a specific error instead of a generic error
but this functionality is used by pg_class_aclmask() which is an
exposed function. If we go with your proposed change, isn't there a
risk that some extension or outside core-code using pg_class_aclmask()
won't invoke that later functionality (CheckValidResultRel())? If we
decide to go this way then we can change this comment as proposed in
the attached?
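
The risk described above can be sketched with a toy two-layer model. This is only an illustration under stated assumptions: the bit values, toy_aclmask() and toy_modification_allowed() are hypothetical stand-ins, not pg_class_aclmask() or CheckValidResultRel(), and superuser handling is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical privilege bits for this sketch; not PostgreSQL's ACL_* values. */
#define TOY_INSERT   (1u << 0)
#define TOY_UPDATE   (1u << 1)
#define TOY_DELETE   (1u << 2)
#define TOY_TRUNCATE (1u << 3)
#define TOY_USAGE    (1u << 4)

/*
 * Toy version of the ACL layer: return the requested privileges that are
 * granted. For a conflict log table only USAGE is stripped, mirroring the
 * quoted hunk, so INSERT and UPDATE can still be reported as held.
 */
uint32_t
toy_aclmask(uint32_t requested, uint32_t granted, bool is_conflict_log_table)
{
    if (is_conflict_log_table)
        requested &= ~TOY_USAGE;
    return requested & granted;
}

/*
 * Toy version of the later, dedicated check: INSERT or UPDATE on a
 * conflict log table is rejected here, with its own error in the real code.
 */
bool
toy_modification_allowed(uint32_t operation, bool is_conflict_log_table)
{
    if (is_conflict_log_table && (operation & (TOY_INSERT | TOY_UPDATE)))
        return false;
    return true;
}
```

A caller that consults only toy_aclmask() sees INSERT as permitted on the conflict log table; the rejection happens only if the caller also goes through toy_modification_allowed(). That is the gap an extension could fall into if it relies on the ACL result alone.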

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T22:09:45Z

On Wed, May 27, 2026 at 3:38 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> 2)
> Same in ReportApplyConflict(), if elevel is not ERROR, should we worry
> about freeing 'err_detail' after error-reporting or does some
> short-lived context handle it?
>

Isn't this the case even without this patch? If so, this can be
investigated separately.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T23:08:11Z

On Tue, May 26, 2026 at 2:38 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> 2)
> drop_subscription_dependencies():
>
> + conflictrelname = get_rel_name(subconflictlogrelid);
>
> We can actually have a sanity check that we got the CLT using the relid.
> Assert(conflictrelname != NULL);
>

elog will suit this place better as this can't be a direct coding
mistake. I see that at other places we used elog. See
if (result == NULL)
    elog(ERROR, "cache lookup failed for index %u", indexId);

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Peter Smith <smithpb2250@gmail.com> — 2026-05-28T01:20:03Z

Hi Vignesh.

Here are some review comments for the v41-0008/9 combined (docs) patch.

======
doc/src/sgml/ddl.sgml

(5.11.6. The Conflict Schema)

1.
+   <para>
+    Similarly, the <literal>pg_conflict</literal> schema (sometimes referred to
+    as the <emphasis>conflict schema</emphasis>) contains system managed
+    conflict log tables used for logical replication conflict tracking. These
+    tables are created and maintained by the system and are not intended for
+    direct user manipulation. Unlike <literal>pg_catalog</literal>, the
+    <literal>pg_catalog</literal> schema is not implicitly included in the
+    search path, so objects within it must be referenced explicitly or by
+    adjusting the search path.
+   </para>

1a.
/Similarly, the/The/

~

1b.
IMO don't say "sometimes".

Also, case. /conflict schema/Conflict schema/

~

1c.
"conflict log tables" -- I think it will be helpful if this includes a
link to "29.8.2. Table-based logging #".

~

1d.
"Unlike <literal>pg_catalog</literal>, the
<literal>pg_catalog</literal> schema..."

typo. That 2nd pg_catalog should say pg_conflict.

======
doc/src/sgml/glossary.sgml

2.
+  <glossentry id="glossary-conflict-schema">
+   <glossterm>conflict schema</glossterm>
+   <glossdef>
+    <para>
+     The <literal>pg_conflict</literal> schema that contains system-managed
+     conflict log tables for logical replication. These tables are created
+     and maintained automatically by the system and are not intended for
+     direct user manipulation. See <xref linkend="ddl-schemas-conflict"/>.
+    </para>
+   </glossdef>
+  </glossentry>
+

case. /conflict schema/Conflict schema/

======
doc/src/sgml/logical-replication.sgml

(29.2. Subscription)

3.
+   automatically manages a dedicated <firstterm>conflict log table</firstterm>,
+   which is created an dropped along with the subscription. This significantly
+   improves post-mortem analysis and operational visibility of the replication
+   setup.

typo.  /created an dropped/created and dropped/

~~~

(29.8.2. Table-based logging)

4.
+    a dedicated conflict log table will be automatically created. This table is
+    created in the <literal>pg_conflict</literal> namespace. The name of the

Instead of "<literal>pg_conflict</literal> namespace", this should now
say "Conflict schema" and have a link to that new docs section.

======
doc/src/sgml/ref/create_subscription.sgml

(Parameters - conflict_log_destination)

5.
+             named <literal>pg_conflict_log_for_subid_&lt;subid&gt;</literal>
+             in the <literal>pg_conflict</literal> schema. This allows for easy

Same as review comment #4. Instead of "<literal>pg_conflict</literal>
schema", this should now say "Conflict schema" and have a link to that
new docs section.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-28T14:49:14Z

On Wed, 27 May 2026 at 14:04, vignesh C <vignesh21@gmail.com> wrote:
>
>
> I have fixed the rest of the comments. The attached v41 version patch
> has the changes for the same.  Additionally the comments from [1] have
> also been fixed.

I was evaluating whether the existing pg_upgrade changes for conflict
log tables can handle the addition of new columns in a future release.
To validate this, I performed the following:
Added two new columns to the conflict log table:
v20_new_col1 TEXT
v20_new_col2 TEXT

These changes are present in patch '0001'.

For adding new columns during binary upgrade, the following
version-specific logic is required in 'pg_dump':
ALTER TABLE pg_conflict.pg_conflict_log_for_subid_oid
ADD COLUMN v20_new_col1 TEXT;

ALTER TABLE pg_conflict.pg_conflict_log_for_subid_oid
ADD COLUMN v20_new_col2 TEXT;

These changes are included in patch '0001'.
One important point here is that when 'ALTER TABLE ... ADD COLUMN' is
run, the server does not rewrite existing rows on disk. Instead, it
only updates the system catalog with the new column metadata.

While selecting data from the table, the server handles this as follows:
1. Deform what is physically present - 'slot_deform_heap_tuple()'
reads the raw tuple bytes from disk, but only up to 't_natts', which
is the number of columns recorded in the tuple header at the time that
row was inserted. It stops there because the tuple has no physical
data for columns added later.
2. Fill in what is missing -   After deforming the tuple, if the
number of populated columns is still less than the number of columns
requested by the query, it calls 'slot_getmissingattrs()' to cover the
gap.   Since the new columns were added with no default value,
'slot_getmissingattrs()' sets:
tts_isnull[attnum] = true;

This is how NULL is returned for the newly added columns in existing rows.
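
The two steps above can be sketched in a standalone toy model. It assumes integer-only columns and no column defaults, and the names ToyTuple, ToySlot and toy_deform are hypothetical; this is not the code of slot_deform_heap_tuple() or slot_getmissingattrs(), only the same idea in miniature.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ATTS 8

/*
 * Hypothetical stand-in for an on-disk tuple: it stores only the columns
 * that existed when the row was inserted.
 */
typedef struct ToyTuple
{
    int         natts;          /* plays the role of t_natts */
    int         values[TOY_MAX_ATTS];
} ToyTuple;

/* Hypothetical stand-in for the slot's value and null arrays. */
typedef struct ToySlot
{
    int         values[TOY_MAX_ATTS];
    bool        isnull[TOY_MAX_ATTS];
} ToySlot;

/*
 * Step 1: copy the attributes physically present in the tuple.
 * Step 2: mark every remaining requested attribute as NULL, which is
 * what happens for columns added later without a default.
 */
void
toy_deform(const ToyTuple *tup, int natts_requested, ToySlot *slot)
{
    int         attnum;

    assert(natts_requested <= TOY_MAX_ATTS);

    for (attnum = 0; attnum < tup->natts && attnum < natts_requested; attnum++)
    {
        slot->values[attnum] = tup->values[attnum];
        slot->isnull[attnum] = false;
    }
    for (; attnum < natts_requested; attnum++)
    {
        slot->values[attnum] = 0;
        slot->isnull[attnum] = true;
    }
}
```

Deforming a two-column row against a four-column descriptor leaves the first two values intact and reports the last two as NULL, matching what the post-upgrade SELECT shows for pre-existing rows.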

These changes were tested on a new server with the v40 version patch +
'0001' patch.
1. Pre-upgrade state using v40 version patches
Simulated conflicts using a setup where the schema does not include
the new columns:
postgres=# select * from pg_conflict.pg_conflict_log_for_subid_16396 ;
....
(4 rows)

2. Upgrade using 'pg_upgrade'
The upgrade was performed on a cluster initialized with patches v40 +
'0001', and it completed successfully.
Post-upgrade verification:
postgres=# select conflict_type, v20_new_col1, v20_new_col2 from
pg_conflict.pg_conflict_log_for_subid_16396 ;
 conflict_type | v20_new_col1 | v20_new_col2
---------------+--------------+--------------
 insert_exists |              |
 insert_exists |              |
 insert_exists |              |
 insert_exists |              |
(4 rows)

Existing rows were preserved, and the newly added columns are visible
and populated with NULLs, as expected.

3. Post-upgrade conflict insertion
After starting the old publisher again to continue generating conflicts:
postgres=# select conflict_type, v20_new_col1, v20_new_col2 from
pg_conflict.pg_conflict_log_for_subid_16396 ;
 conflict_type | v20_new_col1 | v20_new_col2
---------------+--------------+--------------
 insert_exists |              |
 insert_exists |              |
 insert_exists |              |
 insert_exists |              |
 insert_exists | v20_new_col1 | v20_new_col2
 insert_exists | v20_new_col1 | v20_new_col2
 insert_exists | v20_new_col1 | v20_new_col2
(7 rows)

New conflicts are inserted successfully, and the newly added columns
are correctly populated for new entries.

Based on this testing, the current 'pg_upgrade' framework, along with
the additional dump-time adjustments, appears sufficient to support
schema evolution of conflict log tables, specifically for adding new
columns in future releases.

Thoughts?

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-28T23:41:34Z

On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Rest of the comments were fixed.
> > The attached v37 version patch has the changes for the same. Also
> > Peter's comments on the documentation patch from [1] and Shveta's
> > comments from [2] are addressed in the attached patch.
> >
>
> Here are few comments based on v37 testing:
>
> 1) Should we consider using TOAST tables for tuple-data columns like
> remote_tuple and local_conflicts (the JSON columns)?
> This may be a corner case, but if the tuple data becomes too large to
> fit into an 8KB heap tuple, then the apply worker keeps failing while
> inserting into the CLT with errors like:
>
>   ERROR: row is too big: size 19496, maximum size 8160
>   LOG: background worker "logical replication apply worker" (PID
> 41226) exited with exit code 1
>

In the docs, it is mentioned: "column_value is the column value. The
large column values are truncated to 64 bytes." [1], so I wonder, if
we follow this, why do we need toast entries? Did you try any case
where you get the above ERROR?

> Noticed that even disable_on_error=true does not disable the
> subscription in this case.
>

Hmm, I think we need to have a documented reason if such a case
doesn't disable the subscription with the disable_on_error as true?


[1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T09:22:58Z

On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > >
> > > > Thanks for the comments, the attached v39 version patch has the
> > > > changes for the same.
> > > >
> > >
> > > I have not yet looked at v40, but please find a few comments on
> > > v39-0001 and 0002 merged together.
> > > 4)
> > > Do we need to have CommandCounterIncrement() after
> > > heap_create_with_catalog() in create_conflict_log_table()? I think
> > > even if we are not doing any table_open etc for CLT in same
> > > transaction, we should call CommandCounterIncrement() (to be
> > > consistent with other such calls of heap_create_with_catalog and to
> > > make it future proof). Thoughts?
> >
> > I felt this is not required as we are not doing a table open on the
> > newly created table.
> >
>
> Okay, command counter increment would be required here if we further
> access that relation in the same command.

I think CommandCounterIncrement() is called wherever we need to open
the relation in the same command.  In this particular case we do not
need to open the conflict log table, so we do not need to call
CommandCounterIncrement() (CCI).

> I think I am facing a
> related problem w.r.t newly created subscription. After applying first
> six patches, the create subscription fails as follows:
> postgres=# create subscription sub1 connection 'dbname=postgres'
> publication pub1 with (conflict_log_destination='all');
> ERROR:  dependent subscription was concurrently dropped
>
> I debugged and found that we get the above ERROR when we are trying to
> find the subscription which is not yet created. In this case, it seems
> to be happening because we are using a subscription that is yet not
> created for dependency recording. This raises a question as to why are
> we creating the conflict_log_table before subscription, at least this
> needs some comments.

This error occurs because in the commit below [1], we disallowed
recording a dependency on an object that does not exist. Therefore, we
now need to record the dependency after the subscription is created.
And we create CLT before so that we can add the conflict log relid in
pg_subscription without an additional update, I will add a comment
explaining this.


[1]
commit 2fbb21170e9053720c2c374b21eb650a22b8aaea
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   Wed May 27 18:35:58 2026 +0300
    Avoid orphaned objects dependencies


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-29T18:47:49Z

On Fri, May 29, 2026 at 2:23 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>   I think I am facing a
> > related problem w.r.t newly created subscription. After applying first
> > six patches, the create subscription fails as follows:
> > postgres=# create subscription sub1 connection 'dbname=postgres'
> > publication pub1 with (conflict_log_destination='all');
> > ERROR:  dependent subscription was concurrently dropped
> >
> > I debugged and found that we get the above ERROR when we are trying to
> > find the subscription which is not yet created. In this case, it seems
> > to be happening because we are using a subscription that is yet not
> > created for dependency recording. This raises a question as to why are
> > we creating the conflict_log_table before subscription, at least this
> > needs some comments.
>
> This error occurs because in the commit below [1], we disallowed
> recording a dependency on an object that does not exist. Therefore, we
> now need to record the dependency after the subscription is created.
>

But don't we normally create dependency immediately after creating the
object? Do you see such examples at other places in the code?

> And we create CLT before so that we can add the conflict log relid in
> pg_subscription without an additional update,
>

But will this additional update matter to an extent in DDL execution
that we don't follow our usual way to record dependency? I feel unless
we follow similar coding pattern at other places, it is better to
create the CLT after subscription.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T21:54:39Z

On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Rest of the comments were fixed.
> > > The attached v37 version patch has the changes for the same. Also
> > > Peter's comments on the documentation patch from [1] and Shveta's
> > > comments from [2] are addressed in the attached patch.
> > >
> >
> > Here are few comments based on v37 testing:
> >
> > 1) Should we consider using TOAST tables for tuple-data columns like
> > remote_tuple and local_conflicts (the JSON columns)?
> > This may be a corner case, but if the tuple data becomes too large to
> > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > inserting into the CLT with errors like:
> >
> >   ERROR: row is too big: size 19496, maximum size 8160
> >   LOG: background worker "logical replication apply worker" (PID
> > 41226) exited with exit code 1
> >
>
> In the docs, it is mentioned: "column_value is the column value. The
> large column values are truncated to 64 bytes." [1], so I wonder, if
> > we follow this why we need toast entries? Did you try any case where
> you are getting above ERROR?

But in this case we are talking about the JSON column of the CLT which
might contain a full local tuple or even multiple local tuples if a
remote tuple conflicts with multiple local rows.  So, IMHO, we need a
toast table. Nisha, have you already tested the scenario? If yes, can
you share your test case?


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T22:06:59Z

On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Rest of the comments were fixed.
> > > > The attached v37 version patch has the changes for the same. Also
> > > > Peter's comments on the documentation patch from [1] and Shveta's
> > > > comments from [2] are addressed in the attached patch.
> > > >
> > >
> > > Here are few comments based on v37 testing:
> > >
> > > 1) Should we consider using TOAST tables for tuple-data columns like
> > > remote_tuple and local_conflicts (the JSON columns)?
> > > This may be a corner case, but if the tuple data becomes too large to
> > > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > > inserting into the CLT with errors like:
> > >
> > >   ERROR: row is too big: size 19496, maximum size 8160
> > >   LOG: background worker "logical replication apply worker" (PID
> > > 41226) exited with exit code 1
> > >
> >
> > In the docs, it is mentioned: "column_value is the column value. The
> > large column values are truncated to 64 bytes." [1], so I wonder, if
> > > we follow this why we need toast entries? Did you try any case where
> > you are getting above ERROR?
>
> But in this case we are talking about the JSON column of the CLT which
> might contain a full local tuple or even multiple local tuples if a
> remote tuple conflicts with multiple local rows.  So, IMHO, we need a
> toast table. Nisha, have you already tested the scenario? If yes, can
> you share your test case?

After giving this more thought, I think that instead of executing a
three-step process (inserting the pg_subscription tuple, creating the
table with its dependency, and then going back to update the tuple with
the new relation ID), it is much cleaner to do it linearly: create the
conflict log table first to get its OID, insert the subscription tuple
pre-populated with that OID, and then record the dependency. This
achieves the exact same state in a single direct sequence without the
redundant catalog update within the same command. I agree that with
this approach we would have to keep the dependency-recording code in
the CreateSubscription and AlterSubscription functions, but in those
functions we are already recording subscription dependencies on other
objects, so wouldn't it be more natural to add this dependency at the
same place as well?

Anyway I am ready to change that if we have strong opinion against
this approach.

Here is the updated patch and changes are
1. 0003 and 0004 are merged on 0001
2. Merged Amit's v41_amit_1.patch.txt to 0002
3. Fix the dependency order issue (i.e. create dependency after
inserting subscription tuple) and merged in 0002

Open Items:
1. Need to create toast table for CLT after testing with larger JSON row
2. Fix review comments of Shveta on 0004 and 0005
3. Rebase Vignesh's patch
"v41-0007-Preserve-conflict-log-destination-and-subscripti". I think we
can do that once we have consensus on whether to create the conflict log
table first or insert the subscription row first, because depending on
that decision we would have to rebase this patch again.
4. Once "v41-0007-Preserve-conflict-log-destination-and-subscripti" is
rebased after the dependency order consensus, I will rebase Vignesh's
doc patch and \dRs+ change patch.

-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-30T00:31:05Z

On Sat, May 30, 2026 at 3:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > >
> > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > Rest of the comments were fixed.
> > > > > The attached v37 version patch has the changes for the same. Also
> > > > > Peter's comments on the documentation patch from [1] and Shveta's
> > > > > comments from [2] are addressed in the attached patch.
> > > > >
> > > >
> > > > Here are few comments based on v37 testing:
> > > >
> > > > 1) Should we consider using TOAST tables for tuple-data columns like
> > > > remote_tuple and local_conflicts (the JSON columns)?
> > > > This may be a corner case, but if the tuple data becomes too large to
> > > > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > > > inserting into the CLT with errors like:
> > > >
> > > >   ERROR: row is too big: size 19496, maximum size 8160
> > > >   LOG: background worker "logical replication apply worker" (PID
> > > > 41226) exited with exit code 1
> > > >
> > >
> > > In the docs, it is mentioned: "column_value is the column value. The
> > > large column values are truncated to 64 bytes." [1], so I wonder, if
> > > > we follow this why we need toast entries? Did you try any case where
> > > you are getting above ERROR?
> >
> > But in this case we are talking about the JSON column of the CLT which
> > might contain a full local tuple or even multiple local tuples if a
> > remote tuple conflicts with multiple local rows.  So, IMHO, we need a
> > toast table. Nisha, have you already tested the scenario? If yes, can
> > you share your test case?
>
> After putting more thought, I think instead of executing a three-step
> process i.e. inserting the pg_subscription tuple, creating the table
> with its dependency, and then going back to update the tuple with the
> new relation ID, it is much cleaner to do it linearly, i.e. we should
> create the conflict log table first to get its OID, insert the
> subscription tuple pre-populated with that ID, and then record the
> dependency. This achieves the exact same state in a single direct
> sequence without the redundant catalog update within the same command.
> I agree with that code we would have to keep the record dependency
> code in CreateSubscription and AlterSubscription functions, but after
> putting more thought I think in those functions we are already
> recording subscription dependencies with other object so wouldn't it
> be more natural to add this dependency as well at the same place?
>
> Anyway I am ready to change that if we have strong opinion against
> this approach.
>
> Here is the updated patch and changes are
> 1. 0003 and 0004 are merged on 0001
> 2. Merged Amit's v41_amit_1.patch.txt to 0002
> 3. Fix the dependency order issue (i.e. create dependency after
> inserting subscription tuple) and merged in 0002
>
> Open Items:
> 1. Need to create toast table for CLT after testing with larger JSON row
> 2. Fixed review comments of Shveta on 0004 and 0005
> 3.  Rebase Vignesh's patch of
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we
> can do that once we have consensus on whether to create conflict log
> table first or insert the subscription row first as based on this
> change we would have to rebase this patch again.
> 4. Once we rebase
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" after
> dependency order consensus I would rebase doc patch and \dRs+ change
> patch of Vignesh.

Here is a top-up patch that creates the conflict log table after
inserting the subscription tuple and then updates the tuple with the
CLT relid.

The main changes look like this [1]:

[1]
/*
 * If logging to a table is required, physically create it now. We create
 * the conflict log table here. Also update the pg_subscription row
 * after creating the conflict log table with its reloid.
 */
if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
{
    bool        replaces[Natts_pg_subscription];
    Oid         logrelid =
        create_conflict_log_table(subid, stmt->subname, owner);

    /* Form a new tuple. */
    memset(values, 0, sizeof(values));
    memset(nulls, false, sizeof(nulls));
    memset(replaces, false, sizeof(replaces));

    values[Anum_pg_subscription_subconflictlogrelid - 1] =
        ObjectIdGetDatum(logrelid);
    replaces[Anum_pg_subscription_subconflictlogrelid - 1] =
        true;

    /* Make subscription tuple visible before updating it. */
    CommandCounterIncrement();

    tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
                            replaces);

    CatalogTupleUpdate(rel, &tup->t_self, tup);
}


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Nisha Moond <nisha.moond412@gmail.com> — 2026-05-30T02:49:15Z

On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > >
> > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Rest of the comments were fixed.
> > > > The attached v37 version patch has the changes for the same. Also
> > > > Peter's comments on the documentation patch from [1] and Shveta's
> > > > comments from [2] are addressed in the attached patch.
> > > >
> > >
> > > Here are few comments based on v37 testing:
> > >
> > > 1) Should we consider using TOAST tables for tuple-data columns like
> > > remote_tuple and local_conflicts (the JSON columns)?
> > > This may be a corner case, but if the tuple data becomes too large to
> > > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > > inserting into the CLT with errors like:
> > >
> > >   ERROR: row is too big: size 19496, maximum size 8160
> > >   LOG: background worker "logical replication apply worker" (PID
> > > 41226) exited with exit code 1
> > >
> >
> > In the docs, it is mentioned: "column_value is the column value. The
> > large column values are truncated to 64 bytes." [1], so I wonder, if
> > > we follow this why we need toast entries? Did you try any case where
> > you are getting above ERROR?
>
> But in this case we are talking about the JSON column of the CLT which
> might contain a full local tuple or even multiple local tuples if a
> remote tuple conflicts with multiple local rows.  So, IMHO, we need a
> toast table. Nisha, have you already tested the scenario? If yes, can
> you share your test case?
>

Hi Dilip, Amit,
Yes, I tested the scenario. Used below steps to reproduce the error:

#Publisher:
  CREATE TABLE fat2 (id int PRIMARY KEY, col1 text, col2 text);
  INSERT INTO fat2 VALUES (
      1,
      (SELECT string_agg(md5(i::text), '') FROM generate_series(1, 200) i),
      (SELECT string_agg(md5(i::text), '') FROM generate_series(201, 400) i)
  );
  ALTER TABLE fat2 REPLICA IDENTITY FULL;
  CREATE PUBLICATION p3 FOR TABLE fat2;

#Subscriber
 -- create subscription s3 for publication p3 with conflict log table
(after table syncs):
 -- modifying the row locally
  UPDATE fat2 SET col1 = (SELECT string_agg(md5(i::text), '') FROM
generate_series(601, 800) i) WHERE id = 1;

 #Publisher (triggers the conflict):
  UPDATE fat2 SET col1 = (SELECT string_agg(md5(i::text), '') FROM
generate_series(801, 1000) i) WHERE id = 1;

Above should cause the reported failure.
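
For a rough sense of the sizes involved, here is a small arithmetic sketch. It relies on md5() returning 32 hexadecimal characters and on each generate_series range above covering 200 values; the helper names are hypothetical. It counts only raw, uncompressed column text and ignores tuple headers, JSON keys and quoting, so it is a lower bound on the payload and not a reconstruction of the 19496 figure in the reported error.

```c
#include <assert.h>
#include <stddef.h>

#define MD5_HEX_LEN       32    /* md5() yields 32 hex characters */
#define REPORTED_MAX_ROW  8160  /* "maximum size" from the quoted error */

/*
 * Length of string_agg(md5(i::text), '') over 'n' input rows: n hashes
 * joined with an empty separator.
 */
size_t
agg_md5_len(size_t n)
{
    return n * MD5_HEX_LEN;
}

/* Raw text bytes in one tuple image that has 'ncols' such columns. */
size_t
tuple_text_bytes(size_t ncols, size_t n)
{
    return ncols * agg_md5_len(n);
}
```

Each text column holds 200 * 32 = 6400 characters, so a single tuple image with col1 and col2 already carries 12800 bytes of text, which is above the 8160-byte limit before any second tuple image is added to the same conflict log row.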

--
Thanks,
Nisha

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-30T08:12:27Z

On Sat, May 30, 2026 at 6:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, May 30, 2026 at 3:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> > > > >
> > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > Rest of the comments were fixed.
> > > > > > The attached v37 version patch has the changes for the same. Also
> > > > > > Peter's comments on the documentation patch from [1] and Shveta's
> > > > > > comments from [2] are addressed in the attached patch.
> > > > > >
> > > > >
> > > > > Here are few comments based on v37 testing:
> > > > >
> > > > > 1) Should we consider using TOAST tables for tuple-data columns like
> > > > > remote_tuple and local_conflicts (the JSON columns)?
> > > > > This may be a corner case, but if the tuple data becomes too large to
> > > > > fit into an 8KB heap tuple, then the apply worker keeps failing while
> > > > > inserting into the CLT with errors like:
> > > > >
> > > > >   ERROR: row is too big: size 19496, maximum size 8160
> > > > >   LOG: background worker "logical replication apply worker" (PID
> > > > > 41226) exited with exit code 1
> > > > >
> > > >
> > > > In the docs, it is mentioned: "column_value is the column value. The
> > > > large column values are truncated to 64 bytes." [1], so I wonder, if
> > > > > we follow this why we need toast entries? Did you try any case where
> > > > you are getting above ERROR?
> > >
> > > But in this case we are talking about the JSON column of the CLT which
> > > might contain a full local tuple or even multiple local tuples if a
> > > remote tuple conflicts with multiple local rows.  So, IMHO, we need a
> > > toast table. Nisha, have you already tested the scenario? If yes, can
> > > you share your test case?
> >
> > After putting more thought, I think instead of executing a three-step
> > process i.e. inserting the pg_subscription tuple, creating the table
> > with its dependency, and then going back to update the tuple with the
> > new relation ID, it is much cleaner to do it linearly, i.e. we should
> > create the conflict log table first to get its OID, insert the
> > subscription tuple pre-populated with that ID, and then record the
> > dependency. This achieves the exact same state in a single direct
> > sequence without the redundant catalog update within the same command.
> > I agree with that code we would have to keep the record dependency
> > code in CreateSubscription and AlterSubscription functions, but after
> > putting more thought I think in thoese function we are already
> > recording subscription dependencies with other object so wouldn't it
> > be more natural to add this depednecy as well at the same place?
> >
> > Anyway I am ready to change that if we have strong opinion against
> > this approach.
> >
> > Here is the updated patch and changes are
> > 1. 0003 and 0004 are merged on 0001
> > 2. Merged Amit's v41_amit_1.patch.txt to 0002
> > 3. Fix the dependency order issue (i.e. create dependency after
> > inserting subscription tuple) and merged in 0002
> >
> > Open Items:
> > 1. Need to create toast table for CLT after testing with larger JSON row
> > 2. Fixed review comments of Shveta on 0004 and 0005
> > 3.  Rebase Vignesh's patch of
> > "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we
> > can do that once we have concensus on whether to create conflict log
> > table first or insert the subscription row first as based on this
> > change we would have to rebase this patch again.
> > 4. Once we rebase
> > "v41-0007-Preserve-conflict-log-destination-and-subscripti" after
> > dependency order consensus I would rebase doc patch and \dRs+ change
> > patch of Vignesh.
>
> Here is a topup patch so create conflict log table after inserting
> subscription tuple and then update the tuple with clt relid..
>
> Main changes will look like this[1]
>
> [1]
> /*
> * If logging to a table is required, physically create it now. We create
> * the conflict log table here. Also update the pg_subscription row
> * after creating the conflict log table with its reloid.
> */
> if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
> {
> bool replaces[Natts_pg_subscription];
> Oid logrelid =
> create_conflict_log_table(subid, stmt->subname, owner);
>
> /* Form a new tuple. */
> memset(values, 0, sizeof(values));
> memset(nulls, false, sizeof(nulls));
> memset(replaces, false, sizeof(replaces));
>
> values[Anum_pg_subscription_subconflictlogrelid - 1] =
> ObjectIdGetDatum(logrelid);
> replaces[Anum_pg_subscription_subconflictlogrelid - 1] =
> true;
>
> /* Make subscription tuple visible before updating it. */
> CommandCounterIncrement();
>
> tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> replaces);
>
> CatalogTupleUpdate(rel, &tup->t_self, tup);
> }
>

In the latest patch set I have fixed Nisha's comments by creating a toast
table. A separate patch
(v43-0005-Create-conflict-log-table-after-inserting-subscr.patch)
is attached for creating the conflict log table after inserting the
subscription row.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-30T18:21:10Z

On Fri, May 29, 2026 at 3:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> After putting more thought, I think instead of executing a three-step
> process i.e. inserting the pg_subscription tuple, creating the table
> with its dependency, and then going back to update the tuple with the
> new relation ID, it is much cleaner to do it linearly, i.e. we should
> create the conflict log table first to get its OID, insert the
> subscription tuple pre-populated with that ID, and then record the
> dependency. This achieves the exact same state in a single direct
> sequence without the redundant catalog update within the same command.
> I agree with that code we would have to keep the record dependency
> code in CreateSubscription and AlterSubscription functions, but after
> putting more thought I think in thoese function we are already
> recording subscription dependencies with other object so wouldn't it
> be more natural to add this depednecy as well at the same place?
>

It makes sense to me, and anyway for serverid we also create the
dependency after creating the subscription, so your solution looks good
to me. One minor suggestion related to this change:

+ if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
+ {
+ ObjectAddress clt;
+
+ ObjectAddressSet(clt, RelationRelationId, logrelid);
+ recordDependencyOn(&clt, &myself, DEPENDENCY_INTERNAL);
+ }

Let's rename clt to cltaddr or cltobj to make it consistent with the
naming at other similar places in the code. Change this at both places
where we use this code.

> Anyway I am ready to change that if we have strong opinion against
> this approach.
>
> Here is the updated patch and changes are
> 1. 0003 and 0004 are merged on 0001
> 2. Merged Amit's v41_amit_1.patch.txt to 0002
> 3. Fix the dependency order issue (i.e. create dependency after
> inserting subscription tuple) and merged in 0002
>
> Open Items:
> 1. Need to create toast table for CLT after testing with larger JSON row
> 2. Fixed review comments of Shveta on 0004 and 0005
> 3.  Rebase Vignesh's patch of
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we
> can do that once we have concensus on whether to create conflict log
> table first or insert the subscription row first as based on this
> change we would have to rebase this patch again.
> 4. Once we rebase
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" after
> dependency order consensus I would rebase doc patch and \dRs+ change
> patch of Vignesh.
>

I see that my second comment in email [1] and another comment in email
[2] are still not answered, nor are they listed in the open items.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BzdaLF7%3DAVKd8xNGTuvPvn8BYSxHfnLZd7whWZ%2Bv3B-Q%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1K6tVUmKY-yqKgTX00yrSVAdSZN4Ao761JEXdtQkAYT4g%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:02:51Z

On Thu, May 28, 2026 at 4:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 26, 2026 at 2:38 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > 2)
> > drop_subscription_dependencies():
> >
> > + conflictrelname = get_rel_name(subconflictlogrelid);
> >
> > We can actually have a sanity check that we got the CLT using the relid.
> > Assert(conflictrelname != NULL);
> >
>
> elog will suit this place better as this can't be a direct coding
> mistake. I see that at other places we used elog. See
> if (result == NULL)
> elog(ERROR, "cache lookup failed for index %u", indexId);

Yes, it makes sense to report this with elog; I will change this.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:12:54Z

On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > >
> > > > Thanks for the comments, the attached v39 version patch has the
> > > > changes for the same.
> > > >
> > >
> > > I have not yet looked at v40, but please find a few ocmments on
> > > v39-0001 and 0002 merged together.
> > > 4)
> > > Do we need to have CommandCounterIncrement() after
> > > heap_create_with_catalog() in create_conflict_log_table()? I think
> > > even if we are not doing any table_open etc for CLT in same
> > > transaction, we should call CommandCounterIncrement() (to be
> > > consistent with other such calls of heap_create_with_catalog and to
> > > make it future proof). Thoughts?
> >
> > I felt this is not required as we are not doing a table open on the
> > newly created table.
> >
>
> Okay, command counter increment would be required here if we further
> access that relation in the same command.  I think I am facing a
> related problem w.r.t newly created subscription. After applying first
> six patches, the create subscription fails as follows:
> postgres=# create subscription sub1 connection 'dbname=postgres'
> publication pub1 with (conflict_log_destination='all');
> ERROR:  dependent subscription was concurrently dropped
>
> I debugged and found that we get the above ERROR when we are trying to
> find the subscription which is not yet created. In this case, it seems
> to be happening because we are using a subscription that is yet not
> created for dependency recording. This raises a question as to why are
> we creating the conflict_log_table before subscription, at least this
> needs some comments.
>
> *
> + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
> + {
> + if (IsConflictLogTableClass(classForm))
> + {
> + /*
> + * For conflict log tables, allow non-superusers to perform
> + * DELETE and TRUNCATE for cleanup and maintenance. Also allow
> + * INSERT and UPDATE to pass ACL checks so that later checks
> + * can raise the dedicated "cannot modify or insert data into
> + * conflict log table" error instead of a generic permission
> + * denied error. Still restrict USAGE for non-superusers.
> + */
> + mask &= ~(ACL_USAGE);
>
> I see the point of giving a specific error instead of a generic error
> but this functionality is used by pg_class_aclmask() which is an
> exposed function. If we go with your proposed change, isn't there a
> risk that some extension or outside core-code using pg_class_aclmask()
> won't invoke that later functionality (CheckValidResultRel())? If we
> decide to go this way then we can change this comment as proposed in
> the attached?

I do not understand this change. My original patch 0001 has the
following, which means we only allow ACL_TRUNCATE and ACL_DELETE for the
conflict log table. What's the reason for changing this in 0002?

  if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
ACL_USAGE)) &&
- IsSystemClass(table_oid, classForm) &&
- classForm->relkind != RELKIND_VIEW &&
+ IsConflictClass(classForm) &&
  !superuser_arg(roleid))
- mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
+ else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
ACL_TRUNCATE | ACL_USAGE)) &&
+ IsSystemClass(table_oid, classForm) &&
+ classForm->relkind != RELKIND_VIEW &&
+ !superuser_arg(roleid))
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
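
To make the difference between the two masks concrete, here is a minimal
standalone sketch, not PostgreSQL code. The DEMO_* bit values, the
demo_mode type and both function names are hypothetical and chosen only
for illustration; the sketch also assumes that the superuser check applies
in the same way to both variants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical privilege bits; these are NOT the real ACL_* values. */
typedef uint32_t demo_mode;

#define DEMO_INSERT   (1u << 0)
#define DEMO_UPDATE   (1u << 1)
#define DEMO_DELETE   (1u << 2)
#define DEMO_TRUNCATE (1u << 3)
#define DEMO_USAGE    (1u << 4)

/*
 * Variant as in 0001: for a conflict log table and a non-superuser,
 * strip INSERT, UPDATE and USAGE, leaving DELETE and TRUNCATE intact.
 */
demo_mode
demo_mask_0001(demo_mode mask, bool is_conflict_log_table, bool is_superuser)
{
    if (is_conflict_log_table && !is_superuser)
        mask &= ~(DEMO_INSERT | DEMO_UPDATE | DEMO_USAGE);
    return mask;
}

/*
 * Variant as in 0002: strip only USAGE, so INSERT and UPDATE pass the
 * ACL check and a later check can raise the dedicated error instead.
 */
demo_mode
demo_mask_0002(demo_mode mask, bool is_conflict_log_table, bool is_superuser)
{
    if (is_conflict_log_table && !is_superuser)
        mask &= ~DEMO_USAGE;
    return mask;
}
```

With a requested mask of DEMO_INSERT | DEMO_DELETE for a non-superuser,
the first variant leaves only DEMO_DELETE, while the second leaves both
bits set and relies on a later check to reject the INSERT.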



-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:57:48Z

On Wed, May 27, 2026 at 4:08 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I have not yet looked at v41. Here are the comments for v40
>
> 0003 and 0004: No comments.
>
> 0004 and 0005:
>
>
> 1)
> In build_local_conflicts_json_array(), we have these:
>
> + json_datum = heap_copy_tuple_as_datum(tuple, tupdesc);
> +
> + /*
> + * Build the higher level JSON datum in format described in function
> + * header.
> + */
> + json_datum = DirectFunctionCall1(row_to_json, json_datum);
>
> We have first allocation to 'json_datum' via
> heap_copy_tuple_as_datum() and then second via row_to_json() call. So
> we are overwriting first allocation. Which memory context are we using
> here for this allocation? IIUC, if the conflict is non-error one, we
> may accumulate these memory chunks in long running worker loop which
> may gradually bloat the memory. Let me know if my undertstanding is
> wrong.
>
> Same situation in tuple_table_slot_to_indextup_json and
> tuple_table_slot_to_json_datum as well.

IIUC, all of this memory will be allocated under ApplyMessageContext,
which is temporary and gets reset for every logical message. So I think
that context exists precisely for temporary allocations made while
processing a message, and it is reset once the message has been
processed.
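
The per-message reset pattern can be illustrated with a minimal standalone
sketch. This is a toy bump allocator, not the PostgreSQL MemoryContext
API; DemoContext, demo_alloc, demo_reset and demo_process_message are
hypothetical names, and the allocation sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARENA_SIZE 4096

/* Toy stand-in for a short-lived, per-message memory context. */
typedef struct DemoContext
{
    size_t        used;                  /* bytes handed out so far */
    unsigned char buf[DEMO_ARENA_SIZE];  /* backing storage */
} DemoContext;

/*
 * Hand out 'size' bytes from the arena. Alignment is ignored because
 * this sketch never dereferences the returned pointers.
 */
void *
demo_alloc(DemoContext *cxt, size_t size)
{
    void *p;

    assert(size <= DEMO_ARENA_SIZE - cxt->used);
    p = cxt->buf + cxt->used;
    cxt->used += size;
    return p;
}

/* Release everything allocated in the context in one step. */
void
demo_reset(DemoContext *cxt)
{
    cxt->used = 0;
}

/*
 * Process one "message": the second allocation overwrites the only
 * pointer to the first one, yet nothing accumulates across messages
 * because the whole context is reset at the end. Returns the peak
 * usage seen while processing this message.
 */
size_t
demo_process_message(DemoContext *cxt)
{
    void   *datum;
    size_t  peak;

    datum = demo_alloc(cxt, 128);   /* e.g. a copied tuple */
    datum = demo_alloc(cxt, 256);   /* e.g. its JSON form; first pointer is lost */
    (void) datum;

    peak = cxt->used;
    demo_reset(cxt);
    return peak;
}
```

Calling demo_process_message() in a long loop never exhausts the 4096-byte
arena, since usage returns to zero after each message even though the
first chunk was never freed individually.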

> 2)
> Same in ReportApplyConflict(), if elevel is not ERROR, should we worry
> about freeing 'err_detail' after error-reporting or does some
> short-lived context handle it?

Same is true for this as well.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Amit Kapila <amit.kapila16@gmail.com> — 2026-05-31T11:54:08Z

On Sat, May 30, 2026 at 1:12 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

Few comments on 0001 and 0002
===========================
1.
+ Oid         subconflictlogrelid; /* Relid of the conflict log table. */
 #ifdef CATALOG_VARLEN /* variable-length fields start here */
+ /*
+ * Strategy for logging replication conflicts:
+ * 'log' - server log only,
+ * 'table' - conflict log table only,
+ * 'all' - both log and table.
+ */
+ text subconflictlogdest BKI_FORCE_NOT_NULL;

'log' sounds redundant in the above two field names. I feel naming
them as subconflictrelid and subconflictdest should be sufficient.

2. If you agree with the above, then let's make similar changes at
other places in the patch. We can change
alter_sub_conflictlogdestination to alter_sub_conflict_destination.
Also, similar to AlterSubscription_refresh and
AlterSubscription_refresh_seq, we can name this new function as
AlterSubscription_conflict_dest.

3. Now, let's consider whether we should change the option name from
conflict_log_destination to conflict_data_destination. The reason I am
asking to consider this change is that one of the option values is
'log', so it sounds a bit odd to name the option
conflict_log_destination. If we change this, then we can consider
changing the name of the enum ConflictLogDest as well.

Apart from the above, I have made some changes in the attached. Kindly
review and see which of them can be incorporated in the next version.

-- 
With Regards,
Amit Kapila.

Re: Proposal: Conflict log history table for Logical Replication

vignesh C <vignesh21@gmail.com> — 2026-05-31T12:08:05Z

On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Thanks for the comments, the attached v39 version patch has the
> > > > > changes for the same.
> > > > >
> > > >
> > > > I have not yet looked at v40, but please find a few ocmments on
> > > > v39-0001 and 0002 merged together.
> > > > 4)
> > > > Do we need to have CommandCounterIncrement() after
> > > > heap_create_with_catalog() in create_conflict_log_table()? I think
> > > > even if we are not doing any table_open etc for CLT in same
> > > > transaction, we should call CommandCounterIncrement() (to be
> > > > consistent with other such calls of heap_create_with_catalog and to
> > > > make it future proof). Thoughts?
> > >
> > > I felt this is not required as we are not doing a table open on the
> > > newly created table.
> > >
> >
> > Okay, command counter increment would be required here if we further
> > access that relation in the same command.  I think I am facing a
> > related problem w.r.t newly created subscription. After applying first
> > six patches, the create subscription fails as follows:
> > postgres=# create subscription sub1 connection 'dbname=postgres'
> > publication pub1 with (conflict_log_destination='all');
> > ERROR:  dependent subscription was concurrently dropped
> >
> > I debugged and found that we get the above ERROR when we are trying to
> > find the subscription which is not yet created. In this case, it seems
> > to be happening because we are using a subscription that is yet not
> > created for dependency recording. This raises a question as to why are
> > we creating the conflict_log_table before subscription, at least this
> > needs some comments.
> >
> > *
> > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
> > + {
> > + if (IsConflictLogTableClass(classForm))
> > + {
> > + /*
> > + * For conflict log tables, allow non-superusers to perform
> > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow
> > + * INSERT and UPDATE to pass ACL checks so that later checks
> > + * can raise the dedicated "cannot modify or insert data into
> > + * conflict log table" error instead of a generic permission
> > + * denied error. Still restrict USAGE for non-superusers.
> > + */
> > + mask &= ~(ACL_USAGE);
> >
> > I see the point of giving a specific error instead of a generic error
> > but this functionality is used by pg_class_aclmask() which is an
> > exposed function. If we go with your proposed change, isn't there a
> > risk that some extension or outside core-code using pg_class_aclmask()
> > won't invoke that later functionality (CheckValidResultRel())? If we
> > decide to go this way then we can change this comment as proposed in
> > the attached?
>
> I do not understand this change; my original patch 0001 has like this,
> that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for
> conflict log table, whats the reason for changing the same in 0002?
>
>   if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
> ACL_USAGE)) &&
> - IsSystemClass(table_oid, classForm) &&
> - classForm->relkind != RELKIND_VIEW &&
> + IsConflictClass(classForm) &&
>   !superuser_arg(roleid))
> - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
> + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
> + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
> ACL_TRUNCATE | ACL_USAGE)) &&
> + IsSystemClass(table_oid, classForm) &&
> + classForm->relkind != RELKIND_VIEW &&
> + !superuser_arg(roleid))
> + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);

This was done to address Shveta's comment from [1]: to throw the
dedicated "cannot modify or insert data into conflict log table" error
instead of a generic permission denied error for the owner of the
conflict log table.
[1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com

Regards,
Vignesh

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T23:06:36Z

On Sun, May 31, 2026 at 5:38 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > Thanks for the comments, the attached v39 version patch has the
> > > > > > changes for the same.
> > > > > >
> > > > >
> > > > > I have not yet looked at v40, but please find a few ocmments on
> > > > > v39-0001 and 0002 merged together.
> > > > > 4)
> > > > > Do we need to have CommandCounterIncrement() after
> > > > > heap_create_with_catalog() in create_conflict_log_table()? I think
> > > > > even if we are not doing any table_open etc for CLT in same
> > > > > transaction, we should call CommandCounterIncrement() (to be
> > > > > consistent with other such calls of heap_create_with_catalog and to
> > > > > make it future proof). Thoughts?
> > > >
> > > > I felt this is not required as we are not doing a table open on the
> > > > newly created table.
> > > >
> > >
> > > Okay, command counter increment would be required here if we further
> > > access that relation in the same command.  I think I am facing a
> > > related problem w.r.t newly created subscription. After applying first
> > > six patches, the create subscription fails as follows:
> > > postgres=# create subscription sub1 connection 'dbname=postgres'
> > > publication pub1 with (conflict_log_destination='all');
> > > ERROR:  dependent subscription was concurrently dropped
> > >
> > > I debugged and found that we get the above ERROR when we are trying to
> > > find the subscription which is not yet created. In this case, it seems
> > > to be happening because we are using a subscription that is yet not
> > > created for dependency recording. This raises a question as to why are
> > > we creating the conflict_log_table before subscription, at least this
> > > needs some comments.
> > >
> > > *
> > > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
> > > + {
> > > + if (IsConflictLogTableClass(classForm))
> > > + {
> > > + /*
> > > + * For conflict log tables, allow non-superusers to perform
> > > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow
> > > + * INSERT and UPDATE to pass ACL checks so that later checks
> > > + * can raise the dedicated "cannot modify or insert data into
> > > + * conflict log table" error instead of a generic permission
> > > + * denied error. Still restrict USAGE for non-superusers.
> > > + */
> > > + mask &= ~(ACL_USAGE);
> > >
> > > I see the point of giving a specific error instead of a generic error
> > > but this functionality is used by pg_class_aclmask() which is an
> > > exposed function. If we go with your proposed change, isn't there a
> > > risk that some extension or outside core-code using pg_class_aclmask()
> > > won't invoke that later functionality (CheckValidResultRel())? If we
> > > decide to go this way then we can change this comment as proposed in
> > > the attached?
> >
> > I do not understand this change; my original patch 0001 has like this,
> > that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for
> > conflict log table, whats the reason for changing the same in 0002?
> >
> >   if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
> > ACL_USAGE)) &&
> > - IsSystemClass(table_oid, classForm) &&
> > - classForm->relkind != RELKIND_VIEW &&
> > + IsConflictClass(classForm) &&
> >   !superuser_arg(roleid))
> > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
> > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
> > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
> > ACL_TRUNCATE | ACL_USAGE)) &&
> > + IsSystemClass(table_oid, classForm) &&
> > + classForm->relkind != RELKIND_VIEW &&
> > + !superuser_arg(roleid))
> > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
>
> This was done to fix Shveta's comments from [1] to throw "cannot
> modify or insert data into conflict log table" instead of a generic
> permission denied error for the owner of the conflict log table.
> [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com

Thanks for pointing that out. I will analyze this behavior and give my opinion.


-- 
Regards,
Dilip Kumar
Google

Re: Proposal: Conflict log history table for Logical Replication

Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T23:23:47Z

On Mon, Jun 1, 2026 at 4:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, May 31, 2026 at 5:38 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > Thanks for the comments, the attached v39 version patch has the
> > > > > > > changes for the same.
> > > > > > >
> > > > > >
> > > > > > I have not yet looked at v40, but please find a few ocmments on
> > > > > > v39-0001 and 0002 merged together.
> > > > > > 4)
> > > > > > Do we need to have CommandCounterIncrement() after
> > > > > > heap_create_with_catalog() in create_conflict_log_table()? I think
> > > > > > even if we are not doing any table_open etc for CLT in same
> > > > > > transaction, we should call CommandCounterIncrement() (to be
> > > > > > consistent with other such calls of heap_create_with_catalog and to
> > > > > > make it future proof). Thoughts?
> > > > >
> > > > > I felt this is not required as we are not doing a table open on the
> > > > > newly created table.
> > > > >
> > > >
> > > > Okay, command counter increment would be required here if we further
> > > > access that relation in the same command.  I think I am facing a
> > > > related problem w.r.t newly created subscription. After applying first
> > > > six patches, the create subscription fails as follows:
> > > > postgres=# create subscription sub1 connection 'dbname=postgres'
> > > > publication pub1 with (conflict_log_destination='all');
> > > > ERROR:  dependent subscription was concurrently dropped
> > > >
> > > > I debugged and found that we get the above ERROR when we are trying to
> > > > find the subscription which is not yet created. In this case, it seems
> > > > to be happening because we are using a subscription that is yet not
> > > > created for dependency recording. This raises a question as to why are
> > > > we creating the conflict_log_table before subscription, at least this
> > > > needs some comments.
> > > >
> > > > *
> > > > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
> > > > + {
> > > > + if (IsConflictLogTableClass(classForm))
> > > > + {
> > > > + /*
> > > > + * For conflict log tables, allow non-superusers to perform
> > > > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow
> > > > + * INSERT and UPDATE to pass ACL checks so that later checks
> > > > + * can raise the dedicated "cannot modify or insert data into
> > > > + * conflict log table" error instead of a generic permission
> > > > + * denied error. Still restrict USAGE for non-superusers.
> > > > + */
> > > > + mask &= ~(ACL_USAGE);
> > > >
> > > > I see the point of giving a specific error instead of a generic error
> > > > but this functionality is used by pg_class_aclmask() which is an
> > > > exposed function. If we go with your proposed change, isn't there a
> > > > risk that some extension or outside core-code using pg_class_aclmask()
> > > > won't invoke that later functionality (CheckValidResultRel())? If we
> > > > decide to go this way then we can change this comment as proposed in
> > > > the attached?
> > >
> > > I do not understand this change; my original patch 0001 has like this,
> > > that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for
> > > conflict log table, whats the reason for changing the same in 0002?
> > >
> > >   if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE |
> > > ACL_USAGE)) &&
> > > - IsSystemClass(table_oid, classForm) &&
> > > - classForm->relkind != RELKIND_VIEW &&
> > > + IsConflictClass(classForm) &&
> > >   !superuser_arg(roleid))
> > > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
> > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
> > > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE |
> > > ACL_TRUNCATE | ACL_USAGE)) &&
> > > + IsSystemClass(table_oid, classForm) &&
> > > + classForm->relkind != RELKIND_VIEW &&
> > > + !superuser_arg(roleid))
> > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
> >
> > This was done to fix Shveta's comments from [1] to throw "cannot
> > modify or insert data into conflict log table" instead of a generic
> > permission denied error for the owner of the conflict log table.
> > [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com
>
> Thanks for pointing it, I will analyze this behavior and give my opinion.

Thinking more about this, wouldn't the behaviour be the same as for a
pg_toast table? I mean, the superuser will get 'cannot change TOAST
relation "pg_toast_16404"', whereas the owner of the toast table will
get a permission denied error.

-- 
Regards,
Dilip Kumar
Google