Commits
-
Allow logical replication conflicts to be logged to a table.
- a5918fddf10d master landed
-
Avoid orphaned objects dependencies
- 2fbb21170e90 19 (unreleased) cited
-
Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-05T12:24:01Z
Currently we log conflicts to the server's log file and updates, this approach has limitations, 1) Difficult to query and analyze, parsing plain text log files for conflict details is inefficient. 2) Lack of structured data, key conflict attributes (table, operation, old/new data, LSN, etc.) are not readily available in a structured, queryable format. 3) Difficult for external monitoring tools or custom resolution scripts to consume conflict data directly.

This proposal aims to address these limitations by introducing a conflict log history table, providing a structured, and queryable record of all logical replication conflicts. This should be a configurable option whether to log into the conflict log history table, server logs or both.

This proposal has two main design questions:
===================================

1. How do we store conflicting tuples from different tables?
Using a JSON column to store the row data seems like the most flexible solution, as it can accommodate different table schemas.

2. Should this be a system table or a user table?
a) System Table: Storing this in a system catalog is simple, but catalogs aren't designed for ever-growing data. While pg_large_object is an exception, this is not what we generally do IMHO.
b) User Table: This offers more flexibility. We could allow a user to specify the table name during CREATE SUBSCRIPTION. Then we choose to either create the table internally or let the user create the table with a predefined schema.

A potential drawback is that a user might drop or alter the table. However, we could mitigate this risk by simply logging a WARNING if the table is configured but an insertion fails.

I am currently working on a POC patch for the same, but will post that once we have some thoughts on design choices.

Schema for the conflict log history table may look like this, although there is a room for discussion on this.

Note: I think these fields are self explanatory so I haven't explained them here.

conflict_log_table (
    logid SERIAL PRIMARY KEY,
    subid OID,
    schema_id OID,
    table_id OID,
    conflict_type TEXT NOT NULL,
    operation_type TEXT NOT NULL,
    replication_origin TEXT,
    remote_commit_ts TIMESTAMPTZ,
    local_commit_ts TIMESTAMPTZ,
    ri_key JSON,
    remote_tuple JSON,
    local_tuple JSON,
);

Credit: Thanks to Amit Kapila for discussing this offlist and providing some valuable suggestions.

--
Regards,
Dilip Kumar
Google
-
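The schema outline in the proposal could be written as runnable DDL roughly as follows. This is only a sketch under stated assumptions, not the eventual patch: the table and column names come from the proposal, the trailing comma is dropped so the statement parses, and the `orders` table, the one-day window, and both queries are hypothetical illustrations of why a structured table is easier to query than a text log.

```sql
-- Sketch of the proposed table; column list as given in the proposal.
CREATE TABLE conflict_log_table (
    logid              serial PRIMARY KEY,
    subid              oid,
    schema_id          oid,
    table_id           oid,
    conflict_type      text NOT NULL,
    operation_type     text NOT NULL,
    replication_origin text,
    remote_commit_ts   timestamptz,
    local_commit_ts    timestamptz,
    ri_key             json,
    remote_tuple       json,
    local_tuple        json
);

-- Hypothetical query: conflicts per table and type over the last day.
SELECT table_id::regclass AS relation,
       conflict_type,
       count(*)           AS conflicts
FROM conflict_log_table
WHERE remote_commit_ts >= now() - interval '1 day'
GROUP BY table_id, conflict_type
ORDER BY conflicts DESC;

-- Design question 1: rows of any table shape fit in one JSON column.
-- "orders" is a hypothetical replicated table used only for illustration.
CREATE TABLE orders (id int PRIMARY KEY, amount numeric);

SELECT row_to_json(o) FROM orders AS o;              -- row -> JSON

SELECT (json_populate_record(NULL::orders, remote_tuple)).*
FROM conflict_log_table
WHERE table_id = 'orders'::regclass;                 -- JSON -> row
```

`row_to_json` and `json_populate_record` are existing built-in functions; whether the patch would use them internally is not stated in the thread.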
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-08-07T06:55:04Z
On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Currently we log conflicts to the server's log file and updates, this
> approach has limitations, 1) Difficult to query and analyze, parsing
> plain text log files for conflict details is inefficient. 2) Lack of
> structured data, key conflict attributes (table, operation, old/new
> data, LSN, etc.) are not readily available in a structured, queryable
> format. 3) Difficult for external monitoring tools or custom
> resolution scripts to consume conflict data directly.
>
> This proposal aims to address these limitations by introducing a
> conflict log history table, providing a structured, and queryable
> record of all logical replication conflicts. This should be a
> configurable option whether to log into the conflict log history
> table, server logs or both.
>

+1 for the idea.

> This proposal has two main design questions:
> ===================================
>
> 1. How do we store conflicting tuples from different tables?
> Using a JSON column to store the row data seems like the most flexible
> solution, as it can accommodate different table schemas.

Yes, that is one option. I have not looked into details myself, but you can also explore 'anyarray' used in pg_statistics to store 'Column data values of the appropriate kind'.

> 2. Should this be a system table or a user table?
> a) System Table: Storing this in a system catalog is simple, but
> catalogs aren't designed for ever-growing data. While pg_large_object
> is an exception, this is not what we generally do IMHO.
> b) User Table: This offers more flexibility. We could allow a user to
> specify the table name during CREATE SUBSCRIPTION. Then we choose to
> either create the table internally or let the user create the table
> with a predefined schema.
>
> A potential drawback is that a user might drop or alter the table.
> However, we could mitigate this risk by simply logging a WARNING if
> the table is configured but an insertion fails.

I believe it makes more sense for this to be a catalog table rather than a user table. I wanted to check if we already have a large catalog table of this kind, and I think pg_statistic could be an example of a sizable catalog table. To get a rough idea of how size scales with data, I ran a quick experiment: I created 1000 tables, each with 2 JSON columns, 1 text column, and 2 integer columns. Then, I inserted 1000 rows into each table and ran ANALYZE to collect statistics. Here’s what I observed on a fresh database before and after:

Before:
pg_statistic row count: 412
Table size: ~256 kB

After:
pg_statistic row count: 6,412
Table size: ~5.3 MB

Although it isn’t an exact comparison, this gives us some insight into how the statistics catalog table size grows with the number of rows. It doesn’t seem excessively large with 6k rows, given the fact that pg_statistic itself is a complex table having many 'anyarray'-type columns.

That said, irrespective of what we decide, it would be ideal to offer users an option for automatic purging, perhaps via a retention period parameter like conflict_stats_retention_period (say default to 30 days), or a manual purge API such as purge_conflict_stats('older than date'). I wasn’t able to find any such purge mechanism for PostgreSQL stats tables, but Oracle does provide such purging options for some of their statistics tables (not related to conflicts), see [1], [2].
And to manage it better, it could be range partitioned on timestamp.

> I am currently working on a POC patch for the same, but will post that
> once we have some thoughts on design choices.
>
> Schema for the conflict log history table may look like this, although
> there is a room for discussion on this.
>
> Note: I think these fields are self explanatory so I haven't
> explained them here.
>
> conflict_log_table (
> logid SERIAL PRIMARY KEY,
> subid OID,
> schema_id OID,
> table_id OID,
> conflict_type TEXT NOT NULL,
> operation_type TEXT NOT NULL,

I feel operation_type is not needed when we already have conflict_type. The name of 'conflict_type' is enough to give us info on operation-type.

> replication_origin TEXT,
> remote_commit_ts TIMESTAMPTZ,
> local_commit_ts TIMESTAMPTZ,
> ri_key JSON,
> remote_tuple JSON,
> local_tuple JSON,
> );
>
> Credit: Thanks to Amit Kapila for discussing this offlist and
> providing some valuable suggestions.
>

[1] https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_STATS.html#GUID-8E6413D5-F827-4F57-9FAD-7EC56362A98C
[2] https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_STATS.html#GUID-A04AE1C0-5DE1-4AFC-91F8-D35D41DF98A2

thanks
Shveta
-
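The before/after measurement of pg_statistic described in this message could be reproduced with queries along these lines. This is a sketch with assumptions: the message does not say which size function was used, so `pg_total_relation_size` (heap plus indexes and TOAST) is one possible choice, and reading pg_statistic directly normally requires superuser privileges. The final statement is a hypothetical illustration of what a manual purge such as the suggested `purge_conflict_stats` might amount to; neither that function nor `conflict_stats_retention_period` exists in PostgreSQL, and `conflict_log_table` is the table name from the proposal.

```sql
-- Row count and on-disk size of the statistics catalog
-- (run before and after loading data and running ANALYZE).
SELECT count(*) AS row_count FROM pg_statistic;
SELECT pg_size_pretty(pg_total_relation_size('pg_statistic')) AS total_size;

-- Hypothetical manual purge with a 30-day retention window, assuming the
-- proposed conflict_log_table exists and remote_commit_ts is the age column.
DELETE FROM conflict_log_table
WHERE remote_commit_ts < now() - interval '30 days';
```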
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-08-07T08:13:40Z
On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > Currently we log conflicts to the server's log file and updates, this
> > approach has limitations, 1) Difficult to query and analyze, parsing
> > plain text log files for conflict details is inefficient. 2) Lack of
> > structured data, key conflict attributes (table, operation, old/new
> > data, LSN, etc.) are not readily available in a structured, queryable
> > format. 3) Difficult for external monitoring tools or custom
> > resolution scripts to consume conflict data directly.
> >
> > This proposal aims to address these limitations by introducing a
> > conflict log history table, providing a structured, and queryable
> > record of all logical replication conflicts. This should be a
> > configurable option whether to log into the conflict log history
> > table, server logs or both.
> >
>
> +1 for the idea.
>
> > This proposal has two main design questions:
> > ===================================
> >
> > 1. How do we store conflicting tuples from different tables?
> > Using a JSON column to store the row data seems like the most flexible
> > solution, as it can accommodate different table schemas.
>
> Yes, that is one option. I have not looked into details myself, but
> you can also explore 'anyarray' used in pg_statistics to store 'Column
> data values of the appropriate kind'.
>
> > 2. Should this be a system table or a user table?
> > a) System Table: Storing this in a system catalog is simple, but
> > catalogs aren't designed for ever-growing data. While pg_large_object
> > is an exception, this is not what we generally do IMHO.
> > b) User Table: This offers more flexibility. We could allow a user to
> > specify the table name during CREATE SUBSCRIPTION. Then we choose to
> > either create the table internally or let the user create the table
> > with a predefined schema.
> >
> > A potential drawback is that a user might drop or alter the table.
> > However, we could mitigate this risk by simply logging a WARNING if
> > the table is configured but an insertion fails.
>
> I believe it makes more sense for this to be a catalog table rather
> than a user table. I wanted to check if we already have a large
> catalog table of this kind, and I think pg_statistic could be an
> example of a sizable catalog table. To get a rough idea of how size
> scales with data, I ran a quick experiment: I created 1000 tables,
> each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> I inserted 1000 rows into each table and ran ANALYZE to collect
> statistics. Here’s what I observed on a fresh database before and
> after:
>
> Before:
> pg_statistic row count: 412
> Table size: ~256 kB
>
> After:
> pg_statistic row count: 6,412
> Table size: ~5.3 MB
>
> Although it isn’t an exact comparison, this gives us some insight into
> how the statistics catalog table size grows with the number of rows.
> It doesn’t seem excessively large with 6k rows, given the fact that
> pg_statistic itself is a complex table having many 'anyarray'-type
> columns.
>
> That said, irrespective of what we decide, it would be ideal to offer
> users an option for automatic purging, perhaps via a retention period
> parameter like conflict_stats_retention_period (say default to 30
> days), or a manual purge API such as purge_conflict_stats('older than
> date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> stats tables, but Oracle does provide such purging options for some of
> their statistics tables (not related to conflicts), see [1], [2].
> And to manage it better, it could be range partitioned on timestamp.
>

It seems BDR also has one such conflict-log table which is a catalog table and is also partitioned on time. It has a default retention period of 30 days. See 'bdr.conflict_history' mentioned under 'catalogs' in [1]

[1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-07T09:38:20Z
On Thu, Aug 7, 2025 at 1:43 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:

Thanks Shveta for your opinion on the design.

> > On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > This proposal aims to address these limitations by introducing a
> > > conflict log history table, providing a structured, and queryable
> > > record of all logical replication conflicts. This should be a
> > > configurable option whether to log into the conflict log history
> > > table, server logs or both.
> > >
> >
> > +1 for the idea.

Thanks

> >
> > > This proposal has two main design questions:
> > > ===================================
> > >
> > > 1. How do we store conflicting tuples from different tables?
> > > Using a JSON column to store the row data seems like the most flexible
> > > solution, as it can accommodate different table schemas.
> >
> > Yes, that is one option. I have not looked into details myself, but
> > you can also explore 'anyarray' used in pg_statistics to store 'Column
> > data values of the appropriate kind'.

I think conversion from row to json and json to row is convenient and also other extensions like pgactive/bdr also provide as JSON. But we can explore this alternative options as well, thanks

> > > 2. Should this be a system table or a user table?
> > > a) System Table: Storing this in a system catalog is simple, but
> > > catalogs aren't designed for ever-growing data. While pg_large_object
> > > is an exception, this is not what we generally do IMHO.
> > > b) User Table: This offers more flexibility. We could allow a user to
> > > specify the table name during CREATE SUBSCRIPTION. Then we choose to
> > > either create the table internally or let the user create the table
> > > with a predefined schema.
> > >
> > > A potential drawback is that a user might drop or alter the table.
> > > However, we could mitigate this risk by simply logging a WARNING if
> > > the table is configured but an insertion fails.
> >
> > I believe it makes more sense for this to be a catalog table rather
> > than a user table. I wanted to check if we already have a large
> > catalog table of this kind, and I think pg_statistic could be an
> > example of a sizable catalog table. To get a rough idea of how size
> > scales with data, I ran a quick experiment: I created 1000 tables,
> > each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> > I inserted 1000 rows into each table and ran ANALYZE to collect
> > statistics. Here’s what I observed on a fresh database before and
> > after:
> >
> > Before:
> > pg_statistic row count: 412
> > Table size: ~256 kB
> >
> > After:
> > pg_statistic row count: 6,412
> > Table size: ~5.3 MB
> >
> > Although it isn’t an exact comparison, this gives us some insight into
> > how the statistics catalog table size grows with the number of rows.
> > It doesn’t seem excessively large with 6k rows, given the fact that
> > pg_statistic itself is a complex table having many 'anyarray'-type
> > columns.

Yeah that's good analysis, apart from this pg_largeobject is also a catalog which grows with each large object and growth rate for that will be very high because it stores large object data in catalog.

> >
> > That said, irrespective of what we decide, it would be ideal to offer
> > users an option for automatic purging, perhaps via a retention period
> > parameter like conflict_stats_retention_period (say default to 30
> > days), or a manual purge API such as purge_conflict_stats('older than
> > date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> > stats tables, but Oracle does provide such purging options for some of
> > their statistics tables (not related to conflicts), see [1], [2].
> > And to manage it better, it could be range partitioned on timestamp.

Yeah that's an interesting suggestion to timestamp based partitioning it for purging.

> It seems BDR also has one such conflict-log table which is a catalog
> table and is also partitioned on time. It has a default retention
> period of 30 days. See 'bdr.conflict_history' mentioned under
> 'catalogs' in [1]
>
> [1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views

Actually bdr is an extension and this table is under extension namespace (bdr.conflict_history) so this is not really a catalog but its a extension managed table. So logically for PostgreSQL its an user table but yeah this is created and managed by the extension.

--
Regards,
Dilip Kumar
Google
-
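The timestamp-based partitioning idea for purging can be sketched with ordinary declarative partitioning. Everything specific here is an assumption made for illustration: the table name `conflict_log_history`, the `logged_at` column, the reduced column list, and the monthly partition bounds are hypothetical, and the thread does not settle who would create or drop the partitions.

```sql
-- Hypothetical conflict log, range partitioned on the time a row was logged.
CREATE TABLE conflict_log_history (
    logged_at     timestamptz NOT NULL DEFAULT now(),
    conflict_type text NOT NULL,
    remote_tuple  json,
    local_tuple   json
) PARTITION BY RANGE (logged_at);

-- One partition per month (bounds are illustrative).
CREATE TABLE conflict_log_history_2025_08
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-08-01') TO ('2025-09-01');

CREATE TABLE conflict_log_history_2025_09
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-09-01') TO ('2025-10-01');

-- Purging a whole month is then a metadata operation rather than a bulk DELETE.
ALTER TABLE conflict_log_history
    DETACH PARTITION conflict_log_history_2025_08;
DROP TABLE conflict_log_history_2025_08;
```

Dropping a detached partition avoids the row-by-row DELETE and the vacuum work that follows it, which is the usual argument for partitioning an ever-growing log by time.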
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-08-08T03:28:21Z
On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 1:43 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 12:25 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks Shveta for your opinion on the design.
>
> > > On Tue, Aug 5, 2025 at 5:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
>
> > > > This proposal aims to address these limitations by introducing a
> > > > conflict log history table, providing a structured, and queryable
> > > > record of all logical replication conflicts. This should be a
> > > > configurable option whether to log into the conflict log history
> > > > table, server logs or both.
> > > >
> > >
> > > +1 for the idea.
>
> Thanks
>
> > >
> > > > This proposal has two main design questions:
> > > > ===================================
> > > >
> > > > 1. How do we store conflicting tuples from different tables?
> > > > Using a JSON column to store the row data seems like the most flexible
> > > > solution, as it can accommodate different table schemas.
> > >
> > > Yes, that is one option. I have not looked into details myself, but
> > > you can also explore 'anyarray' used in pg_statistics to store 'Column
> > > data values of the appropriate kind'.
>
> I think conversion from row to json and json to row is convenient and
> also other extensions like pgactive/bdr also provide as JSON.

Okay. Agreed.

> But we
> can explore this alternative options as well, thanks
>
> > > > 2. Should this be a system table or a user table?
> > > > a) System Table: Storing this in a system catalog is simple, but
> > > > catalogs aren't designed for ever-growing data. While pg_large_object
> > > > is an exception, this is not what we generally do IMHO.
> > > > b) User Table: This offers more flexibility. We could allow a user to
> > > > specify the table name during CREATE SUBSCRIPTION. Then we choose to
> > > > either create the table internally or let the user create the table
> > > > with a predefined schema.
> > > >
> > > > A potential drawback is that a user might drop or alter the table.
> > > > However, we could mitigate this risk by simply logging a WARNING if
> > > > the table is configured but an insertion fails.
> > >
> > > I believe it makes more sense for this to be a catalog table rather
> > > than a user table. I wanted to check if we already have a large
> > > catalog table of this kind, and I think pg_statistic could be an
> > > example of a sizable catalog table. To get a rough idea of how size
> > > scales with data, I ran a quick experiment: I created 1000 tables,
> > > each with 2 JSON columns, 1 text column, and 2 integer columns. Then,
> > > I inserted 1000 rows into each table and ran ANALYZE to collect
> > > statistics. Here’s what I observed on a fresh database before and
> > > after:
> > >
> > > Before:
> > > pg_statistic row count: 412
> > > Table size: ~256 kB
> > >
> > > After:
> > > pg_statistic row count: 6,412
> > > Table size: ~5.3 MB
> > >
> > > Although it isn’t an exact comparison, this gives us some insight into
> > > how the statistics catalog table size grows with the number of rows.
> > > It doesn’t seem excessively large with 6k rows, given the fact that
> > > pg_statistic itself is a complex table having many 'anyarray'-type
> > > columns.
>
> Yeah that's good analysis, apart from this pg_largeobject is also a
> catalog which grows with each large object and growth rate for that
> will be very high because it stores large object data in catalog.
>
> > >
> > > That said, irrespective of what we decide, it would be ideal to offer
> > > users an option for automatic purging, perhaps via a retention period
> > > parameter like conflict_stats_retention_period (say default to 30
> > > days), or a manual purge API such as purge_conflict_stats('older than
> > > date'). I wasn’t able to find any such purge mechanism for PostgreSQL
> > > stats tables, but Oracle does provide such purging options for some of
> > > their statistics tables (not related to conflicts), see [1], [2].
> > > And to manage it better, it could be range partitioned on timestamp.
>
> Yeah that's an interesting suggestion to timestamp based partitioning
> it for purging.
>
> > It seems BDR also has one such conflict-log table which is a catalog
> > table and is also partitioned on time. It has a default retention
> > period of 30 days. See 'bdr.conflict_history' mentioned under
> > 'catalogs' in [1]
> >
> > [1]: https://www.enterprisedb.com/docs/pgd/latest/reference/tables-views-functions/#user-visible-catalogs-and-views
>
> Actually bdr is an extension and this table is under extension
> namespace (bdr.conflict_history) so this is not really a catalog but
> its a extension managed table.

Yes, right. Sorry for confusion.

> So logically for PostgreSQL its an
> user table but yeah this is created and managed by the extension.
>

Any idea if the user can alter/drop or perform any DML on it? I could not find any details on this part.

> --
> Regards,
> Dilip Kumar
> Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-08T04:31:03Z
On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > So logically for PostgreSQL its an
> > user table but yeah this is created and managed by the extension.
> >
>
> Any idea if the user can alter/drop or perform any DML on it? I could
> not find any details on this part.

In my experience, for such extension managed tables where we want them to behave like catalog, generally users are just granted with SELECT permission. So although it is not a catalog but for accessibility wise for non admin users it is like a catalog. IMHO, even if we choose to create a user table for conflict log history we can also control the permissions similarly. What's your opinion on this?

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-08-08T08:42:33Z
On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > So logically for PostgreSQL its an
> > > user table but yeah this is created and managed by the extension.
> > >
> >
> > Any idea if the user can alter/drop or perform any DML on it? I could
> > not find any details on this part.
>
> In my experience, for such extension managed tables where we want them
> to behave like catalog, generally users are just granted with SELECT
> permission. So although it is not a catalog but for accessibility
> wise for non admin users it is like a catalog. IMHO, even if we
> choose to create a user table for conflict log history we can also
> control the permissions similarly.
>

Yes, it can be done. Technically there is nothing preventing us from doing it. But in my experience, I have never seen any system-maintained statistics tables to be a user table rather than catalog table. Extensions are a different case; they typically manage their own tables, which are not part of the system catalog. But if any such stats related functionality is part of the core database, it generally makes more sense to implement it as a catalog table (provided there are no major obstacles to doing so). But I am curious to know what others think here.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-08-13T10:08:55Z
On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > So logically for PostgreSQL its an
> > > user table but yeah this is created and managed by the extension.
> > >
> >
> > Any idea if the user can alter/drop or perform any DML on it? I could
> > not find any details on this part.
>
> In my experience, for such extension managed tables where we want them
> to behave like catalog, generally users are just granted with SELECT
> permission. So although it is not a catalog but for accessibility
> wise for non admin users it is like a catalog. IMHO, even if we
> choose to create a user table for conflict log history we can also
> control the permissions similarly. What's your opinion on this?
>

Yes, I think it is important to control permissions on this table even if it is a user table. How about giving SELECT, DELETE, TRUNCATE permissions to subscription owner assuming we create one such table per subscription?

It should be a user table due to following reasons (a) It is an ever growing table by definition and we need some level of user control to manage it (like remove the old data); (b) We may want some sort of partitioning streategy to manage it, even though, we decide to do it ourselves now but in future, we should allow user to also specify it; (c) We may also want user to specify what exact information she wants to get stored considering in future we want resolutions to also be stored in it. See a somewhat similar proposal to store errors during copy by Tom [1]; (d) In a near-by thread, we are discussing storing errors during copy in user table [2] and we have some similarity with that proposal as well.

If we agree on this then the next thing to consider is whether we allow users to create such a table or do it ourselves. In the long term, we may want both but for simplicity, we can auto-create ourselves during CREATE SUBSCRIPTION with some option. BTW, if we decide to let user create it then we can consider the idea of TYPED tables as discussed in emails [3][4].

For user tables, we need to consider how to avoid replicating these tables for publications that use FOR ALL TABLES specifier. One idea is to use EXCLUDE table functionality as being discussed in thread [5] but that would also be a bit tricky especially if we decide to create such a table automatically. One naive idea is that internally we skip sending changes from this table for "FOR ALL TABLES" publication, and we shouldn't allow creating publication for this table. OTOH, if we allow the user to create and specify this table, we can ask her to specify with EXCLUDE syntax in publication. This needs more thoughts.

[1] - https://www.postgresql.org/message-id/flat/752672.1699474336%40sss.pgh.pa.us#b8450be5645c4252d7d02cf7aca1fc7b
[2] - https://www.postgresql.org/message-id/CACJufxH_OJpVra%3D0c4ow8fbxHj7heMcVaTNEPa5vAurSeNA-6Q%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/28c420cf-f25d-44f1-89fd-04ef0b2dd3db%40dunslane.net
[4] - https://www.postgresql.org/message-id/CADrsxdYG%2B%2BK%3DiKjRm35u03q-Nb0tQPJaqjxnA2mGt5O%3DDht7sw%40mail.gmail.com
[5] - https://www.postgresql.org/message-id/CANhcyEW%2BuJB_bvQLEaZCgoRTc1%3Di%2BQnrPPHxZ2%3D0SBSCyj9pkg%40mail.gmail.com

--
With Regards,
Amit Kapila.
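The permission model suggested in this message (SELECT, DELETE, TRUNCATE for the subscription owner, no INSERT or UPDATE) maps onto ordinary GRANT statements. This is a sketch only: the role name `sub_owner` is hypothetical, `conflict_log_table` is the name from the original proposal, and the thread does not say which role would own an auto-created table. Note that a table's owner always holds every privilege on it, so this restriction only has effect if the table is owned by some other role.

```sql
-- Hypothetical role standing in for the subscription owner.
CREATE ROLE sub_owner LOGIN;

-- Start from no access for anyone but the table owner.
REVOKE ALL ON conflict_log_table FROM PUBLIC;

-- Let the subscription owner read and clean up the log, but not
-- insert or update rows in it.
GRANT SELECT, DELETE, TRUNCATE ON conflict_log_table TO sub_owner;
```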
-
Re: Proposal: Conflict log history table for Logical Replication
Alastair Turner <minion@decodable.me> — 2025-08-14T10:56:26Z
On Wed, 13 Aug 2025 at 11:09, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com>
> wrote:
> > >
> > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com>
> wrote:
> > > >
> > > > So logically for PostgreSQL its an
> > > > user table but yeah this is created and managed by the extension.
> > > >
> > >
> > > Any idea if the user can alter/drop or perform any DML on it? I could
> > > not find any details on this part.
> >
> > In my experience, for such extension managed tables where we want them
> > to behave like catalog, generally users are just granted with SELECT
> > permission. So although it is not a catalog but for accessibility
> > wise for non admin users it is like a catalog. IMHO, even if we
> > choose to create a user table for conflict log history we can also
> > control the permissions similarly. What's your opinion on this?
> >
>
> Yes, I think it is important to control permissions on this table even
> if it is a user table. How about giving SELECT, DELETE, TRUNCATE
> permissions to subscription owner assuming we create one such table
> per subscription?
>
> It should be a user table due to following reasons (a) It is an ever
> growing table by definition and we need some level of user control to
> manage it (like remove the old data); (b) We may want some sort of
> partitioning streategy to manage it, even though, we decide to do it
> ourselves now but in future, we should allow user to also specify it;
> (c) We may also want user to specify what exact information she wants
> to get stored considering in future we want resolutions to also be
> stored in it. See a somewhat similar proposal to store errors during
> copy by Tom [1]; (d) In a near-by thread, we are discussing storing
> errors during copy in user table [2] and we have some similarity with
> that proposal as well.
>
> If we agree on this then the next thing to consider is whether we
> allow users to create such a table or do it ourselves. In the long
> term, we may want both but for simplicity, we can auto-create
> ourselves during CREATE SUBSCRIPTION with some option. BTW, if we
> decide to let user create it then we can consider the idea of TYPED
> tables as discussed in emails [3][4].
>

Having it be a user table, and specifying the table per subscription sounds good. This is very similar to how the load error tables for CloudBerry behave, for instance. To have both options for table creation, CREATE ... IF NOT EXISTS semantics work well - if the option on CREATE SUBSCRIPTION specifies an existing table of the right type use it, or create one with the name supplied. This would also give the user control over whether to have one table per subscription, one central table or anything in between.

Rather than constraining permissions on the table, the CREATE SUBSCRIPTION command could create a dependency relationship between the table and the subscription. This would prevent removal of the table, even by a superuser.

> For user tables, we need to consider how to avoid replicating these
> tables for publications that use FOR ALL TABLES specifier. One idea is
> to use EXCLUDE table functionality as being discussed in thread [5]
> but that would also be a bit tricky especially if we decide to create
> such a table automatically. One naive idea is that internally we skip
> sending changes from this table for "FOR ALL TABLES" publication, and
> we shouldn't allow creating publication for this table. OTOH, if we
> allow the user to create and specify this table, we can ask her to
> specify with EXCLUDE syntax in publication. This needs more thoughts.
>

If a dependency relationship is established between the error table and the subscription, could this be used as a basis for filtering the error tables from FOR ALL TABLES subscriptions?

Regards
Alastair
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-08-15T06:52:32Z
On Thu, Aug 14, 2025 at 4:26 PM Alastair Turner <minion@decodable.me> wrote: > > On Wed, 13 Aug 2025 at 11:09, Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: >> > >> > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote: >> > > >> > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: >> > > > >> > > > So logically for PostgreSQL its an >> > > > user table but yeah this is created and managed by the extension. >> > > > >> > > >> > > Any idea if the user can alter/drop or perform any DML on it? I could >> > > not find any details on this part. >> > >> > In my experience, for such extension managed tables where we want them >> > to behave like catalog, generally users are just granted with SELECT >> > permission. So although it is not a catalog but for accessibility >> > wise for non admin users it is like a catalog. IMHO, even if we >> > choose to create a user table for conflict log history we can also >> > control the permissions similarly. What's your opinion on this? >> > >> >> Yes, I think it is important to control permissions on this table even >> if it is a user table. How about giving SELECT, DELETE, TRUNCATE >> permissions to subscription owner assuming we create one such table >> per subscription? >> >> It should be a user table due to following reasons (a) It is an ever >> growing table by definition and we need some level of user control to >> manage it (like remove the old data); (b) We may want some sort of >> partitioning streategy to manage it, even though, we decide to do it >> ourselves now but in future, we should allow user to also specify it; >> (c) We may also want user to specify what exact information she wants >> to get stored considering in future we want resolutions to also be >> stored in it. 
See a somewhat similar proposal to store errors during >> copy by Tom [1]; (d) In a near-by thread, we are discussing storing >> errors during copy in user table [2] and we have some similarity with >> that proposal as well. >> >> If we agree on this then the next thing to consider is whether we >> allow users to create such a table or do it ourselves. In the long >> term, we may want both but for simplicity, we can auto-create >> ourselves during CREATE SUBSCRIPTION with some option. BTW, if we >> decide to let user create it then we can consider the idea of TYPED >> tables as discussed in emails [3][4]. > > > Having it be a user table, and specifying the table per subscription sounds good. This is very similar to how the load error tables for CloudBerry behave, for instance. To have both options for table creation, CREATE ... IF NOT EXISTS semantics work well - if the option on CREATE SUBSCRIPTION specifies an existing table of the right type use it, or create one with the name supplied. This would also give the user control over whether to have one table per subscription, one central table or anything in between. > Sounds reasonable. I think for the first version we can let such a table be created automatically via some option(s) on the subscription. Then, in subsequent versions, we can extend the functionality to allow existing tables. > > Rather than constraining permissions on the table, the CREATE SUBSCRIPTION command could create a dependency relationship between the table and the subscription. This would prevent removal of the table, even by a superuser. > Okay, that makes sense. But we still probably want to disallow users from inserting or updating rows in the conflict table. >> >> For user tables, we need to consider how to avoid replicating these >> tables for publications that use FOR ALL TABLES specifier. 
One idea is >> to use EXCLUDE table functionality as being discussed in thread [5] >> but that would also be a bit tricky especially if we decide to create >> such a table automatically. One naive idea is that internally we skip >> sending changes from this table for "FOR ALL TABLES" publication, and >> we shouldn't allow creating publication for this table. OTOH, if we >> allow the user to create and specify this table, we can ask her to >> specify with EXCLUDE syntax in publication. This needs more thoughts. > > > If a dependency relationship is established between the error table and the subscription, could this be used as a basis for filtering the error tables from FOR ALL TABLES subscriptions? > Yeah, that is worth considering. -- With Regards, Amit Kapila.
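The permission model floated in this message (SELECT, DELETE and TRUNCATE for the subscription owner, but no INSERT or UPDATE) could be sketched in plain SQL roughly as follows. The table name `sub1_conflict_log`, its columns, and the role `sub_owner` are hypothetical and not taken from any patch. The sketch also assumes the table is owned by some other role: a table's owner can always re-grant privileges to itself, so grants alone would not constrain an owner.

```sql
-- Hypothetical conflict log table and role; names are illustrative only.
CREATE ROLE sub_owner;
CREATE TABLE sub1_conflict_log (
    conflicttype     text,
    remote_commit_ts timestamptz,
    remote_tuple     json
);

-- Start from no privileges, then grant only read and cleanup rights.
REVOKE ALL ON sub1_conflict_log FROM PUBLIC, sub_owner;
GRANT SELECT, DELETE, TRUNCATE ON sub1_conflict_log TO sub_owner;

-- Inspect the result: INSERT and UPDATE were never granted.
SELECT has_table_privilege('sub_owner', 'sub1_conflict_log', 'SELECT') AS can_select,
       has_table_privilege('sub_owner', 'sub1_conflict_log', 'INSERT') AS can_insert,
       has_table_privilege('sub_owner', 'sub1_conflict_log', 'UPDATE') AS can_update;
```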
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-15T09:00:48Z
On Wed, Aug 13, 2025 at 3:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 8, 2025 at 10:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Aug 8, 2025 at 8:58 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Thu, Aug 7, 2025 at 3:08 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > So logically for PostgreSQL its an > > > > user table but yeah this is created and managed by the extension. > > > > > > > > > > Any idea if the user can alter/drop or perform any DML on it? I could > > > not find any details on this part. > > > > In my experience, for such extension managed tables where we want them > > to behave like catalog, generally users are just granted with SELECT > > permission. So although it is not a catalog but for accessibility > > wise for non admin users it is like a catalog. IMHO, even if we > > choose to create a user table for conflict log history we can also > > control the permissions similarly. What's your opinion on this? > > > > Yes, I think it is important to control permissions on this table even > if it is a user table. How about giving SELECT, DELETE, TRUNCATE > permissions to subscription owner assuming we create one such table > per subscription? Right, we need to control the permission. I am not sure whether we want a per subscription table or a common one. Earlier I was thinking of a single table, but I think per subscription is not a bad idea especially for managing the permissions. And there can not be a really huge number of subscriptions that we need to worry about creating many conflict log history tables and that too we will only create such tables when users pass that subscription option. 
> It should be a user table due to following reasons (a) It is an ever > growing table by definition and we need some level of user control to > manage it (like remove the old data); (b) We may want some sort of > partitioning strategy to manage it, even though, we decide to do it > ourselves now but in future, we should allow user to also specify it; Maybe we can partition by range on date (when the entry is inserted). That way it would be easy for users to get rid of older partitions. > (c) We may also want user to specify what exact information she wants > to get stored considering in future we want resolutions to also be > stored in it. See a somewhat similar proposal to store errors during > copy by Tom [1]; (d) In a near-by thread, we are discussing storing > errors during copy in user table [2] and we have some similarity with > that proposal as well. Right, we may consider that as well. > If we agree on this then the next thing to consider is whether we > allow users to create such a table or do it ourselves. In the long > term, we may want both but for simplicity, we can auto-create > ourselves during CREATE SUBSCRIPTION with some option. BTW, if we > decide to let user create it then we can consider the idea of TYPED > tables as discussed in emails [3][4]. Yeah that's an interesting option. > > For user tables, we need to consider how to avoid replicating these > tables for publications that use FOR ALL TABLES specifier. One idea is > to use EXCLUDE table functionality as being discussed in thread [5] > but that would also be a bit tricky especially if we decide to create > such a table automatically. One naive idea is that internally we skip > sending changes from this table for "FOR ALL TABLES" publication, and > we shouldn't allow creating publication for this table. OTOH, if we > allow the user to create and specify this table, we can ask her to > specify with EXCLUDE syntax in publication. This needs more thoughts. 
Yes, this needs more thought; I will think more on this point and respond. Yet another question is about table names: whether we keep some standard name like conflict_log_history_$subid or let users pass the name. -- Regards, Dilip Kumar Google
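The idea of partitioning by range on the insertion date can be illustrated with ordinary declarative partitioning. This is only a sketch: the table name, the `logged_at` column, and the monthly granularity are assumptions for illustration, not part of the proposal.

```sql
-- Hypothetical layout; column names are illustrative only.
CREATE TABLE conflict_log_history (
    logged_at    timestamptz NOT NULL DEFAULT now(),
    conflicttype text,
    remote_tuple json,
    local_tuple  json
) PARTITION BY RANGE (logged_at);

CREATE TABLE conflict_log_history_2025_08
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-08-01') TO ('2025-09-01');

CREATE TABLE conflict_log_history_2025_09
    PARTITION OF conflict_log_history
    FOR VALUES FROM ('2025-09-01') TO ('2025-10-01');

-- Getting rid of old history is then a partition drop, not a bulk DELETE.
ALTER TABLE conflict_log_history DETACH PARTITION conflict_log_history_2025_08;
DROP TABLE conflict_log_history_2025_08;
```

With this layout, retention becomes a matter of detaching and dropping whole partitions, which is the "easy to get rid of older partitions" property mentioned above.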
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-08-18T06:55:05Z
On Fri, Aug 15, 2025 at 2:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > Yet another question is about table names, whether we keep some > standard name like conflict_log_history_$subid or let users pass the > name. > It would be good if we let the user specify the table_name and, if she doesn't specify one, use an internally generated name. I think it will be somewhat similar to slot_name. However, in this case, there is one challenge: how can we decide whether the schema of the user-provided table_name is correct or not? Do we compare it with the standard schema we are planning to use? One idea to keep things simple for the first version is that we allow users to specify the table_name for storing conflicts but the table should be created internally, and if a table with the same name already exists, we can give an ERROR. Then we can later extend the functionality to even allow storing conflicts in pre-created tables with more checks about their schema. -- With Regards, Amit Kapila.
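The first-version behaviour described above (use the user-supplied name if given, otherwise a generated one, and raise an ERROR if a relation of that name already exists) could be mimicked at the SQL level like this. It is a sketch only: the real check would live in the CREATE SUBSCRIPTION code path, the value 16394 merely stands in for a subscription OID, and the default name pattern is the one floated earlier in the thread.

```sql
-- Hypothetical: 16394 stands in for a subscription OID.
DO $$
DECLARE
    user_supplied text := NULL;   -- name given by the user, if any
    tbl_name      text := coalesce(user_supplied,
                                   'conflict_log_history_' || 16394);
BEGIN
    -- to_regclass() returns NULL when no relation of that name is visible.
    IF to_regclass(quote_ident(tbl_name)) IS NOT NULL THEN
        RAISE EXCEPTION 'relation "%" already exists', tbl_name;
    END IF;
    RAISE NOTICE 'would create conflict log table "%"', tbl_name;
END
$$;
```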
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-20T06:16:55Z
On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Aug 15, 2025 at 2:31 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > Yet another question is about table names, whether we keep some > > standard name like conflict_log_history_$subid or let users pass the > > name. > > > > It would be good if we can let the user specify the table_name and if > she didn't specify then use an internally generated name. I think it > will be somewhat similar to slot_name. However, in this case, there is > one challenge which is how can we decide whether the schema of the > user provided table_name is correct or not? Do we compare it with the > standard schema we are planning to use? Ideally we can do that, if you see in this thread [1] there is a patch [2] which first try to validate the table schema and if it doesn't exist it creates it on its own. And it seems fine to me. > One idea to keep things simple for the first version is that we allow > users to specify the table_name for storing conflicts but the table > should be created internally and if the same name table already > exists, we can give an ERROR. Then we can later extend the > functionality to even allow storing conflicts in pre-created tables > with more checks about its schema. That's fair too. I am wondering what namespace we should create this user table in. If we are creating internally, I assume the user should provide a schema qualified name right? [1] https://www.postgresql.org/message-id/flat/752672.1699474336%40sss.pgh.pa.us#b8450be5645c4252d7d02cf7aca1fc7b [2] https://www.postgresql.org/message-id/attachment/152792/v8-0001-Add-a-new-COPY-option-SAVE_ERROR.patch -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-08-20T12:16:29Z
On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > One idea to keep things simple for the first version is that we allow > > users to specify the table_name for storing conflicts but the table > > should be created internally and if the same name table already > > exists, we can give an ERROR. Then we can later extend the > > functionality to even allow storing conflicts in pre-created tables > > with more checks about its schema. > > That's fair too. I am wondering what namespace we should create this > user table in. If we are creating internally, I assume the user should > provide a schema qualified name right? > Yeah, but if it is not provided then we should create the table based on search_path, similar to what we do when a user creates a table from psql. -- With Regards, Amit Kapila.
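For reference, this is how an unqualified table name is resolved today when a user runs CREATE TABLE from psql; the suggestion is that an internally created conflict table would follow the same rule. The schema and table names below are hypothetical.

```sql
-- An unqualified name is created in the first schema of search_path
-- that exists.
CREATE SCHEMA conflicts;              -- hypothetical schema name
SET search_path = conflicts, public;

CREATE TABLE sub1_conflict_log (conflicttype text);  -- hypothetical table

-- Schema that unqualified CREATE statements will target.
SELECT current_schema();

-- Schema the table actually landed in.
SELECT n.nspname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = 'sub1_conflict_log'::regclass;
```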
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-08-21T03:47:08Z
On Wed, Aug 20, 2025 at 5:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > One idea to keep things simple for the first version is that we allow > > > users to specify the table_name for storing conflicts but the table > > > should be created internally and if the same name table already > > > exists, we can give an ERROR. Then we can later extend the > > > functionality to even allow storing conflicts in pre-created tables > > > with more checks about its schema. > > > > That's fair too. I am wondering what namespace we should create this > > user table in. If we are creating internally, I assume the user should > > provide a schema qualified name right? > > > > Yeah, but if not provided then we should create it based on > search_path similar to what we do when user created the table from > psql. Yeah that makes sense. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-06T09:38:00Z
On Thu, Aug 21, 2025 at 9:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Aug 20, 2025 at 5:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Aug 20, 2025 at 11:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Aug 18, 2025 at 12:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > One idea to keep things simple for the first version is that we allow > > > > users to specify the table_name for storing conflicts but the table > > > > should be created internally and if the same name table already > > > > exists, we can give an ERROR. Then we can later extend the > > > > functionality to even allow storing conflicts in pre-created tables > > > > with more checks about its schema. > > > > > > That's fair too. I am wondering what namespace we should create this > > > user table in. If we are creating internally, I assume the user should > > > provide a schema qualified name right? > > > > > > > Yeah, but if not provided then we should create it based on > > search_path similar to what we do when user created the table from > > psql. While working on the patch, I see there are some open questions 1. We decided to pass the conflict history table name during subscription creation. And it makes sense to create this table when the CREATE SUBSCRIPTION command is executed. A potential concern is that the subscription owner will also own this table, having full control over it, including the ability to drop or alter its schema. This might not be an issue. If an INSERT into the conflict table fails, we can check the table's existence and schema. If they are not as expected, the conflict log history option can be disabled and re-enabled later via ALTER SUBSCRIPTION. 2. A further challenge is how to exclude these tables from publishing changes. 
If we support a subscription-level log history table and the user publishes ALL TABLES, the output plugin uses is_publishable_relation() to check if a table is publishable. However, applying the same logic here would require checking each subscription on the node to see if the table is designated as a conflict log history table for any subscription, which could be costly. 3. One last question is whether we should drop this table when we drop the subscription. I think this makes sense, as we are internally creating it while creating the subscription. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Alastair Turner <minion@decodable.me> — 2025-09-07T08:12:02Z
Hi Dilip Thanks for working on this, I think it will make conflict detection a lot more useful. On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote: > While working on the patch, I see there are some open questions > > 1. We decided to pass the conflict history table name during > subscription creation. And it makes sense to create this table when > the CREATE SUBSCRIPTION command is executed. A potential concern is > that the subscription owner will also own this table, having full > control over it, including the ability to drop or alter its schema. ... > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists. > 2. A further challenge is how to exclude these tables from publishing > changes. If we support a subscription-level log history table and the > user publishes ALL TABLES, the output plugin uses > is_publishable_relation() to check if a table is publishable. However, > applying the same logic here would require checking each subscription > on the node to see if the table is designated as a conflict log > history table for any subscription, which could be costly. > Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation() > 3. And one last thing is about should we consider dropping this table > when we drop the subscription, I think this makes sense as we are > internally creating it while creating the subscription. > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped. Regards Alastair
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-08T06:31:28Z
On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote: > > Hi Dilip > > Thanks for working on this, I think it will make conflict detection a lot more useful. Thanks for the suggestions, please find my reply inline. > On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote: >> >> While working on the patch, I see there are some open questions >> >> 1. We decided to pass the conflict history table name during >> subscription creation. And it makes sense to create this table when >> the CREATE SUBSCRIPTION command is executed. A potential concern is >> that the subscription owner will also own this table, having full >> control over it, including the ability to drop or alter its schema. > > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists. Yeah type table can be useful here, but only concern is when do we create this type. One option is whenever we can create a catalog relation say "conflict_log_history" that will create a type and then for each subscription if we need to create the conflict history table we can create it as "conflict_log_history" type, but this might not be a best option as we are creating catalog just for using this type. Second option is to create a type while creating a table itself but then again the problem remains the same as subscription owners get control over altering the schema of the type itself. So the goal is we want this type to be created such that it can not be altered so IMHO option1 is more suitable i.e. creating conflict_log_history as catalog and per subscription table can be created as this type. >> >> 2. A further challenge is how to exclude these tables from publishing >> changes. 
If we support a subscription-level log history table and the >> user publishes ALL TABLES, the output plugin uses >> is_publishable_relation() to check if a table is publishable. However, >> applying the same logic here would require checking each subscription >> on the node to see if the table is designated as a conflict log >> history table for any subscription, which could be costly. > > > Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation() +1 > >> >> 3. And one last thing is about should we consider dropping this table >> when we drop the subscription, I think this makes sense as we are >> internally creating it while creating the subscription. > > > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped. Thanks for the input, I would like to hear opinions from others as well here. I agree that implicitly getting rid of the conflict history might be problematic but we also need to consider that we are considering dropping this when the whole subscription is dropped. Not sure even after subscription drop users will be interested in conflict history, if yes then they need to be aware of preserving that isn't it. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-10T09:55:37Z
On Mon, Sep 8, 2025 at 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote: > > > > Hi Dilip > > > > Thanks for working on this, I think it will make conflict detection a lot more useful. > > Thanks for the suggestions, please find my reply inline. > > > On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote: > >> > >> While working on the patch, I see there are some open questions > >> > >> 1. We decided to pass the conflict history table name during > >> subscription creation. And it makes sense to create this table when > >> the CREATE SUBSCRIPTION command is executed. A potential concern is > >> that the subscription owner will also own this table, having full > >> control over it, including the ability to drop or alter its schema. > > > > > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists. > > Yeah type table can be useful here, but only concern is when do we > create this type. > How about having this as a built-in type? > One option is whenever we can create a catalog > relation say "conflict_log_history" that will create a type and then > for each subscription if we need to create the conflict history table > we can create it as "conflict_log_history" type, but this might not be > a best option as we are creating catalog just for using this type. > Second option is to create a type while creating a table itself but > then again the problem remains the same as subscription owners get > control over altering the schema of the type itself. So the goal is > we want this type to be created such that it can not be altered so > IMHO option1 is more suitable i.e. creating conflict_log_history as > catalog and per subscription table can be created as this type. 
> I think having it as a catalog table has drawbacks, like who will clean this ever-growing table. One thing that is not clear from Alastair's response is that he said to make the subscription a dependency of the table; if we do so, then won't it be difficult to even drop the subscription, and doesn't that sound like the reverse of what we want? > >> > >> 2. A further challenge is how to exclude these tables from publishing > >> changes. If we support a subscription-level log history table and the > >> user publishes ALL TABLES, the output plugin uses > >> is_publishable_relation() to check if a table is publishable. However, > >> applying the same logic here would require checking each subscription > >> on the node to see if the table is designated as a conflict log > >> history table for any subscription, which could be costly. > > > > > > Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation() > +1 > > > > >> > >> 3. And one last thing is about should we consider dropping this table > >> when we drop the subscription, I think this makes sense as we are > >> internally creating it while creating the subscription. > > > > > > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped. > > Thanks for the input, I would like to hear opinions from others as > well here. > But OTOH, there could be users who want such a table to be dropped. One possibility is that if the user provided us a pre-created table then we leave it to the user to remove the table; otherwise, we can remove it with drop subscription. BTW, did we decide that we want a conflict-table-per-subscription or one table for all subscriptions? If the latter, then I guess the problem would be that it has to be a shared table across databases. 
-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-10T10:15:40Z
On Wed, Sep 10, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Sep 8, 2025 at 12:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sun, Sep 7, 2025 at 1:42 PM Alastair Turner <minion@decodable.me> wrote: > > > > > > Hi Dilip > > > > > > Thanks for working on this, I think it will make conflict detection a lot more useful. > > > > Thanks for the suggestions, please find my reply inline. > > > > > On Sat, 6 Sept 2025, 10:38 Dilip Kumar, <dilipbalaut@gmail.com> wrote: > > >> > > >> While working on the patch, I see there are some open questions > > >> > > >> 1. We decided to pass the conflict history table name during > > >> subscription creation. And it makes sense to create this table when > > >> the CREATE SUBSCRIPTION command is executed. A potential concern is > > >> that the subscription owner will also own this table, having full > > >> control over it, including the ability to drop or alter its schema. > > > > > > > > Typed tables and the dependency framework can address this concern. The schema of a typed table cannot be changed. If the subscription is marked as a dependency of the log table, the table cannot be dropped while the subscription exists. > > > > Yeah type table can be useful here, but only concern is when do we > > create this type. > > > > How about having this as a built-in type? Here we will have to create a built-in type of type table which is I think typcategory => 'C' and if we create this type it should be supplied with the "typrelid" that means there should be a backing catalog table. At least thats what I think. > > One option is whenever we can create a catalog > > relation say "conflict_log_history" that will create a type and then > > for each subscription if we need to create the conflict history table > > we can create it as "conflict_log_history" type, but this might not be > > a best option as we are creating catalog just for using this type. 
> > Second option is to create a type while creating a table itself but > > then again the problem remains the same as subscription owners get > > control over altering the schema of the type itself. So the goal is > > we want this type to be created such that it can not be altered so > > IMHO option1 is more suitable i.e. creating conflict_log_history as > > catalog and per subscription table can be created as this type. > > > > I think having it as a catalog table has drawbacks like who will clean > this ever growing table. No, I didn't mean an ever growing catalog table, I was giving an option to create a catalog table just to create a built-in type and then we will create an actual log history table of this built-in type for each subscription while creating the subscription. So this catalog table will be there but nothing will be inserted to this table and whenever the user supplies a conflict log history table name while creating a subscription that time we will create an actual table and the type of the table will be as the catalog table type. I agree creating a catalog table for this purpose might not be worth it, but I am not yet able to figure out how to create a built-in type of type table without creating the actual table. The one thing is not clear from Alastair's > response is that he said to make subscription as a dependency of > table, if we do so, then won't it be difficult to even drop > subscription and also doesn't that sound reverse of what we want. I assume he means subscription will be dependent on the log table, that means we can not drop the log table as subscription is dependent on this table. > > >> > > >> 2. A further challenge is how to exclude these tables from publishing > > >> changes. If we support a subscription-level log history table and the > > >> user publishes ALL TABLES, the output plugin uses > > >> is_publishable_relation() to check if a table is publishable. 
However, > > >> applying the same logic here would require checking each subscription > > >> on the node to see if the table is designated as a conflict log > > >> history table for any subscription, which could be costly. > > > > > > > > > Checking the type of a table and/or whether a subscription object depends on it in a certain way would be a far less costly operation to add to is_publishable_relation() > > +1 > > > > > > > >> > > >> 3. And one last thing is about should we consider dropping this table > > >> when we drop the subscription, I think this makes sense as we are > > >> internally creating it while creating the subscription. > > > > > > > > > Having to clean up the log table explicitly is likely to annoy users far less than having the conflict data destroyed as a side effect of another operation. I would strongly suggest leaving the table in place when the subscription is dropped. > > > > Thanks for the input, I would like to hear opinions from others as > > well here. > > > > But OTOH, there could be users who want such a table to be dropped. > One possibility is that if we user provided us a pre-created table > then we leave it to user to remove the table, otherwise, we can remove > with drop subscription. Thanks make sense. BTW, did we decide that we want a > conflict-table-per-subscription or one table for all subscriptions, if > later, then I guess the problem would be that it has to be a shared > table across databases. Right and I don't think there is an option to create a user defined shared table. And I don't think there is any issue creating per subscription conflict log history table, except that the subscription owner should have permission to create the table in the database while creating the subscription, but I think this is expected, either user can get the sufficient privilege or disable the option for conflict log history table. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Alastair Turner <minion@decodable.me> — 2025-09-10T11:01:58Z
On Wed, 10 Sept 2025 at 11:15, Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Wed, Sep 10, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > ... > > > > How about having this as a built-in type? > > Here we will have to create a built-in type of type table which is I > think typcategory => 'C' and if we create this type it should be > supplied with the "typrelid" that means there should be a backing > catalog table. At least thats what I think. > A compound type can be used for building a table, it's not necessary to create a table when creating the type. In user SQL: CREATE TYPE conflict_log_type AS ( conflictid UUID, subid OID, tableid OID, conflicttype TEXT, operationtype TEXT, replication_origin TEXT, remote_commit_ts TIMESTAMPTZ, local_commit_ts TIMESTAMPTZ, ri_key JSON, remote_tuple JSON, local_tuple JSON ); CREATE TABLE my_subscription_conflicts OF conflict_log_type; ... > > The one thing is not clear from Alastair's > > response is that he said to make subscription as a dependency of > > table, if we do so, then won't it be difficult to even drop > > subscription and also doesn't that sound reverse of what we want. > > I assume he means subscription will be dependent on the log table, > that means we can not drop the log table as subscription is dependent > on this table. > Yes, that's what I was proposing. > > > >> > > > >> 2. A further challenge is how to exclude these tables from > publishing > > > >> changes. If we support a subscription-level log history table and > the > > > >> user publishes ALL TABLES, the output plugin uses > > > >> is_publishable_relation() to check if a table is publishable. > However, > > > >> applying the same logic here would require checking each > subscription > > > >> on the node to see if the table is designated as a conflict log > > > >> history table for any subscription, which could be costly. 
> > > > > > > > > > > > Checking the type of a table and/or whether a subscription object > depends on it in a certain way would be a far less costly operation to add > to is_publishable_relation() > > > +1 > > > > > > > > > > >> > > > >> 3. And one last thing is about should we consider dropping this > table > > > >> when we drop the subscription, I think this makes sense as we are > > > >> internally creating it while creating the subscription. > > > > > > > > > > > > Having to clean up the log table explicitly is likely to annoy users > far less than having the conflict data destroyed as a side effect of > another operation. I would strongly suggest leaving the table in place when > the subscription is dropped. > > > > > > Thanks for the input, I would like to hear opinions from others as > > > well here. > > > > > > > But OTOH, there could be users who want such a table to be dropped. > > One possibility is that if we user provided us a pre-created table > > then we leave it to user to remove the table, otherwise, we can remove > > with drop subscription. > > Thanks make sense. > > BTW, did we decide that we want a > > conflict-table-per-subscription or one table for all subscriptions, if > > later, then I guess the problem would be that it has to be a shared > > table across databases. > > Right and I don't think there is an option to create a user defined > shared table. And I don't think there is any issue creating per > subscription conflict log history table, except that the subscription > owner should have permission to create the table in the database while > creating the subscription, but I think this is expected, either user > can get the sufficient privilege or disable the option for conflict > log history table. > Since subscriptions are created in a particular database, it seems reasonable that error tables would also be created in a particular database.
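The typed-table behaviour this exchange relies on can be tried out with a shortened, hypothetical version of the type shown above. The commented-out statements are the ones PostgreSQL rejects; the final statement shows that whoever owns the type can still change it, and with CASCADE the change propagates to tables created from it.

```sql
-- Hypothetical, shortened version of the type from the message above.
CREATE TYPE conflict_log_type AS (
    conflicttype TEXT,
    remote_tuple JSON
);
CREATE TABLE my_subscription_conflicts OF conflict_log_type;

-- Rejected: columns of a typed table cannot be changed directly.
-- ALTER TABLE my_subscription_conflicts ADD COLUMN note TEXT;

-- Rejected without CASCADE: the table depends on the type.
-- DROP TYPE conflict_log_type;

-- Accepted for the type's owner, and propagated to the typed table.
ALTER TYPE conflict_log_type ADD ATTRIBUTE note TEXT CASCADE;
```

So a typed table protects the column layout only to the extent that the type itself is protected, which is why the ownership of the type matters for this design.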
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-10T12:06:41Z
On Wed, Sep 10, 2025 at 4:32 PM Alastair Turner <minion@decodable.me> wrote: > >> Here we will have to create a built-in type of type table which is I >> think typcategory => 'C' and if we create this type it should be >> supplied with the "typrelid" that means there should be a backing >> catalog table. At least thats what I think. > > A compound type can be used for building a table, it's not necessary to create a table when creating the type. In user SQL: > > CREATE TYPE conflict_log_type AS ( > conflictid UUID, > subid OID, > tableid OID, > conflicttype TEXT, > operationtype TEXT, > replication_origin TEXT, > remote_commit_ts TIMESTAMPTZ, > local_commit_ts TIMESTAMPTZ, > ri_key JSON, > remote_tuple JSON, > local_tuple JSON > ); > > CREATE TABLE my_subscription_conflicts OF conflict_log_type; Problem is if you CREATE TYPE just before creating the table that means subscription owners get full control over the type as well it means they can alter the type itself. So logically this TYPE should be a built-in type so that subscription owners do not have control to ALTER the type but they have permission to create a table from this type. But the problem is whenever you create a type it needs to have corresponding relid in pg_class in fact you can just create a type as per your example and see[1] it will get corresponding entry in pg_class. So the problem is if you create a user defined type it will be created under the subscription owner and it defeats the purpose of not allowing to alter the type OTOH if we create a built-in type it needs to have a corresponding entry in pg_class. So what's your proposal, create this type while creating a subscription or as a built-in type, or anything else? 
[1]
postgres[1948123]=# CREATE TYPE conflict_log_type AS (conflictid UUID);
postgres[1948123]=# select oid, typrelid, typcategory from pg_type where typname='conflict_log_type';
  oid  | typrelid | typcategory
-------+----------+-------------
 16386 |    16384 | C
(1 row)

postgres[1948123]=# select relname from pg_class where oid=16384;
      relname
-------------------
 conflict_log_type

-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-10T19:23:03Z
Hi, On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > Currently we log conflicts to the server's log file and updates, this > approach has limitations, 1) Difficult to query and analyze, parsing > plain text log files for conflict details is inefficient. 2) Lack of > structured data, key conflict attributes (table, operation, old/new > data, LSN, etc.) are not readily available in a structured, queryable > format. 3) Difficult for external monitoring tools or custom > resolution scripts to consume conflict data directly. > > This proposal aims to address these limitations by introducing a > conflict log history table, providing a structured, and queryable > record of all logical replication conflicts. This should be a > configurable option whether to log into the conflict log history > table, server logs or both. +1 for the overall idea. Having an option to separate out the conflicts helps analyze the data correctness issues and understand the behavior of conflicts. Parsing server logs file for analysis and debugging is a typical requirement differently met with tools like log_fdw or capture server logs in CSV format for parsing or do text search and analyze etc. > This proposal has two main design questions: > =================================== > > 1. How do we store conflicting tuples from different tables? > Using a JSON column to store the row data seems like the most flexible > solution, as it can accommodate different table schemas. How good is storing conflicts on the table? Is it okay to generate WAL traffic? Is it okay to physically replicate this log table to all replicas? Is it okay to logically replicate this log table to all subscribers and logical decoding clients? How does this table get truncated? If truncation gets delayed, won't it unnecessarily fill up storage? > 2. Should this be a system table or a user table? 
> a) System Table: Storing this in a system catalog is simple, but > catalogs aren't designed for ever-growing data. While pg_large_object > is an exception, this is not what we generally do IMHO. > b) User Table: This offers more flexibility. We could allow a user to > specify the table name during CREATE SUBSCRIPTION. Then we choose to > either create the table internally or let the user create the table > with a predefined schema. -1 for the system table for sure. > A potential drawback is that a user might drop or alter the table. > However, we could mitigate this risk by simply logging a WARNING if > the table is configured but an insertion fails. > I am currently working on a POC patch for the same, but will post that > once we have some thoughts on design choices. How about streaming the conflicts in fixed format to a separate log file other than regular postgres server log file? All the rules/settings that apply to regular postgres server log files also apply for conflicts server log files (rotation, GUCs, format CSV/JSON/TEXT etc.). This way there's no additional WAL, and we don't have to worry about drop/alter, truncate, delete, update/insert, permission model, physical replication, logical replication, storage space etc. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-11T03:13:43Z
On Thu, Sep 11, 2025 at 12:53 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > Currently we log conflicts to the server's log file and updates, this > > approach has limitations, 1) Difficult to query and analyze, parsing > > plain text log files for conflict details is inefficient. 2) Lack of > > structured data, key conflict attributes (table, operation, old/new > > data, LSN, etc.) are not readily available in a structured, queryable > > format. 3) Difficult for external monitoring tools or custom > > resolution scripts to consume conflict data directly. > > > > This proposal aims to address these limitations by introducing a > > conflict log history table, providing a structured, and queryable > > record of all logical replication conflicts. This should be a > > configurable option whether to log into the conflict log history > > table, server logs or both. > > +1 for the overall idea. Having an option to separate out the > conflicts helps analyze the data correctness issues and understand the > behavior of conflicts. > > Parsing server logs file for analysis and debugging is a typical > requirement differently met with tools like log_fdw or capture server > logs in CSV format for parsing or do text search and analyze etc. > > > This proposal has two main design questions: > > =================================== > > > > 1. How do we store conflicting tuples from different tables? > > Using a JSON column to store the row data seems like the most flexible > > solution, as it can accommodate different table schemas. > > How good is storing conflicts on the table? Is it okay to generate WAL > traffic? > Yesh, I think so. One would like to query conflicts and resolutions for those conflicts at a later point to ensure consistency. BTW, if you are worried about WAL traffic, please note conflicts shouldn't be a very often event, so additional WAL should be okay. 
OTOH, if conflicts are frequent, performance won't be great anyway, because that means there is a kind of ERROR that we have to deal with by resolving it.

> Is it okay to physically replicate this log table to all
> replicas?
>

Yes, that should be okay as we want the conflict_tables to be present after failover.

> Is it okay to logically replicate this log table to all
> subscribers and logical decoding clients?
>

I think we should avoid this.

> How does this table get
> truncated? If truncation gets delayed, won't it unnecessarily fill up
> storage?
>

I think it should be the user's responsibility to clean up this table, as they know best when its data is obsolete. Eventually, we can also have some policies, via options or some other way, to get it truncated. IIRC, we also discussed making these partitioned tables so that it is easy to discard data. However, for the initial version, we may want something simpler.

> > 2. Should this be a system table or a user table?
> > a) System Table: Storing this in a system catalog is simple, but
> > catalogs aren't designed for ever-growing data. While pg_large_object
> > is an exception, this is not what we generally do IMHO.
> > b) User Table: This offers more flexibility. We could allow a user to
> > specify the table name during CREATE SUBSCRIPTION. Then we choose to
> > either create the table internally or let the user create the table
> > with a predefined schema.
>
> -1 for the system table for sure.
>
> > A potential drawback is that a user might drop or alter the table.
> > However, we could mitigate this risk by simply logging a WARNING if
> > the table is configured but an insertion fails.
> > I am currently working on a POC patch for the same, but will post that
> > once we have some thoughts on design choices.
>
> How about streaming the conflicts in fixed format to a separate log
> file other than regular postgres server log file?
>

I would prefer this info to be stored in tables as it would be easy to query them. If we use separate LOGs then we should provide some views to query the LOG.

-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-12T10:13:21Z
On Thu, Sep 11, 2025 at 8:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 11, 2025 at 12:53 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Aug 5, 2025 at 5:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > Currently we log conflicts to the server's log file and updates, this > > > approach has limitations, 1) Difficult to query and analyze, parsing > > > plain text log files for conflict details is inefficient. 2) Lack of > > > structured data, key conflict attributes (table, operation, old/new > > > data, LSN, etc.) are not readily available in a structured, queryable > > > format. 3) Difficult for external monitoring tools or custom > > > resolution scripts to consume conflict data directly. > > > > > > This proposal aims to address these limitations by introducing a > > > conflict log history table, providing a structured, and queryable > > > record of all logical replication conflicts. This should be a > > > configurable option whether to log into the conflict log history > > > table, server logs or both. > > > > +1 for the overall idea. Having an option to separate out the > > conflicts helps analyze the data correctness issues and understand the > > behavior of conflicts. > > > > Parsing server logs file for analysis and debugging is a typical > > requirement differently met with tools like log_fdw or capture server > > logs in CSV format for parsing or do text search and analyze etc. > > > > > This proposal has two main design questions: > > > =================================== > > > > > > 1. How do we store conflicting tuples from different tables? > > > Using a JSON column to store the row data seems like the most flexible > > > solution, as it can accommodate different table schemas. > > > > How good is storing conflicts on the table? Is it okay to generate WAL > > traffic? > > > > Yesh, I think so. 
One would like to query conflicts and resolutions > for those conflicts at a later point to ensure consistency. BTW, if > you are worried about WAL traffic, please note conflicts shouldn't be > a very often event, so additional WAL should be okay. OTOH, if the > conflicts are frequent, anyway, the performance won't be that great as > that means there is a kind of ERROR which we have to deal by having > resolution for it. > > > Is it okay to physically replicate this log table to all > > replicas? > > > > Yes, that should be okay as we want the conflict_tables to be present > after failover. > > Is it okay to logically replicate this log table to all > > subscribers and logical decoding clients? > > > > I think we should avoid this. > > > How does this table get > > truncated? If truncation gets delayed, won't it unnecessarily fill up > > storage? > > > > I think it should be users responsibility to clean this table as they > better know when the data in the table is obsolete. Eventually, we can > also have some policies via options or some other way to get it > truncated. IIRC, we also discussed having these as partition tables so > that it is easy to discard data. However, for initial version, we may > want something simpler. > > > > 2. Should this be a system table or a user table? > > > a) System Table: Storing this in a system catalog is simple, but > > > catalogs aren't designed for ever-growing data. While pg_large_object > > > is an exception, this is not what we generally do IMHO. > > > b) User Table: This offers more flexibility. We could allow a user to > > > specify the table name during CREATE SUBSCRIPTION. Then we choose to > > > either create the table internally or let the user create the table > > > with a predefined schema. > > > > -1 for the system table for sure. > > > > > A potential drawback is that a user might drop or alter the table. 
> > > However, we could mitigate this risk by simply logging a WARNING if > > > the table is configured but an insertion fails. > > > I am currently working on a POC patch for the same, but will post that > > > once we have some thoughts on design choices. > > > > How about streaming the conflicts in fixed format to a separate log > > file other than regular postgres server log file? > > > > I would prefer this info to be stored in tables as it would be easy to > query them. If we use separate LOGs then we should provide some views > to query the LOG. I was looking into another thread where we provide an error table for COPY [1], it requires the user to pre-create the error table. And inside the COPY command we will validate the table, validation in that context is a one-time process checking for: (1) table existence, (2) ability to acquire a sufficient lock, (3) INSERT privileges, and (4) matching column names and data types. This approach avoids concerns about the user's DROP or ALTER permissions. Our requirement for the logical replication conflict log table differs, as we must validate the target table upon every conflict insertion, not just at subscription creation. A more robust alternative is to perform validation and acquire a lock on the conflict table whenever the subscription worker starts. This prevents modifications (like ALTER or DROP) while the worker is active. When the worker gets restarted, we can re-validate the table and automatically disable the conflict logging feature if validation fails. And this can be enabled by ALTER SUBSCRIPTION by setting the option again. 
If we want, in the first version we can expect the user to create the table as per the expected schema and supply it. This avoids having to handle how to keep it from being published, since that will be the user's responsibility. Then, in top-up patches, we can also allow creating the table internally if the table doesn't exist, and at that point find a solution to keep it from being published when ALL TABLES are published. Thoughts?

[1] https://www.postgresql.org/message-id/CACJufxEo-rsH5v__S3guUhDdXjakC7m7N5wj%3DmOB5rPiySBoQg%40mail.gmail.com

-- Regards, Dilip Kumar Google
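The validation described above (table existence, INSERT privilege, matching column names and data types, then disabling conflict logging if the check fails on worker restart) could be modelled roughly as follows. This is only a sketch in Python, not code from any patch: `EXPECTED_COLUMNS`, `validate_conflict_table` and `conflict_logging_enabled_after_restart` are hypothetical names, the column list is an assumed subset rather than the agreed schema, and lock acquisition is left out.

```python
# Hypothetical expected layout: (column name, type name) pairs in order.
# This is an assumed subset, not the schema agreed on in the thread.
EXPECTED_COLUMNS = [
    ("relid", "oid"),
    ("conflict_type", "text"),
    ("local_tuple", "json"),
    ("remote_tuple", "json"),
]


def validate_conflict_table(actual_columns, has_insert_privilege):
    """Return a list of problems; an empty list means the table is usable.

    actual_columns is None when the table does not exist, otherwise a list
    of (column name, type name) pairs as they would be read from the catalog.
    """
    if actual_columns is None:
        return ["table does not exist"]
    problems = []
    if not has_insert_privilege:
        problems.append("missing INSERT privilege")
    if len(actual_columns) != len(EXPECTED_COLUMNS):
        problems.append("column count mismatch")
    # Compare names and types position by position.
    for (exp_name, exp_type), (act_name, act_type) in zip(
        EXPECTED_COLUMNS, actual_columns
    ):
        if exp_name != act_name:
            problems.append(f"column {exp_name!r} expected, found {act_name!r}")
        elif exp_type != act_type:
            problems.append(
                f"column {exp_name!r} has type {act_type!r}, expected {exp_type!r}"
            )
    return problems


def conflict_logging_enabled_after_restart(actual_columns, has_insert_privilege):
    """Model 'disable the feature if validation fails' at worker start."""
    return not validate_conflict_table(actual_columns, has_insert_privilege)
```

Returning the full list of problems, rather than stopping at the first one, would let a WARNING name everything that needs fixing before the option is set again with ALTER SUBSCRIPTION.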
-
Re: Proposal: Conflict log history table for Logical Replication
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-13T00:44:19Z
Hi,

On Wed, Sep 10, 2025 at 8:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > How about streaming the conflicts in fixed format to a separate log
> > file other than regular postgres server log file?
>
> I would prefer this info to be stored in tables as it would be easy to
> query them. If we use separate LOGs then we should provide some views
> to query the LOG.

Providing views to query the conflicts LOG is easier than having tables (probably we must provide both: logging conflicts to tables and to separate LOG files). However, wanting the conflict logs after failovers is something that makes me think the table approach is better. I'm open to more thoughts here.

-- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2025-09-13T00:45:56Z
Hi, On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I was looking into another thread where we provide an error table for > COPY [1], it requires the user to pre-create the error table. And > inside the COPY command we will validate the table, validation in that > context is a one-time process checking for: (1) table existence, (2) > ability to acquire a sufficient lock, (3) INSERT privileges, and (4) > matching column names and data types. This approach avoids concerns > about the user's DROP or ALTER permissions. > > Our requirement for the logical replication conflict log table > differs, as we must validate the target table upon every conflict > insertion, not just at subscription creation. A more robust > alternative is to perform validation and acquire a lock on the > conflict table whenever the subscription worker starts. This prevents > modifications (like ALTER or DROP) while the worker is active. When > the worker gets restarted, we can re-validate the table and > automatically disable the conflict logging feature if validation > fails. And this can be enabled by ALTER SUBSCRIPTION by setting the > option again. Having to worry about ALTER/DROP and adding code to protect seems like an overkill. > And if we want in first version we can expect user to create the table > as per the expected schema and supply it, this will avoid the need of > handling how to avoid it from publishing as it will be user's > responsibility and then in top up patches we can also allow to create > the table internally if tables doesn't exist and then we can find out > solution to avoid it from being publish when ALL TABLES are published. This looks much more simple to start with. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-14T06:53:12Z
On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: Thanks for the feedback Bharath > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I was looking into another thread where we provide an error table for > > COPY [1], it requires the user to pre-create the error table. And > > inside the COPY command we will validate the table, validation in that > > context is a one-time process checking for: (1) table existence, (2) > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4) > > matching column names and data types. This approach avoids concerns > > about the user's DROP or ALTER permissions. > > > > Our requirement for the logical replication conflict log table > > differs, as we must validate the target table upon every conflict > > insertion, not just at subscription creation. A more robust > > alternative is to perform validation and acquire a lock on the > > conflict table whenever the subscription worker starts. This prevents > > modifications (like ALTER or DROP) while the worker is active. When > > the worker gets restarted, we can re-validate the table and > > automatically disable the conflict logging feature if validation > > fails. And this can be enabled by ALTER SUBSCRIPTION by setting the > > option again. > > Having to worry about ALTER/DROP and adding code to protect seems like > an overkill. IMHO eventually if we can control that I feel this is a good goal to have. So that we can avoid failure during conflict insertion. We may argue its user's responsibility to not alter the table and we can just check the validity during create/alter subscription. 
> > And if we want in first version we can expect user to create the table
> > as per the expected schema and supply it, this will avoid the need of
> > handling how to avoid it from publishing as it will be user's
> > responsibility and then in top up patches we can also allow to create
> > the table internally if tables doesn't exist and then we can find out
> > solution to avoid it from being publish when ALL TABLES are published.
>
> This looks much more simple to start with.

Right.

PFA the WIP patches. 0001 allows a user-created table to be supplied as the conflict history table; we validate the table during CREATE/ALTER SUBSCRIPTION. 0002 adds an option to create the table internally if it does not exist.

TODO:
- The patches are still WIP and need more work and testing for different failure cases
- Need to explore an option to create a built-in type (I will start a separate thread for the same)
- Need to add test cases
- Need to explore options to keep the table from getting published, but maybe we only need to avoid this when we create the table internally?
Here is a basic test I tried:

psql -d postgres -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "CREATE SCHEMA myschema"
psql -d postgres -p 5433 -c "CREATE TABLE test(a int, b int, primary key(a));"
psql -d postgres -p 5433 -c "GRANT INSERT, UPDATE, SELECT, DELETE ON test TO dk "
psql -d postgres -c "CREATE PUBLICATION pub FOR ALL TABLES ;"
psql -d postgres -p 5433 -c "CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=5432' PUBLICATION pub WITH(conflict_log_table=myschema.conflict_log_history)";
psql -d postgres -p 5432 -c "INSERT INTO test VALUES(1,2);"
psql -d postgres -p 5433 -c "UPDATE test SET b=10 WHERE a=1;"
psql -d postgres -p 5432 -c "UPDATE test SET b=20 WHERE a=1;"

postgres[1202034]=# select * from myschema.conflict_log_history ;
-[ RECORD 1 ]-----+------------------------------
relid             | 16385
local_xid         | 763
remote_xid        | 757
local_lsn         | 0/00000000
remote_commit_lsn | 0/0174AB30
local_commit_ts   | 2025-09-14 06:45:00.828874+00
remote_commit_ts  | 2025-09-14 06:45:05.845614+00
table_schema      | public
table_name        | test
conflict_type     | update_origin_differs
local_origin      |
remote_origin     | pg_16396
key_tuple         | {"a":1,"b":20}
local_tuple       | {"a":1,"b":10}
remote_tuple      | {"a":1,"b":20}

-- Regards, Dilip Kumar Google
-
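To illustrate why structured rows are easier to consume than server log text, here is a small Python sketch of what a monitoring script might do with the record above. The row values are copied from the sample output; `differing_columns` is a hypothetical helper, not something the patch provides, and the row is written as a plain dict rather than fetched through a database driver.

```python
import json

# Values copied from the sample conflict_log_history record in this message.
conflict_row = {
    "conflict_type": "update_origin_differs",
    "table_schema": "public",
    "table_name": "test",
    "local_tuple": '{"a":1,"b":10}',
    "remote_tuple": '{"a":1,"b":20}',
}


def differing_columns(row):
    """Map each column whose value differs to a (local, remote) pair."""
    local = json.loads(row["local_tuple"])
    remote = json.loads(row["remote_tuple"])
    return {
        col: (local.get(col), remote.get(col))
        for col in sorted(set(local) | set(remote))
        if local.get(col) != remote.get(col)
    }


print(differing_columns(conflict_row))  # {'b': (10, 20)}
```

Because the tuples are stored as JSON, the same helper works for any replicated table regardless of its column layout.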
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-18T08:33:33Z
On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > Thanks for the feedback Bharath > > > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I was looking into another thread where we provide an error table for > > > COPY [1], it requires the user to pre-create the error table. And > > > inside the COPY command we will validate the table, validation in that > > > context is a one-time process checking for: (1) table existence, (2) > > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4) > > > matching column names and data types. This approach avoids concerns > > > about the user's DROP or ALTER permissions. > > > > > > Our requirement for the logical replication conflict log table > > > differs, as we must validate the target table upon every conflict > > > insertion, not just at subscription creation. A more robust > > > alternative is to perform validation and acquire a lock on the > > > conflict table whenever the subscription worker starts. This prevents > > > modifications (like ALTER or DROP) while the worker is active. When > > > the worker gets restarted, we can re-validate the table and > > > automatically disable the conflict logging feature if validation > > > fails. And this can be enabled by ALTER SUBSCRIPTION by setting the > > > option again. > > > > Having to worry about ALTER/DROP and adding code to protect seems like > > an overkill. > > IMHO eventually if we can control that I feel this is a good goal to > have. So that we can avoid failure during conflict insertion. We may > argue its user's responsibility to not alter the table and we can just > check the validity during create/alter subscription. > If we compare conflict_history_table with the slot that gets created with subscription, one can say the same thing about slots. 
Users can drop the slots and whole replication will stop. I think this table will be created with the same privileges as the owner of a subscription which can be either a superuser or a user with the privileges of the pg_create_subscription role, so we can rely on such users. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-18T13:26:30Z
On Thu, Sep 18, 2025 at 2:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Thanks for the feedback Bharath > > > > > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > I was looking into another thread where we provide an error table for > > > > COPY [1], it requires the user to pre-create the error table. And > > > > inside the COPY command we will validate the table, validation in that > > > > context is a one-time process checking for: (1) table existence, (2) > > > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4) > > > > matching column names and data types. This approach avoids concerns > > > > about the user's DROP or ALTER permissions. > > > > > > > > Our requirement for the logical replication conflict log table > > > > differs, as we must validate the target table upon every conflict > > > > insertion, not just at subscription creation. A more robust > > > > alternative is to perform validation and acquire a lock on the > > > > conflict table whenever the subscription worker starts. This prevents > > > > modifications (like ALTER or DROP) while the worker is active. When > > > > the worker gets restarted, we can re-validate the table and > > > > automatically disable the conflict logging feature if validation > > > > fails. And this can be enabled by ALTER SUBSCRIPTION by setting the > > > > option again. > > > > > > Having to worry about ALTER/DROP and adding code to protect seems like > > > an overkill. > > > > IMHO eventually if we can control that I feel this is a good goal to > > have. So that we can avoid failure during conflict insertion. We may > > argue its user's responsibility to not alter the table and we can just > > check the validity during create/alter subscription. 
> > > > If we compare conflict_history_table with the slot that gets created > with subscription, one can say the same thing about slots. Users can > drop the slots and whole replication will stop. I think this table > will be created with the same privileges as the owner of a > subscription which can be either a superuser or a user with the > privileges of the pg_create_subscription role, so we can rely on such > users. Yeah that's a valid point. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-18T18:15:37Z
On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Sep 14, 2025 at 12:23 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sat, Sep 13, 2025 at 6:16 AM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > Thanks for the feedback Bharath > > > > > On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > I was looking into another thread where we provide an error table for > > > > COPY [1], it requires the user to pre-create the error table. And > > > > inside the COPY command we will validate the table, validation in that > > > > context is a one-time process checking for: (1) table existence, (2) > > > > ability to acquire a sufficient lock, (3) INSERT privileges, and (4) > > > > matching column names and data types. This approach avoids concerns > > > > about the user's DROP or ALTER permissions. > > > > > > > > Our requirement for the logical replication conflict log table > > > > differs, as we must validate the target table upon every conflict > > > > insertion, not just at subscription creation. A more robust > > > > alternative is to perform validation and acquire a lock on the > > > > conflict table whenever the subscription worker starts. This prevents > > > > modifications (like ALTER or DROP) while the worker is active. When > > > > the worker gets restarted, we can re-validate the table and > > > > automatically disable the conflict logging feature if validation > > > > fails. And this can be enabled by ALTER SUBSCRIPTION by setting the > > > > option again. > > > > > > Having to worry about ALTER/DROP and adding code to protect seems like > > > an overkill. > > > > IMHO eventually if we can control that I feel this is a good goal to > > have. So that we can avoid failure during conflict insertion. We may > > argue its user's responsibility to not alter the table and we can just > > check the validity during create/alter subscription. 
> > > > If we compare conflict_history_table with the slot that gets created > with subscription, one can say the same thing about slots. Users can > drop the slots and whole replication will stop. I think this table > will be created with the same privileges as the owner of a > subscription which can be either a superuser or a user with the > privileges of the pg_create_subscription role, so we can rely on such > users. We might want to consider which role inserts the conflict info into the history table. For example, if any table created by a user can be used as the history table for a subscription and the conflict info insertion is performed by the subscription owner, we would end up having the same security issue that was addressed by the run_as_owner subscription option. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-20T11:59:02Z
On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > If we compare conflict_history_table with the slot that gets created
> > with subscription, one can say the same thing about slots. Users can
> > drop the slots and whole replication will stop. I think this table
> > will be created with the same privileges as the owner of a
> > subscription which can be either a superuser or a user with the
> > privileges of the pg_create_subscription role, so we can rely on such
> > users.
>
> We might want to consider which role inserts the conflict info into
> the history table. For example, if any table created by a user can be
> used as the history table for a subscription and the conflict info
> insertion is performed by the subscription owner, we would end up
> having the same security issue that was addressed by the run_as_owner
> subscription option.
>

Yeah, I don't think we want to open that door. For user-created tables, we should perform actions with the table owner's privileges. In such a case, if one wants to create a subscription with the run_as_owner option, she should give DML permissions to the subscription owner. OTOH, if we create this table internally (via the subscription owner), then irrespective of run_as_owner, we will always insert as the subscription owner.

AFAIR, one open point for internally created tables is whether we should skip changes to the conflict_history table while replicating changes. Would the table be considered for ALL TABLES publications, if any are defined? Ideally, these should behave as catalog tables, so one option is to mark them as 'user_catalog_table'; the other option is to have some hard-coded checks during replication. The first option has the advantage that it won't write additional WAL for these tables which is otherwise required under wal_level=logical. What other options do we have?

-- With Regards, Amit Kapila.
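One possible reading of the role rule above, written as a small Python decision function. This is an interpretation for discussion, not patch code: `conflict_insert_role` and its parameters are hypothetical names, and the run_as_owner branch assumes the subscription owner has already been granted DML on the user-created table.

```python
def conflict_insert_role(table_created_internally, run_as_owner,
                         subscription_owner, table_owner):
    """Pick the role that would insert conflict rows into the history table."""
    if table_created_internally:
        # Internally created tables are created via the subscription owner,
        # so run_as_owner makes no difference here.
        return subscription_owner
    if run_as_owner:
        # User-created table with run_as_owner: the subscription owner
        # inserts, and must have been granted DML on the table.
        return subscription_owner
    # User-created table, default case: act with the table owner's privileges.
    return table_owner
```

The point of the default branch is that a table supplied by an arbitrary user is never written to with the subscription owner's privileges unless run_as_owner was chosen explicitly.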
-
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-23T17:59:12Z
On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > If we compare conflict_history_table with the slot that gets created > > > with subscription, one can say the same thing about slots. Users can > > > drop the slots and whole replication will stop. I think this table > > > will be created with the same privileges as the owner of a > > > subscription which can be either a superuser or a user with the > > > privileges of the pg_create_subscription role, so we can rely on such > > > users. > > > > We might want to consider which role inserts the conflict info into > > the history table. For example, if any table created by a user can be > > used as the history table for a subscription and the conflict info > > insertion is performed by the subscription owner, we would end up > > having the same security issue that was addressed by the run_as_owner > > subscription option. > > > > Yeah, I don't think we want to open that door. For user created > tables, we should perform actions with table_owner's privilege. In > such a case, if one wants to create a subscription with run_as_owner > option, she should give DML operation permissions to the subscription > owner. OTOH, if we create this table internally (via subscription > owner) then irrespective of run_as_owner, we will always insert as > subscription_owner. Agreed. > > AFAIR, one open point for internally created tables is whether we > should skip changes to conflict_history table while replicating > changes? The table will be considered under for ALL TABLES > publications, if defined? Ideally, these should behave as catalog > tables, so one option is to mark them as 'user_catalog_table', or the > other option is we have some hard-code checks during replication. 
The > first option has the advantage that it won't write additional WAL for > these tables which is otherwise required under wal_level=logical. What > other options do we have? I think conflict history information is subscriber local information so doesn't have to be replicated to another subscriber. Also it could be problematic in cross-major-version replication cases if we break the compatibility of history table definition. I would expect that the history table works as a catalog table in terms of logical decoding/replication. It would probably make sense to reuse the user_catalog_table option for that purpose. If we have a history table for each subscription that wants to record the conflict history (I believe so), it would be hard to go with the second option (having hard-code checks). Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-24T10:30:12Z
On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > AFAIR, one open point for internally created tables is whether we > > should skip changes to conflict_history table while replicating > > changes? The table will be considered under for ALL TABLES > > publications, if defined? Ideally, these should behave as catalog > > tables, so one option is to mark them as 'user_catalog_table', or the > > other option is we have some hard-code checks during replication. The > > first option has the advantage that it won't write additional WAL for > > these tables which is otherwise required under wal_level=logical. What > > other options do we have? > > I think conflict history information is subscriber local information > so doesn't have to be replicated to another subscriber. Also it could > be problematic in cross-major-version replication cases if we break > the compatibility of history table definition. > Right, this is another reason not to replicate it. > I would expect that the > history table works as a catalog table in terms of logical > decoding/replication. It would probably make sense to reuse the > user_catalog_table option for that purpose. If we have a history table > for each subscription that wants to record the conflict history (I > believe so), it would be hard to go with the second option (having > hard-code checks). > Agreed. Let's wait and see what Dilip or others have to say on this. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-24T11:35:52Z
On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > If we compare conflict_history_table with the slot that gets created > > > > with subscription, one can say the same thing about slots. Users can > > > > drop the slots and whole replication will stop. I think this table > > > > will be created with the same privileges as the owner of a > > > > subscription which can be either a superuser or a user with the > > > > privileges of the pg_create_subscription role, so we can rely on such > > > > users. > > > > > > We might want to consider which role inserts the conflict info into > > > the history table. For example, if any table created by a user can be > > > used as the history table for a subscription and the conflict info > > > insertion is performed by the subscription owner, we would end up > > > having the same security issue that was addressed by the run_as_owner > > > subscription option. > > > > > > > Yeah, I don't think we want to open that door. For user created > > tables, we should perform actions with table_owner's privilege. In > > such a case, if one wants to create a subscription with run_as_owner > > option, she should give DML operation permissions to the subscription > > owner. OTOH, if we create this table internally (via subscription > > owner) then irrespective of run_as_owner, we will always insert as > > subscription_owner. > > Agreed. Yeah that makes sense to me as well. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-24T11:40:17Z
On Wed, Sep 24, 2025 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > AFAIR, one open point for internally created tables is whether we > > > should skip changes to conflict_history table while replicating > > > changes? The table will be considered under for ALL TABLES > > > publications, if defined? Ideally, these should behave as catalog > > > tables, so one option is to mark them as 'user_catalog_table', or the > > > other option is we have some hard-code checks during replication. The > > > first option has the advantage that it won't write additional WAL for > > > these tables which is otherwise required under wal_level=logical. What > > > other options do we have? > > > > I think conflict history information is subscriber local information > > so doesn't have to be replicated to another subscriber. Also it could > > be problematic in cross-major-version replication cases if we break > > the compatibility of history table definition. > > > > Right, this is another reason not to replicate it. > > > I would expect that the > > history table works as a catalog table in terms of logical > > decoding/replication. It would probably make sense to reuse the > > user_catalog_table option for that purpose. If we have a history table > > for each subscription that wants to record the conflict history (I > > believe so), it would be hard to go with the second option (having > > hard-code checks). > > > > Agreed. Let's wait and see what Dilip or others have to say on this. Yeah I think this makes sense to create as 'user_catalog_table' tables when we internally create them. However, IMHO when a user provides its own table, I believe we should not enforce the restriction for that table to be created as a 'user_catalog_table' table, or do you think we should enforce that property? 
-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-09-24T18:35:49Z
On Wed, Sep 24, 2025 at 4:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Sep 24, 2025 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Sep 23, 2025 at 11:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Sat, Sep 20, 2025 at 4:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > AFAIR, one open point for internally created tables is whether we > > > > should skip changes to conflict_history table while replicating > > > > changes? The table will be considered under for ALL TABLES > > > > publications, if defined? Ideally, these should behave as catalog > > > > tables, so one option is to mark them as 'user_catalog_table', or the > > > > other option is we have some hard-code checks during replication. The > > > > first option has the advantage that it won't write additional WAL for > > > > these tables which is otherwise required under wal_level=logical. What > > > > other options do we have? > > > > > > I think conflict history information is subscriber local information > > > so doesn't have to be replicated to another subscriber. Also it could > > > be problematic in cross-major-version replication cases if we break > > > the compatibility of history table definition. > > > > > > > Right, this is another reason not to replicate it. > > > > > I would expect that the > > > history table works as a catalog table in terms of logical > > > decoding/replication. It would probably make sense to reuse the > > > user_catalog_table option for that purpose. If we have a history table > > > for each subscription that wants to record the conflict history (I > > > believe so), it would be hard to go with the second option (having > > > hard-code checks). > > > > > > > Agreed. Let's wait and see what Dilip or others have to say on this. > > Yeah I think this makes sense to create as 'user_catalog_table' tables > when we internally create them. 
However, IMHO when a user provides > its own table, I believe we should not enforce the restriction for > that table to be created as a 'user_catalog_table' table, or do you > think we should enforce that property? I find that's a user's responsibility, so I would not enforce that property for user-provided-tables. BTW what is the main use case for supporting the use of user-provided tables for the history table? I think we basically don't want the history table to be updated by any other processes than apply workers, so it would make more sense that such a table is created internally and tied to the subscription. I'm less convinced that it has enough upside to warrant the complexity. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T05:39:59Z
On Sat, Sep 20, 2025 at 5:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > If we compare conflict_history_table with the slot that gets created > > > with subscription, one can say the same thing about slots. Users can > > > drop the slots and whole replication will stop. I think this table > > > will be created with the same privileges as the owner of a > > > subscription which can be either a superuser or a user with the > > > privileges of the pg_create_subscription role, so we can rely on such > > > users. > > > > We might want to consider which role inserts the conflict info into > > the history table. For example, if any table created by a user can be > > used as the history table for a subscription and the conflict info > > insertion is performed by the subscription owner, we would end up > > having the same security issue that was addressed by the run_as_owner > > subscription option. > > > > Yeah, I don't think we want to open that door. For user created > tables, we should perform actions with table_owner's privilege. In > such a case, if one wants to create a subscription with run_as_owner > option, she should give DML operation permissions to the subscription > owner. OTOH, if we create this table internally (via subscription > owner) then irrespective of run_as_owner, we will always insert as > subscription_owner. > > AFAIR, one open point for internally created tables is whether we > should skip changes to conflict_history table while replicating > changes? The table will be considered under for ALL TABLES > publications, if defined? Ideally, these should behave as catalog > tables, so one option is to mark them as 'user_catalog_table', or the > other option is we have some hard-code checks during replication. 
The > first option has the advantage that it won't write additional WAL for > these tables which is otherwise required under wal_level=logical. What > other options do we have? I did more analysis and testing of 'user_catalog_table'. What I found is that when a table is marked as 'user_catalog_table', extra information is logged for it, i.e. the CID [1], so that such tables can also be scanned during decoding using a historical snapshot, like catalog tables. I have also checked the code and tested that 'user_catalog_table' tables do get streamed with FOR ALL TABLES publications. Am I missing something, or are we thinking of changing the behavior of user_catalog_table so that such tables do not get decoded? I think that would change the existing behaviour, so it might not be a good option. Yet another idea is to invent a new option for this purpose, called 'conflict_history_purpose', but IMHO this use case alone may not justify a new option.
[1]
    /*
     * For logical decode we need combo CIDs to properly decode the
     * catalog
     */
    if (RelationIsAccessibleInLogicalDecoding(relation))
        log_heap_new_cid(relation, &tp);
-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T06:23:45Z
On Thu, Sep 25, 2025 at 11:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Sep 20, 2025 at 5:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 11:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Thu, Sep 18, 2025 at 1:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > If we compare conflict_history_table with the slot that gets created > > > > with subscription, one can say the same thing about slots. Users can > > > > drop the slots and whole replication will stop. I think this table > > > > will be created with the same privileges as the owner of a > > > > subscription which can be either a superuser or a user with the > > > > privileges of the pg_create_subscription role, so we can rely on such > > > > users. > > > > > > We might want to consider which role inserts the conflict info into > > > the history table. For example, if any table created by a user can be > > > used as the history table for a subscription and the conflict info > > > insertion is performed by the subscription owner, we would end up > > > having the same security issue that was addressed by the run_as_owner > > > subscription option. > > > > > > > Yeah, I don't think we want to open that door. For user created > > tables, we should perform actions with table_owner's privilege. In > > such a case, if one wants to create a subscription with run_as_owner > > option, she should give DML operation permissions to the subscription > > owner. OTOH, if we create this table internally (via subscription > > owner) then irrespective of run_as_owner, we will always insert as > > subscription_owner. > > > > AFAIR, one open point for internally created tables is whether we > > should skip changes to conflict_history table while replicating > > changes? The table will be considered under for ALL TABLES > > publications, if defined? 
Ideally, these should behave as catalog > > tables, so one option is to mark them as 'user_catalog_table', or the > > other option is we have some hard-code checks during replication. The > > first option has the advantage that it won't write additional WAL for > > these tables which is otherwise required under wal_level=logical. What > > other options do we have? > > I was doing more analysis and testing for 'use_catalog_table', so what > I found is when a table is marked as 'use_catalog_table', it will log > extra information i.e. CID[1] so that these tables can be used for > scanning as well during decoding like catalog tables using historical > snapshot. And I have checked the code and tested as well > 'use_catalog_table' does get streamed with ALL TABLE options. Am I > missing something or are we thinking of changing the behavior of > use_catalog_table so that they do not get decoded, but I think that > will change the existing behaviour so might not be a good option, yet > another idea is to invent some other option for which purpose called > 'conflict_history_purpose' but maybe that doesn't justify the purpose > of the new option IMHO. > > [1] > /* > * For logical decode we need combo CIDs to properly decode the > * catalog > */ > if (RelationIsAccessibleInLogicalDecoding(relation)) > log_heap_new_cid(relation, &tp); > Meanwhile I am also exploring the option where we can just CREATE TYPE in initialize_data_directory() during initdb, basically we will create this type in template1 so that it will be available in all the databases, and that would simplify the table creation whether we create internally or we allow user to create it. And while checking is_publishable_class we can check the type and avoid publishing those tables. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-25T10:49:33Z
On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > [1] > > /* > > * For logical decode we need combo CIDs to properly decode the > > * catalog > > */ > > if (RelationIsAccessibleInLogicalDecoding(relation)) > > log_heap_new_cid(relation, &tp); > > > > Meanwhile I am also exploring the option where we can just CREATE TYPE > in initialize_data_directory() during initdb, basically we will create > this type in template1 so that it will be available in all the > databases, and that would simplify the table creation whether we > create internally or we allow user to create it. And while checking > is_publishable_class we can check the type and avoid publishing those > tables. > Based on my off-list discussion with Amit, one option could be to set the HEAP_INSERT_NO_LOGICAL option while inserting tuples into the conflict history table. For that we cannot use the SPI interface to insert; instead we will have to call heap_insert() directly to pass this option. Since we do not want to create any triggers etc. on this table, a direct insert should be fine, but if we plan to make this table a partitioned table in the future, then a direct heap insert might not work. -- Regards, Dilip Kumar Google
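To illustrate the idea of flagging inserts so that logical decoding skips them, here is a toy Python model. It does not call any PostgreSQL API: `WalRecord`, `no_logical`, `heap_insert_model` and `decode` are hypothetical stand-ins, the table names and row contents are made up, and the real HEAP_INSERT_NO_LOGICAL flag is a C-level option passed to heap_insert().

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WalRecord:
    table: str
    row: dict
    no_logical: bool = False   # stand-in for the HEAP_INSERT_NO_LOGICAL option


def heap_insert_model(wal: List[WalRecord], table: str, row: dict,
                      no_logical: bool = False) -> None:
    """Append an insert record to the toy WAL (written with or without the flag)."""
    wal.append(WalRecord(table, row, no_logical))


def decode(wal: List[WalRecord]) -> List[WalRecord]:
    """Toy decoder: pass on only the records that are not flagged no_logical."""
    return [rec for rec in wal if not rec.no_logical]


wal: List[WalRecord] = []
heap_insert_model(wal, "orders", {"id": 1})
heap_insert_model(wal, "my_conflict_table", {"conflict": "example"},
                  no_logical=True)

# Names of the tables whose inserts reach the decoder's output.
decoded_tables = [rec.table for rec in decode(wal)]
```

In this model the flag only affects what the decoder emits. The insert into the conflict table is still recorded, and nothing here changes which tables a publication lists.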
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-26T11:12:11Z
On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > [1] > > > /* > > > * For logical decode we need combo CIDs to properly decode the > > > * catalog > > > */ > > > if (RelationIsAccessibleInLogicalDecoding(relation)) > > > log_heap_new_cid(relation, &tp); > > > > > > > Meanwhile I am also exploring the option where we can just CREATE TYPE > > in initialize_data_directory() during initdb, basically we will create > > this type in template1 so that it will be available in all the > > databases, and that would simplify the table creation whether we > > create internally or we allow user to create it. And while checking > > is_publishable_class we can check the type and avoid publishing those > > tables. > > > > Based on my off list discussion with Amit, one option could be to set > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict > history table, for that we can not use SPI interface to insert instead > we will have to directly call the heap_insert() to add this option. > Since we do not want to create any trigger etc on this table, direct > insert should be fine, but if we plan to create this table as > partitioned table in future then direct heap insert might not work. Upon further reflection, I realized that while this approach avoids streaming inserts to the conflict log history table, it still requires that table to exist on the subscriber node upon subscription creation, which isn't ideal. We have two main options to address this: Option1: When calling pg_get_publication_tables(), if the 'alltables' option is used, we can scan all subscriptions and explicitly ignore (filter out) all conflict history tables. This will not be very costly as this will scan the subscriber when pg_get_publication_tables() is called, which is only called during create subscription/alter subscription on the remote node. 
Option2: Alternatively, we could introduce a table creation option, like a 'non-publishable' flag, to prevent a table from being streamed entirely. I believe this would be a valuable, independent feature for users who want to create certain tables without including them in logical replication. I prefer option2, as I feel this can add value independent of this patch. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-27T15:23:28Z
On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > [1] > > > > /* > > > > * For logical decode we need combo CIDs to properly decode the > > > > * catalog > > > > */ > > > > if (RelationIsAccessibleInLogicalDecoding(relation)) > > > > log_heap_new_cid(relation, &tp); > > > > > > > > > > Meanwhile I am also exploring the option where we can just CREATE TYPE > > > in initialize_data_directory() during initdb, basically we will create > > > this type in template1 so that it will be available in all the > > > databases, and that would simplify the table creation whether we > > > create internally or we allow user to create it. And while checking > > > is_publishable_class we can check the type and avoid publishing those > > > tables. > > > > > > > Based on my off list discussion with Amit, one option could be to set > > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict > > history table, for that we can not use SPI interface to insert instead > > we will have to directly call the heap_insert() to add this option. > > Since we do not want to create any trigger etc on this table, direct > > insert should be fine, but if we plan to create this table as > > partitioned table in future then direct heap insert might not work. > > Upon further reflection, I realized that while this approach avoids > streaming inserts to the conflict log history table, it still requires > that table to exist on the subscriber node upon subscription creation, > which isn't ideal. > I am not able to understand what exact problem you are seeing here. I was thinking that during the CREATE SUBSCRIPTION command, a new table with user provided name will be created similar to how we create a slot. 
The difference would be that we create a slot on the remote/publisher node but this table will be created locally. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-27T15:54:15Z
On Sat, Sep 27, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I am not able to understand what exact problem you are seeing here. I > was thinking that during the CREATE SUBSCRIPTION command, a new table > with user provided name will be created similar to how we create a > slot. The difference would be that we create a slot on the > remote/publisher node but this table will be created locally. > That's not the issue. The problem we are discussing here is that the conflict history table, which is created on the subscriber node, should not be published when this subscriber node creates a publication with the FOR ALL TABLES option. We found an option of inserting into this table with the HEAP_INSERT_NO_LOGICAL flag so that those inserts will not be decoded. But what about another node subscribing to this publication? It would need to have this table, because when ALL TABLES are published the subscriber node expects all user tables to be present there, even if their changes are not published. Consider the example below:

Node1:
CREATE PUBLICATION pub_node1..

Node2:
CREATE SUBSCRIPTION sub.. PUBLICATION pub_node1 WITH(conflict_history_table='my_conflict_table');
CREATE PUBLICATION pub_node2 FOR ALL TABLES;

Node3:
CREATE SUBSCRIPTION sub1.. PUBLICATION pub_node2; --this will expect 'my_conflict_table' to exist here, because when it calls pg_get_publication_tables() on Node2 it will also get 'my_conflict_table' along with the other user tables.

As a solution, I want this table to be left out when pg_get_publication_tables() is called. Option1: If the table is listed as the conflict history table of any subscription on Node2, we ignore it. Option2: Provide a new table option to mark a table as non-publishable when the ALL TABLES option is used; I think this option can be useful independently as well. -- Regards, Dilip Kumar Google
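Option1 from the message above can be sketched as a filter over the table list that a FOR ALL TABLES publication would report. This is a toy Python model under stated assumptions, not the patch itself: `conflict_history_tables` and `publication_tables` are hypothetical helpers, subscriptions are modelled as a plain mapping from subscription name to conflict history table name, and `orders` and `customers` are made-up user tables on Node2.

```python
from typing import Dict, List, Set


def conflict_history_tables(subscriptions: Dict[str, str]) -> Set[str]:
    """Collect the conflict history table named by each local subscription."""
    return set(subscriptions.values())


def publication_tables(user_tables: List[str],
                       subscriptions: Dict[str, str]) -> List[str]:
    """Model of a FOR ALL TABLES table list with the Option1 filter applied."""
    skip = conflict_history_tables(subscriptions)
    return [t for t in user_tables if t not in skip]


# Node2 from the example: one subscription with a conflict history table.
node2_tables = ["orders", "customers", "my_conflict_table"]
node2_subscriptions = {"sub": "my_conflict_table"}

# What Node3 would see when it asks Node2 for the published tables.
published = publication_tables(node2_tables, node2_subscriptions)
```

With the filter in place, Node3 no longer sees `my_conflict_table` in the list and so does not need a local copy of it. Without any subscription registering a conflict history table, the list is returned unchanged.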
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-09-27T21:13:39Z
On Sat, Sep 27, 2025 at 9:24 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Sep 27, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > I am not able to understand what exact problem you are seeing here. I > > was thinking that during the CREATE SUBSCRIPTION command, a new table > > with user provided name will be created similar to how we create a > > slot. The difference would be that we create a slot on the > > remote/publisher node but this table will be created locally. > > > That's not an issue, the problem here we are discussing is the > conflict history table which is created on the subscriber node should > not be published when this node subscription node create another > publisher with ALL TABLE option. So we found a option for inserting > into this table with HEAP_INSERT_NO_LOGICAL flag so that those insert > will not be decoded, but what about another not subscribing from this > publisher, they should have this table because when ALL TABLES are > published subscriber node expect all user table to present there even > if its changes are not published. Consider below example > > Node1: > CREATE PUBLICATION pub_node1.. > > Node2: > CREATE SUBSCRIPTION sub.. PUBLICATION pub_node1 > WITH(conflict_history_table='my_conflict_table'); > CREATE PUBLICATION pub_node2 FOR ALL TABLE; > > Node3: > CREATE SUBSCRIPTION sub1.. PUBLICATION pub_node2; --this will expect > 'my_conflict_table' to exist here because when it will call > pg_get_publication_tables() from Node2 it will also get the > 'my_conflict_table' along with other user tables. > > And as a solution I wanted to avoid this table to be avoided when > pg_get_publication_tables() is being called. > Option1: We can see if table name is listed as conflict history table > in any of the subscribers on Node2 we will ignore this. 
> Option2: Provide a new table option to mark table as non publishable > table when ALL TABLE option is provided, I think this option can be > useful independently as well. > I agree that option-2 is useful and IIUC, we are already working on something similar in thread [1]. However, it is better to use option-1 here because we are using a non-user-specified mechanism to skip changes during replication, so following the same at other times is preferable. Once we have that other feature [1], we can probably optimize this code to use it without taking input from the user. The other reason for not going with option-2 in the way you are proposing is that it doesn't seem like a good idea to have multiple ways to specify skipping tables from publishing. I find the approach being discussed in thread [1] more generic and better than a new table-level option. [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-28T11:45:41Z
On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I agree that option-2 is useful and IIUC, we are already working on > something similar in thread [1]. However, it is better to use option-1 > here because we are using non-user specified mechanism to skip changes > during replication, so following the same during other times is > preferable. Once we have that other feature [1], we can probably > optimize this code to use it without taking input from the user. The > other reason of not going with the option-2 in the way you are > proposing is that it doesn't seem like a good idea to have multiple > ways to specify skipping tables from publishing. I find the approach > being discussed in thread [1] a generic and better than a new > table-level option. > > [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com I understand the current discussion revolves around using an EXCEPT clause (for tables/schemas/columns) during publication creation. But what we want is to mark some table which will be excluded permanently from publication, because we can not expect users to explicitly exclude them while creating publication. So, I propose we add a "non-publishable" property to tables themselves. This is a more valuable option for users who are certain that certain tables should never be replicated. By marking a table as non-publishable, we save users the effort of repeatedly listing it in the EXCEPT option for every new publication. Both methods have merit, but the proposed table property addresses the need for a permanent, system-wide exclusion. See below test with a quick hack, what I am referring to. 
postgres[2730657]=# CREATE TABLE test(a int) WITH (NON_PUBLISHABLE_TABLE = true);
CREATE TABLE
postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
CREATE PUBLICATION
postgres[2730657]=# select pg_get_publication_tables('pub');
 pg_get_publication_tables
---------------------------
(0 rows)

But I agree this is an additional table option which might need consensus, so meanwhile we can proceed with option-1. I will prepare patches with option-1 and propose option-2 as an add-on patch. The option-2 patch can be discussed in a separate thread as well. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-09-29T09:57:23Z
On Sun, Sep 28, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > I agree that option-2 is useful and IIUC, we are already working on > > something similar in thread [1]. However, it is better to use option-1 > > here because we are using non-user specified mechanism to skip changes > > during replication, so following the same during other times is > > preferable. Once we have that other feature [1], we can probably > > optimize this code to use it without taking input from the user. The > > other reason of not going with the option-2 in the way you are > > proposing is that it doesn't seem like a good idea to have multiple > > ways to specify skipping tables from publishing. I find the approach > > being discussed in thread [1] a generic and better than a new > > table-level option. > > > > [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com > > I understand the current discussion revolves around using an EXCEPT > clause (for tables/schemas/columns) during publication creation. But > what we want is to mark some table which will be excluded permanently > from publication, because we can not expect users to explicitly > exclude them while creating publication. > > So, I propose we add a "non-publishable" property to tables > themselves. This is a more valuable option for users who are certain > that certain tables should never be replicated. > > By marking a table as non-publishable, we save users the effort of > repeatedly listing it in the EXCEPT option for every new publication. > Both methods have merit, but the proposed table property addresses the > need for a permanent, system-wide exclusion. > > See below test with a quick hack, what I am referring to. 
>
> postgres[2730657]=# CREATE TABLE test(a int) WITH (NON_PUBLISHABLE_TABLE = true);
> CREATE TABLE
> postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
> CREATE PUBLICATION
> postgres[2730657]=# select pg_get_publication_tables('pub');
>  pg_get_publication_tables
> ---------------------------
> (0 rows)
>
> But I agree this is an additional table option which might need consensus, so meanwhile we can proceed with option2, I will prepare patches with option-2 and as a add on patch I will propose option-1. And this option-1 patch can be discussed in a separate thread as well.

So here is the patch set using option-2. With this, when the alltables option is used and pg_get_publication_tables() is called, the relid of each table is checked against the conflict history tables of the subscriptions, and those tables are not added to the list. I will start a separate thread to propose the patch I sent in the previous email.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-11T10:18:55Z
On Mon, Sep 29, 2025 at 3:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, Sep 28, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sun, Sep 28, 2025 at 2:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > I agree that option-2 is useful and IIUC, we are already working on > > > something similar in thread [1]. However, it is better to use option-1 > > > here because we are using non-user specified mechanism to skip changes > > > during replication, so following the same during other times is > > > preferable. Once we have that other feature [1], we can probably > > > optimize this code to use it without taking input from the user. The > > > other reason of not going with the option-2 in the way you are > > > proposing is that it doesn't seem like a good idea to have multiple > > > ways to specify skipping tables from publishing. I find the approach > > > being discussed in thread [1] a generic and better than a new > > > table-level option. > > > > > > [1] - https://www.postgresql.org/message-id/CANhcyEVt2CBnG7MOktaPPV4rYapHR-VHe5%3DqoziTZh1L9SVc6w%40mail.gmail.com > > > > I understand the current discussion revolves around using an EXCEPT > > clause (for tables/schemas/columns) during publication creation. But > > what we want is to mark some table which will be excluded permanently > > from publication, because we can not expect users to explicitly > > exclude them while creating publication. > > > > So, I propose we add a "non-publishable" property to tables > > themselves. This is a more valuable option for users who are certain > > that certain tables should never be replicated. > > > > By marking a table as non-publishable, we save users the effort of > > repeatedly listing it in the EXCEPT option for every new publication. > > Both methods have merit, but the proposed table property addresses the > > need for a permanent, system-wide exclusion. > > > > See below test with a quick hack, what I am referring to. 
> >
> > postgres[2730657]=# CREATE TABLE test(a int) WITH (NON_PUBLISHABLE_TABLE = true);
> > CREATE TABLE
> > postgres[2730657]=# CREATE PUBLICATION pub FOR ALL TABLES ;
> > CREATE PUBLICATION
> > postgres[2730657]=# select pg_get_publication_tables('pub');
> >  pg_get_publication_tables
> > ---------------------------
> > (0 rows)
> >
> > But I agree this is an additional table option which might need consensus, so meanwhile we can proceed with option2, I will prepare patches with option-2 and as a add on patch I will propose option-1. And this option-1 patch can be discussed in a separate thread as well.
>
> So here is the patch set using option-2, with this when alltable option is used and we get pg_get_publication_tables(), this will check the relid against the conflict history tables in the subscribers and those tables will not be added to the list. I will start a separate thread for proposing the patch I sent in previous email.

I have started going through this thread. Is it possible to rebase the patches and post them?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-11T10:26:56Z
On Tue, Nov 11, 2025 at 3:49 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Sep 29, 2025 at 3:27 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have started going through this thread. Is it possible to rebase the patches and post?

Thanks Shveta, I will post the rebased patch by tomorrow.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-12T06:50:55Z
On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > [1] > > > > /* > > > > * For logical decode we need combo CIDs to properly decode the > > > > * catalog > > > > */ > > > > if (RelationIsAccessibleInLogicalDecoding(relation)) > > > > log_heap_new_cid(relation, &tp); > > > > > > > > > > Meanwhile I am also exploring the option where we can just CREATE TYPE > > > in initialize_data_directory() during initdb, basically we will create > > > this type in template1 so that it will be available in all the > > > databases, and that would simplify the table creation whether we > > > create internally or we allow user to create it. And while checking > > > is_publishable_class we can check the type and avoid publishing those > > > tables. > > > > > > > Based on my off list discussion with Amit, one option could be to set > > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict > > history table, for that we can not use SPI interface to insert instead > > we will have to directly call the heap_insert() to add this option. > > Since we do not want to create any trigger etc on this table, direct > > insert should be fine, but if we plan to create this table as > > partitioned table in future then direct heap insert might not work. > > Upon further reflection, I realized that while this approach avoids > streaming inserts to the conflict log history table, it still requires > that table to exist on the subscriber node upon subscription creation, > which isn't ideal. > > We have two main options to address this: > > Option1: > When calling pg_get_publication_tables(), if the 'alltables' option is > used, we can scan all subscriptions and explicitly ignore (filter out) > all conflict history tables. 
This will not be very costly as this will scan the subscriber when pg_get_publication_tables() is called, which is only called during create subscription/alter subscription on the remote node.
>
> Option2:
> Alternatively, we could introduce a table creation option, like a 'non-publishable' flag, to prevent a table from being streamed entirely. I believe this would be a valuable, independent feature for users who want to create certain tables without including them in logical replication.
>
> I prefer option2, as I feel this can add value independent of this patch.

I agree that marking tables with a flag to easily exclude them during publishing would be cleaner. In the current patch, for an ALL-TABLES publication, we scan pg_subscription for each table in pg_class to check its subconflicttable and decide whether to ignore it. But since this only happens during create/alter subscription and refresh publication, the overhead should be acceptable.

Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good enhancement, but since we already have the EXCEPT list built in a separate thread, that might be sufficient for now. IMO, such conflict-tables should be marked internally (for example, with a ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily identified within the system, without requiring users to explicitly specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to see what others think on this.

For the time being, the current implementation looks fine, considering it runs only during a few publication-related DDL operations.

thanks
Shveta
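A minimal sketch of the ALL TABLES filtering described above, written as self-contained C rather than actual PostgreSQL code. The `Subscription` struct and the helpers `is_conflict_log_table` and `filter_publishable` are hypothetical stand-ins; only the `subconflicttable` name comes from the message above, and real code would read pg_subscription through the catalog APIs instead of an in-memory array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;           /* stand-in for PostgreSQL's Oid type */
#define InvalidOid ((Oid) 0)

/* Hypothetical in-memory view of one pg_subscription row. */
typedef struct Subscription
{
    const char *subname;
    Oid         subconflicttable;   /* InvalidOid when no conflict log table */
} Subscription;

/* Return true if relid is the conflict log table of any subscription. */
bool
is_conflict_log_table(Oid relid, const Subscription *subs, size_t nsubs)
{
    assert(subs != NULL || nsubs == 0);

    for (size_t i = 0; i < nsubs; i++)
    {
        if (subs[i].subconflicttable != InvalidOid &&
            subs[i].subconflicttable == relid)
            return true;
    }
    return false;
}

/*
 * Copy into 'out' every relid from 'rels' that is not a conflict log
 * table, preserving order, and return how many were written. This is
 * the "for each table, scan the subscriptions" loop: its cost grows
 * with (number of tables) * (number of subscriptions).
 */
size_t
filter_publishable(const Oid *rels, size_t nrels,
                   const Subscription *subs, size_t nsubs,
                   Oid *out)
{
    size_t      n = 0;

    for (size_t i = 0; i < nrels; i++)
    {
        if (!is_conflict_log_table(rels[i], subs, nsubs))
            out[n++] = rels[i];
    }
    return n;
}
```

Because the nested scan runs once per call and not once per replicated change, its cost stays tied to the few DDL operations that build the table list, which is the argument made above for accepting it.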
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-12T09:10:28Z
On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I agree that marking tables with a flag to easily exclude them during publishing would be cleaner. In the current patch, for an ALL-TABLES publication, we scan pg_subscription for each table in pg_class to check its subconflicttable and decide whether to ignore it. But since this only happens during create/alter subscription and refresh publication, the overhead should be acceptable.

Thanks for your opinion.

> Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good enhancement but since we already have the EXCEPT list built in a separate thread, that might be sufficient for now. IMO, such conflict-tables should be marked internally (for example, with a ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily identified within the system, without requiring users to explicitly specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to see what others think on this.
> For the time being, the current implementation looks fine, considering it runs only during a few publication-related DDL operations.

+1

Here is the rebased patch. Changes apart from rebasing:
1) Dropped the conflict history table during drop subscription.
2) Added test cases for testing the conflict history table behavior with CREATE/ALTER/DROP subscription.

TODO:
1) The table schema needs more thought: whether we need to capture more items, or whether we should drop some fields that we think are not necessary.
2) A logical replication test that generates a conflict and captures it in the conflict history table.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-12T09:44:06Z
On Wed, Nov 12, 2025 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I agree that marking tables with a flag to easily exclude them during publishing would be cleaner. In the current patch, for an ALL-TABLES publication, we scan pg_subscription for each table in pg_class to check its subconflicttable and decide whether to ignore it. But since this only happens during create/alter subscription and refresh publication, the overhead should be acceptable.
>
> Thanks for your opinion.
>
> > Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good enhancement but since we already have the EXCEPT list built in a separate thread, that might be sufficient for now. IMO, such conflict-tables should be marked internally (for example, with a ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily identified within the system, without requiring users to explicitly specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to see what others think on this.
> > For the time being, the current implementation looks fine, considering it runs only during a few publication-related DDL operations.
>
> +1
>
> Here is the rebased patch, changes apart from rebasing it
> 1) Dropped the conflict history table during drop subscription
> 2) Added test cases for testing the conflict history table behavior with CREATE/ALTER/DROP subscription

Thanks.

> TODO:
> 1) Need more thoughts on the table schema whether we need to capture more items or shall we drop some fields if we think those are not necessary.

Yes, this needs some more thought. I will review.

Since the design is somewhat agreed upon, I feel we can now handle code correction/completion. I have not looked at the rebased patch yet, but here are a few comments based on the old version.

Few observations related to publication.
------------------------------
(In the below comments, clt/CLT implies Conflict Log Table)

1) 'select pg_relation_is_publishable(clt)' returns true for the conflict-log table.

2) '\d+ clt' shows the all-tables publication name. I feel we should not show that for clt.

3) I am able to create a publication for the clt table. Should it be allowed?

create subscription sub1 connection '...' publication pub1 WITH(conflict_log_table='clt');
create publication pub3 for table clt;

4) Is there a reason we have not made the '!IsConflictHistoryRelid' check part of is_publishable_class() itself? If we do so, other code paths will also always see clt as non-publishable (which will solve a few of the above issues, I think). IIUC, there is no place where we want to mark CLT as publishable, or is there any?

5) Also, I feel we can add some documentation now to help others understand/review the patch better without going through the long thread.

Few observations related to conflict-logging:
------------------------------
1) I found that for the conflicts which ultimately result in an ERROR, we do not insert any conflict record in clt.

a) Example: insert_exists, update_exists

create table tab1 (i int primary key, j int);
sub: insert into tab1 values(30,10);
pub: insert into tab1 values(30,10);
ERROR: conflict detected on relation "public.tab1": conflict=insert_exists
No record in clt.

sub: <some pre-data needed>
update tab1 set i=40 where i = 30;
pub: update tab1 set i=40 where i = 20;
ERROR: conflict detected on relation "public.tab1": conflict=update_exists
No record in clt.

b) Another question related to this: these conflicts (which result in an error) keep on happening until the user resolves them, skips them, or 'disable_on_error' is set. Are we then going to insert them multiple times? We do count these in 'confl_insert_exists' and 'confl_update_exists' every time, so it makes sense to log them each time in clt as well. Thoughts?

2) For conflicts where the row on the sub is missing, local_ts is incorrectly inserted. It is '2000-01-01 05:30:00+05:30'. Should it be Null or something indicating that it is not applicable for this conflict-type?

Example: delete_missing, update_missing
pub:
insert into tab1 values(10,10);
insert into tab1 values(20,10);
sub: delete from tab1 where i=10;
pub: delete from tab1 where i=10;

thanks
Shveta
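The reported value '2000-01-01 05:30:00+05:30' is what a zero timestamptz prints as in a +05:30 session, because PostgreSQL stores timestamptz as microseconds since 2000-01-01 00:00:00 UTC. That suggests, as an inference rather than something stated in the report, that local_ts was left at zero instead of being marked NULL. A small C sketch of the arithmetic follows; the function names are hypothetical and only the epoch offset reflects PostgreSQL's storage format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Seconds between the Unix epoch (1970-01-01) and the PostgreSQL epoch
 * (2000-01-01): 10957 days * 86400 seconds per day.
 */
#define PG_EPOCH_OFFSET_SECS INT64_C(946684800)
#define USECS_PER_SEC        INT64_C(1000000)

/*
 * Convert a timestamptz value (microseconds since 2000-01-01 UTC) to
 * whole seconds since the Unix epoch. The sub-second part is dropped.
 */
int64_t
pg_timestamptz_to_unix(int64_t pg_usecs)
{
    return pg_usecs / USECS_PER_SEC + PG_EPOCH_OFFSET_SECS;
}

/*
 * Format a timestamptz at a fixed UTC offset given in minutes (330 for
 * +05:30) as "YYYY-MM-DD HH:MM:SS+HH:MM". Only non-negative offsets are
 * handled in this sketch.
 */
void
format_pg_timestamptz(int64_t pg_usecs, int offset_minutes,
                      char *buf, size_t buflen)
{
    time_t      shifted;
    struct tm  *tm;
    char        datetime[32];

    assert(offset_minutes >= 0);

    /* Shift to local wall-clock time, then break it down as if it were UTC. */
    shifted = (time_t) (pg_timestamptz_to_unix(pg_usecs) +
                        (int64_t) offset_minutes * 60);
    tm = gmtime(&shifted);
    assert(tm != NULL);

    strftime(datetime, sizeof(datetime), "%Y-%m-%d %H:%M:%S", tm);
    snprintf(buf, buflen, "%s+%02d:%02d", datetime,
             offset_minutes / 60, offset_minutes % 60);
}

/* Return true if the value would be displayed exactly as 'expected'. */
bool
displays_as(int64_t pg_usecs, int offset_minutes, const char *expected)
{
    char        buf[64];

    format_pg_timestamptz(pg_usecs, offset_minutes, buf, sizeof(buf));
    return strcmp(buf, expected) == 0;
}
```

Calling `displays_as(0, 330, "2000-01-01 05:30:00+05:30")` returns true, which is consistent with a zero value reaching the table where a NULL was intended.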
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-13T09:09:02Z
On Wed, Nov 12, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Nov 12, 2025 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Nov 12, 2025 at 12:21 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > I agree that marking tables with a flag to easily exclude them during > > > publishing would be cleaner. In the current patch, for an ALL-TABLES > > > publication, we scan pg_subscription for each table in pg_class to > > > check its subconflicttable and decide whether to ignore it. But since > > > this only happens during create/alter subscription and refresh > > > publication, the overhead should be acceptable. > > > > Thanks for your opinion. > > > > > Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good > > > enhancement but since we already have the EXCEPT list built in a > > > separate thread, that might be sufficient for now. IMO, such > > > conflict-tables should be marked internally (for example, with a > > > ‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily > > > identified within the system, without requiring users to explicitly > > > specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to > > > see what others think on this. > > > For the time being, the current implementation looks fine, considering > > > it runs only during a few publication-related DDL operations. > > > > +1 > > > > Here is the rebased patch, changes apart from rebasing it > > 1) Dropped the conflict history table during drop subscription > > 2) Added test cases for testing the conflict history table behavior > > with CREATE/ALTER/DROP subscription > > Thanks. > > > TODO: > > 1) Need more thoughts on the table schema whether we need to capture > > more items or shall we drop some fields if we think those are not > > necessary. > > Yes, this needs some more thoughts. I will review. 
> > I feel since design is somewhat agreed upon, we may handle > code-correction/completion. I have not looked at the rebased patch > yet, but here are a few comments based on old-version. > > Few observations related to publication. > ------------------------------ > > (In the below comments, clt/CLT implies Conflict Log Table) > > 1) > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table. > > 2) > '\d+ clt' shows all-tables publication name. I feel we should not > show that for clt. > > 3) > I am able to create a publication for clt table, should it be allowed? > > create subscription sub1 connection '...' publication pub1 > WITH(conflict_log_table='clt'); > create publication pub3 for table clt; > > 4) > Is there a reason we have not made '!IsConflictHistoryRelid' check as > part of is_publishable_class() itself? If we do so, other code-logics > will also get clt as non-publishable always (and will solve a few of > the above issues I think). IIUC, there is no place where we want to > mark CLT as publishable or is there any? > > 5) Also, I feel we can add some documentation now to help others to > understand/review the patch better without going through the long > thread. > > > Few observations related to conflict-logging: > ------------------------------ > 1) > I found that for the conflicts which ultimately result in Error, we do > not insert any conflict-record in clt. > > a) > Example: insert_exists, update_Exists > create table tab1 (i int primary key, j int); > sub: insert into tab1 values(30,10); > pub: insert into tab1 values(30,10); > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > No record in clt. > > sub: > <some pre-data needed> > update tab1 set i=40 where i = 30; > pub: update tab1 set i=40 where i = 20; > ERROR: conflict detected on relation "public.tab1": conflict=update_exists > No record in clt. 
>
> b)
> Another question related to this is, since these conflicts (which results in error) keep on happening until user resolves these or skips these or 'disable_on_error' is set. Then are we going to insert these multiple times? We do count these in 'confl_insert_exists' and 'confl_update_exists' everytime, so it makes sense to log those each time in clt as well. Thoughts?
>
> 2)
> Conflicts where row on sub is missing, local_ts incorrectly inserted. It is '2000-01-01 05:30:00+05:30'. Should it be Null or something indicating that it is not applicable for this conflict-type?
>
> Example: delete_missing, update_missing
> pub:
> insert into tab1 values(10,10);
> insert into tab1 values(20,10);
> sub: delete from tab1 where i=10;
> pub: delete from tab1 where i=10;

3) We also need to think about how we are going to display the info in case of multiple_unique_conflicts, as there could be multiple local and remote tuples conflicting for one single operation. Example:

create table conf_tab (a int primary key, b int unique, c int unique);
sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
pub: insert into conf_tab values (2,3,4);

ERROR: conflict detected on relation "public.conf_tab": conflict=multiple_unique_conflicts
DETAIL: Key already exists in unique index "conf_tab_pkey", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_b_key", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
Key already exists in unique index "conf_tab_c_key", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
CONTEXT: processing remote data for replication origin "pg_16392" during message type "INSERT" for replication target relation "public.conf_tab" in transaction 781, finished at 0/017FDDA0

Currently in clt, we have singular terms such as 'key_tuple', 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? But it does not look reasonable to have multiple rows inserted for a single conflict raised. I will think more about this.

thanks
Shveta
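To make the two shapes in that question concrete, here is a rough C sketch that models one multiple_unique_conflicts event either as a single record holding several local rows, or as several flat rows with singular tuple fields. All type and function names are hypothetical, the `index_name` field is an assumption taken from the error DETAIL and not a known clt column, and nothing here reflects the patch's actual schema.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UNIQUE_CONFLICTS 8

/* One violated unique index and the local row that already holds the key. */
typedef struct LocalConflict
{
    const char *index_name;
    const char *key_tuple;
    const char *local_tuple;
} LocalConflict;

/* Shape A: a single record per conflict, carrying an array of local rows. */
typedef struct ConflictRecord
{
    const char   *conflict_type;
    const char   *remote_tuple;
    size_t        nlocal;
    LocalConflict local[MAX_UNIQUE_CONFLICTS];
} ConflictRecord;

/* Shape B: one flat row per violated index, with singular tuple fields. */
typedef struct ConflictRow
{
    const char *conflict_type;
    const char *index_name;
    const char *key_tuple;
    const char *local_tuple;
    const char *remote_tuple;
} ConflictRow;

/*
 * Expand shape A into shape B. The remote tuple and conflict type are
 * repeated on every row. Writes at most 'maxrows' rows and returns the
 * number written.
 */
size_t
flatten_conflict(const ConflictRecord *rec, ConflictRow *rows, size_t maxrows)
{
    size_t      n = 0;

    assert(rec != NULL && rows != NULL);
    assert(rec->nlocal <= MAX_UNIQUE_CONFLICTS);

    for (size_t i = 0; i < rec->nlocal && n < maxrows; i++)
    {
        rows[n].conflict_type = rec->conflict_type;
        rows[n].index_name = rec->local[i].index_name;
        rows[n].key_tuple = rec->local[i].key_tuple;
        rows[n].local_tuple = rec->local[i].local_tuple;
        rows[n].remote_tuple = rec->remote_tuple;
        n++;
    }
    return n;
}
```

For the conf_tab example, shape A is one record with three local entries, while shape B yields three rows that all repeat the remote row (2, 3, 4). The trade-off is between keeping one row per raised conflict and keeping each column a single tuple.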
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-13T15:47:11Z
On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> > Few observations related to publication.
> > ------------------------------

Thanks Shveta, for testing and sharing your thoughts. IMHO, for conflict log tables it should be good enough to restrict them when the ALL TABLES option is used. I don't think we need to put in extra effort to completely restrict them even if users want to explicitly list them in the publication.

> > (In the below comments, clt/CLT implies Conflict Log Table)
> >
> > 1)
> > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.

This function is used while publishing every single change, and I don't think we want to add the cost of checking each subscription to identify whether the table is listed as a CLT.

> > 2)
> > '\d+ clt' shows all-tables publication name. I feel we should not show that for clt.

I think we should fix this.

> > 3)
> > I am able to create a publication for clt table, should it be allowed?

I believe we should not do any specific handling to restrict this, but I am open to opinions.

> > create subscription sub1 connection '...' publication pub1 WITH(conflict_log_table='clt');
> > create publication pub3 for table clt;
> >
> > 4)
> > Is there a reason we have not made '!IsConflictHistoryRelid' check as part of is_publishable_class() itself? If we do so, other code-logics will also get clt as non-publishable always (and will solve a few of the above issues I think). IIUC, there is no place where we want to mark CLT as publishable or is there any?

IMHO the main reason is performance.

> > 5) Also, I feel we can add some documentation now to help others to understand/review the patch better without going through the long thread.

Makes sense, I will do that in the next version.

> > Few observations related to conflict-logging:
> > ------------------------------
> > 1)
> > I found that for the conflicts which ultimately result in Error, we do not insert any conflict-record in clt.
> >
> > a)
> > Example: insert_exists, update_Exists
> > create table tab1 (i int primary key, j int);
> > sub: insert into tab1 values(30,10);
> > pub: insert into tab1 values(30,10);
> > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists
> > No record in clt.
> >
> > sub:
> > <some pre-data needed>
> > update tab1 set i=40 where i = 30;
> > pub: update tab1 set i=40 where i = 20;
> > ERROR: conflict detected on relation "public.tab1": conflict=update_exists
> > No record in clt.

Yeah, that is interesting. We need to think about how to commit this record when the outer transaction is aborted, as we do not have autonomous transactions, which are generally used for this kind of logging. But we can explore more options, like inserting into conflict log tables outside the outer transaction.

> > b)
> > Another question related to this is, since these conflicts (which results in error) keep on happening until user resolves these or skips these or 'disable_on_error' is set. Then are we going to insert these multiple times? We do count these in 'confl_insert_exists' and 'confl_update_exists' everytime, so it makes sense to log those each time in clt as well. Thoughts?

I think it makes sense to insert every time we see the conflict, but it would be good to have opinions from others as well.

> > 2)
> > Conflicts where row on sub is missing, local_ts incorrectly inserted. It is '2000-01-01 05:30:00+05:30'. Should it be Null or something indicating that it is not applicable for this conflict-type?
> >
> > Example: delete_missing, update_missing
> > pub:
> > insert into tab1 values(10,10);
> > insert into tab1 values(20,10);
> > sub: delete from tab1 where i=10;
> > pub: delete from tab1 where i=10;

Sure, I will test this.

> 3)
> We also need to think how we are going to display the info in case of multiple_unique_conflicts as there could be multiple local and remote tuples conflicting for one single operation. Example:
>
> create table conf_tab (a int primary key, b int unique, c int unique);
>
> sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4);
>
> pub: insert into conf_tab values (2,3,4);
>
> ERROR: conflict detected on relation "public.conf_tab": conflict=multiple_unique_conflicts
> DETAIL: Key already exists in unique index "conf_tab_pkey", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4).
> Key already exists in unique index "conf_tab_b_key", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4).
> Key already exists in unique index "conf_tab_c_key", modified locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30.
> Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4).
> CONTEXT: processing remote data for replication origin "pg_16392" during message type "INSERT" for replication target relation "public.conf_tab" in transaction 781, finished at 0/017FDDA0
>
> Currently in clt, we have singular terms such as 'key_tuple', 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? But it does not look reasonable to have multiple rows inserted for a single conflict raised. I will think more about this.

Currently I am inserting multiple records into the conflict history table, the same way each tuple is logged, as I couldn't find a better way to do this. Another option is to use an array of tuples instead of a single tuple, but I am not sure about it, as it might make the data more complicated for an external tool to process. But you are right, this needs more discussion.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-17T06:24:11Z
On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > Few observations related to publication.
> > > ------------------------------
>
> Thanks Shveta, for testing and sharing your thoughts. IMHO for conflict log tables it should be good enough if we restrict it when ALL TABLE options are used, I don't think we need to put extra effort to completely restrict it even if users want to explicitly list it into the publication.
>
> > > (In the below comments, clt/CLT implies Conflict Log Table)
> > >
> > > 1)
> > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.

After giving this more thought, I have changed this to return false for clt, as this is just an exposed function that is not called by the pgoutput layer.

> > > 2)
> > > '\d+ clt' shows all-tables publication name. I feel we should not show that for clt.

Fixed.

> > > 3)
> > > I am able to create a publication for clt table, should it be allowed?
>
> I believe we should not do any specific handling to restrict this but I am open for the opinions.

Restricting this as well; let's see what others think.

> > > 5) Also, I feel we can add some documentation now to help others to understand/review the patch better without going through the long thread.
>
> Make sense, I will do that in the next version.

Done, but I have not compiled the docs as I don't currently have the setup, so I have added them as a WIP patch.

> > > 2)
> > > Conflicts where row on sub is missing, local_ts incorrectly inserted. It is '2000-01-01 05:30:00+05:30'. Should it be Null or something indicating that it is not applicable for this conflict-type?
> > >
> > > Example: delete_missing, update_missing
> > > pub:
> > > insert into tab1 values(10,10);
> > > insert into tab1 values(20,10);
> > > sub: delete from tab1 where i=10;
> > > pub: delete from tab1 where i=10;
>
> Sure I will test this.

I have fixed this.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-18T10:09:46Z
On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > > Few observations related to publication.
> > > ------------------------------
>
> Thanks Shveta, for testing and sharing your thoughts. IMHO for conflict log tables it should be good enough if we restrict it when ALL TABLE options are used, I don't think we need to put extra effort to completely restrict it even if users want to explicitly list it into the publication.
>
> > > (In the below comments, clt/CLT implies Conflict Log Table)
> > >
> > > 1)
> > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table.
>
> This function is used while publishing every single change and I don't think we want to add a cost to check each subscription to identify whether the table is listed as CLT.
>
> > > 2)
> > > '\d+ clt' shows all-tables publication name. I feel we should not show that for clt.
>
> I think we should fix this.
>
> > > 3)
> > > I am able to create a publication for clt table, should it be allowed?
>
> I believe we should not do any specific handling to restrict this but I am open for the opinions.
>
> > > create subscription sub1 connection '...' publication pub1 WITH(conflict_log_table='clt');
> > > create publication pub3 for table clt;
> > >
> > > 4)
> > > Is there a reason we have not made '!IsConflictHistoryRelid' check as part of is_publishable_class() itself? If we do so, other code-logics will also get clt as non-publishable always (and will solve a few of the above issues I think). IIUC, there is no place where we want to mark CLT as publishable or is there any?
>
> IMHO the main reason is performance.
>
> > > 5) Also, I feel we can add some documentation now to help others to understand/review the patch better without going through the long thread.
>
> Make sense, I will do that in the next version.
>
> > > Few observations related to conflict-logging:
> > > ------------------------------
> > > 1)
> > > I found that for the conflicts which ultimately result in Error, we do not insert any conflict-record in clt.
> > >
> > > a)
> > > Example: insert_exists, update_Exists
> > > create table tab1 (i int primary key, j int);
> > > sub: insert into tab1 values(30,10);
> > > pub: insert into tab1 values(30,10);
> > > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists
> > > No record in clt.
> > >
> > > sub:
> > > <some pre-data needed>
> > > update tab1 set i=40 where i = 30;
> > > pub: update tab1 set i=40 where i = 20;
> > > ERROR: conflict detected on relation "public.tab1": conflict=update_exists
> > > No record in clt.
>
> Yeah that interesting need to put thought on how to commit this record when an outer transaction is aborted as we do not have autonomous transactions which are generally used for this kind of logging.

Right

> But we can explore more options like inserting into conflict log tables outside the outer transaction.

Yes, that seems the way to me. I could not find any such existing reference/usage in code though.

> > > b)
> > > Another question related to this is, since these conflicts (which results in error) keep on happening until user resolves these or skips these or 'disable_on_error' is set. Then are we going to insert these multiple times? We do count these in 'confl_insert_exists' and 'confl_update_exists' everytime, so it makes sense to log those each time in clt as well. Thoughts?
>
> I think it make sense to insert every time we see the conflict, but it would be good to have opinion from others as well.
>
> > > 2)
> > > Conflicts where row on sub is missing, local_ts incorrectly inserted.
> > > It is '2000-01-01 05:30:00+05:30'.
Should it be Null or something > > > indicating that it is not applicable for this conflict-type? > > > > > > Example: delete_missing, update_missing > > > pub: > > > insert into tab1 values(10,10); > > > insert into tab1 values(20,10); > > > sub: delete from tab1 where i=10; > > > pub: delete from tab1 where i=10; > > Sure I will test this. > > > > > 3) > > We also need to think how we are going to display the info in case of > > multiple_unique_conflicts as there could be multiple local and remote > > tuples conflicting for one single operation. Example: > > > > create table conf_tab (a int primary key, b int unique, c int unique); > > > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4); > > > > pub: insert into conf_tab values (2,3,4); > > > > ERROR: conflict detected on relation "public.conf_tab": > > conflict=multiple_unique_conflicts > > DETAIL: Key already exists in unique index "conf_tab_pkey", modified > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4). > > Key already exists in unique index "conf_tab_b_key", modified locally > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4). > > Key already exists in unique index "conf_tab_c_key", modified locally > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4). > > CONTEXT: processing remote data for replication origin "pg_16392" > > during message type "INSERT" for replication target relation > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0 > > > > Currently in clt, we have singular terms such as 'key_tuple', > > 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? > > But it does not look reasonable to have multiple rows inserted for a > > single conflict raised. I will think more about this. 
> > Currently I am inserting multiple records in the conflict history > table, the same as each tuple is logged, but couldn't find any better > way for this. Another option is to use an array of tuples instead of a > single tuple but not sure this might make things more complicated to > process by any external tool. It’s arguable and hard to say what the correct behaviour should be. I’m slightly leaning toward having a single row per conflict. IMO, overall the confl_* counters in pg_stat_subscription_stats should align with the number of entries in the conflict history table, which implies one row even for multiple_unique_conflicts. But I also understand that this approach could make things complicated for external tools. For now, we can proceed with logging multiple rows for a single multiple_unique_conflicts occurrence and wait to hear others’ opinions. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-18T11:17:14Z
On Mon, Nov 17, 2025 at 11:54 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > Few observations related to publication. > > > > ------------------------------ > > > > Thanks Shveta, for testing and sharing your thoughts. IMHO for > > conflict log tables it should be good enough if we restrict it when > > ALL TABLE options are used, I don't think we need to put extra effort > > to completely restrict it even if users want to explicitly list it > > into the publication. > > > > > > > > > > (In the below comments, clt/CLT implies Conflict Log Table) > > > > > > > > 1) > > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table. > > After putting more thought I have changed this to return false for > clt, as this is just an exposed function not called by pgoutput layer. > > > > > 2) > > > > '\d+ clt' shows all-tables publication name. I feel we should not > > > > show that for clt. > > > Fixed > > > > > > > 3) > > > > I am able to create a publication for clt table, should it be allowed? > > > > I believe we should not do any specific handling to restrict this but > > I am open for the opinions. > > Restricting this as well, lets see what others think. > > > > > > > > 5) Also, I feel we can add some documentation now to help others to > > > > understand/review the patch better without going through the long > > > > thread. > > > > Make sense, I will do that in the next version. > Done that but not compiled the docs as I don't currently have the > setup so added as WIP patch. > > > > > > 2) > > > > Conflicts where row on sub is missing, local_ts incorrectly inserted. > > > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something > > > > indicating that it is not applicable for this conflict-type? 
> > > > > > > > Example: delete_missing, update_missing > > > > pub: > > > > insert into tab1 values(10,10); > > > > insert into tab1 values(20,10); > > > > sub: delete from tab1 where i=10; > > > > pub: delete from tab1 where i=10; > > > > Sure I will test this. > > I have fixed this. Thanks for the patch. Some feedback about the clt: 1) local_origin is always NULL in my tests for all conflict types I tried. 2) Do we need 'key_tuple' as such or replica_identity is enough/better? I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing case where query was 'delete from tab1 where i=10'; here 'i' is PK; which seems okay. But it is '{"i":20,"j":200}' for update_origin_differ case where query was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I feel 'j' should not be part of the key but let me know if I have misunderstood. IMO, 'j' being part of remote_tuple should be good enough. 3) Do we need to have a timestamp column as well to say when conflict was recorded? Or local_commit_ts, remote_commit_ts are sufficient? Thoughts 4) Also, it makes sense if we have 'conflict_type' next to 'relid'. I feel relid and conflict_type are primary columns and rest are related details. 5) Do we need table_schema, table_name when we have relid already? If we want to retain these, we can name them as schemaname and relname to be consistent with all other stats tables. IMO, then the order can be: relid, schemaname, relname, conflcit_type and then the rest of the details. thanks Shveta -
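A note on the local_ts value quoted above: PostgreSQL's TimestampTz counts microseconds from 2000-01-01 00:00:00 UTC, so a zero value shown in a +05:30 session prints as '2000-01-01 05:30:00+05:30'. That is consistent with an unset (zero) timestamp having been stored instead of NULL, although the thread does not confirm the cause. A minimal sketch of the arithmetic follows; DemoTimestampTz and the demo_* helpers are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: microseconds since 2000-01-01 00:00:00 UTC. */
typedef int64_t DemoTimestampTz;

#define DEMO_USECS_PER_SEC		INT64_C(1000000)
#define DEMO_IST_OFFSET_SECS	(5 * 3600 + 30 * 60)	/* UTC+05:30 */

/*
 * Seconds elapsed since 2000-01-01 00:00:00 on the local wall clock when
 * 'ts' is displayed at the given UTC offset.
 */
static int64_t
demo_local_secs_since_2000(DemoTimestampTz ts, int utc_offset_secs)
{
	return ts / DEMO_USECS_PER_SEC + utc_offset_secs;
}

/* Mirrors the "local_ts > 0" test in the quoted patch hunk. */
static bool
demo_ts_is_set(DemoTimestampTz ts)
{
	return ts > 0;
}
```

With ts = 0 and the +05:30 offset, the helper yields 19800 seconds, that is 05:30:00 on 2000-01-01, while demo_ts_is_set(0) is false, so such a value would be expected to map to NULL.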
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-19T01:30:49Z
Hi Dilip. I started to look at this thread. Here are some comments for patch v4-0001. ===== GENERAL 1. There's some inconsistency in how this new table is called at different times : a) "conflict table" b) "conflict log table" c) "conflict log history table" d) "conflict history" My preference was (b). Making this consistent will have impacts on many macros, variables, comments, function names, etc. ~~~ 2. What about enhancements to description \dRs+ so the subscription conflict log table is displayed? ~~~ 3. What about enhancements to the tab-complete code? ====== src/backend/commands/subscriptioncmds.c 4. #define SUBOPT_MAX_RETENTION_DURATION 0x00008000 #define SUBOPT_LSN 0x00010000 #define SUBOPT_ORIGIN 0x00020000 +#define SUBOPT_CONFLICT_TABLE 0x00030000 Bug? Shouldn't that be 0x00040000. ~~~ 5. + char *conflicttable; XLogRecPtr lsn; } SubOpts; IMO 'conflicttable' looks too much like 'conflictable', which may cause some confusion on first reading. ~~~ 6. +static void CreateConflictLogTable(Oid namespaceId, char *conflictrel); +static void DropConflictLogTable(Oid namespaceId, char *conflictrel); AFAIK it is more conventional for the static functions to be snake_case and the extern functions to use CamelCase. So these would be: - create_conflict_log_table - drop_conflict_log_table ~~~ CreateSubscription: 7. + /* If conflict log table name is given than create the table. */ + if (opts.conflicttable) + CreateConflictLogTable(conflict_table_nspid, conflict_table); + typo: /If conflict/If a conflict/ typo: "than" ~~~ AlterSubscription: 8. - SUBOPT_ORIGIN); + SUBOPT_ORIGIN | + SUBOPT_CONFLICT_TABLE); The line wrapping doesn't seem necessary. ~~~ 9. + replaces[Anum_pg_subscription_subconflictnspid - 1] = true; + replaces[Anum_pg_subscription_subconflicttable - 1] = true; + + CreateConflictLogTable(nspid, relname); + } + What are the rules regarding replacing one log table with a different log table for the same subscription? 
I didn't see anything about this scenario, nor any test cases. ~~~ CreateConflictLogTable: 10. + /* + * Check if table with same name already present, if so report an error + * as currently we do not support user created table as conflict log + * table. + */ Is the comment about "user-created table" strictly correct? e.g. Won't you encounter the same problem if there are 2 subscriptions trying to set the same-named conflict log table? SUGGESTION Report an error if the specified conflict log table already exists. ~~~ DropConflictLogTable: 11. + /* + * Drop conflict log table if exist, use if exists ensures the command + * won't error if the table is already gone. + */ The reason for EXISTS was already mentioned in the function comment. SUGGESTION Drop the conflict log table if it exists. ====== src/backend/replication/logical/conflict.c 12. +static Datum TupleTableSlotToJsonDatum(TupleTableSlot *slot); + +static void InsertConflictLog(Relation rel, + TransactionId local_xid, + TimestampTz local_ts, + ConflictType conflict_type, + RepOriginId origin_id, + TupleTableSlot *searchslot, + TupleTableSlot *localslot, + TupleTableSlot *remoteslot); Same as earlier comment #6 -- isn't it conventional to use snake_case for the static function names? ~~~ TupleTableSlotToJsonDatum: 13. + * This would be a new internal helper function for logical replication + * Needs to handle various data types and potentially TOASTed data What's this comment about? Something doesn't look quite right. ~~~ InsertConflictLog: 14. + /* TODO: proper error code */ + relid = get_relname_relid(relname, nspid); + if (!OidIsValid(relid)) + elog(ERROR, "conflict log history table does not exists"); + conflictrel = table_open(relid, RowExclusiveLock); + if (conflictrel == NULL) + elog(ERROR, "could not open conflict log history table"); 14a. What's the TODO comment for? Are you going to replace these elogs? ~ 14b. Typo: "does not exists" ~ 14c. An unnecessary double-blank line follows this code fragment. 
~~~ 15. + /* Populate the values and nulls arrays */ + attno = 0; + values[attno] = ObjectIdGetDatum(RelationGetRelid(rel)); + attno++; + + if (TransactionIdIsValid(local_xid)) + values[attno] = TransactionIdGetDatum(local_xid); + else + nulls[attno] = true; + attno++; + + if (TransactionIdIsValid(remote_xid)) + values[attno] = TransactionIdGetDatum(remote_xid); + else + nulls[attno] = true; + attno++; + + values[attno] = LSNGetDatum(remote_final_lsn); + attno++; + + if (local_ts > 0) + values[attno] = TimestampTzGetDatum(local_ts); + else + nulls[attno] = true; + attno++; + + if (remote_commit_ts > 0) + values[attno] = TimestampTzGetDatum(remote_commit_ts); + else + nulls[attno] = true; + attno++; + + values[attno] = + CStringGetTextDatum(get_namespace_name(RelationGetNamespace(rel))); + attno++; + + values[attno] = CStringGetTextDatum(RelationGetRelationName(rel)); + attno++; + + values[attno] = CStringGetTextDatum(ConflictTypeNames[conflict_type]); + attno++; + + if (origin_id != InvalidRepOriginId) + replorigin_by_oid(origin_id, true, &origin); + + if (origin != NULL) + values[attno] = CStringGetTextDatum(origin); + else + nulls[attno] = true; + attno++; + + if (replorigin_session_origin != InvalidRepOriginId) + replorigin_by_oid(replorigin_session_origin, true, &remote_origin); + + if (remote_origin != NULL) + values[attno] = CStringGetTextDatum(remote_origin); + else + nulls[attno] = true; + attno++; + + if (searchslot != NULL) + values[attno] = TupleTableSlotToJsonDatum(searchslot); + else + nulls[attno] = true; + attno++; + + if (localslot != NULL) + values[attno] = TupleTableSlotToJsonDatum(localslot); + else + nulls[attno] = true; + attno++; + + if (remoteslot != NULL) + values[attno] = TupleTableSlotToJsonDatum(remoteslot); + else + nulls[attno] = true; + 15a. It might be simpler to just post-increment that 'attno' in all the assignments and save a dozen lines of code: e.g. values[attno++] = ... ~ 15b. 
Also, put a sanity Assert check at the end, like: Assert(attno + 1 == MAX_CONFLICT_ATTR_NUM); ====== src/backend/utils/cache/lsyscache.c 16. + if (isnull) + { + ReleaseSysCache(tup); + return NULL; + } + + *nspid = subform->subconflictnspid; + relname = pstrdup(TextDatumGetCString(datum)); + + ReleaseSysCache(tup); + + return relname; It would be tidier to have a single release/return by coding this slightly differently. SUGGESTION: char *relname = NULL; ... if (!isnull) { *nspid = subform->subconflictnspid; relname = pstrdup(TextDatumGetCString(datum)); } ReleaseSysCache(tup); return relname; ====== src/include/catalog/pg_subscription.h 17. + Oid subconflictnspid; /* Namespace Oid in which the conflict history + * table is created. */ Would it be better to make these 2 new member names more alike, since they go together. e.g. confl_table_nspid confl_table_name ====== src/include/replication/conflict.h 18. +#define MAX_CONFLICT_ATTR_NUM 15 I felt this doesn't really belong here. Just define it atop/within the function InsertConflictLog() ~~~ 19. extern void InitConflictIndexes(ResultRelInfo *relInfo); + #endif Spurious whitespace change not needed for this patch. ====== src/test/regress/sql/subscription.sql 20. How about adding some more test scenarios: e.g.1. ALTER the conflict log table of some subscription that already has one e.g.2. Have multiple subscriptions that specify the same conflict log table ====== Kind Regards, Peter Smith. Fujitsu Australia -
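Comment 4 above can be checked mechanically: 0x00030000 is not a new bit but the union of SUBOPT_LSN and SUBOPT_ORIGIN, so testing for it would also match those two options. The sketch below copies the three values from the quoted hunk; the _V4 and _FIXED suffixes and the two helper functions are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the hunk quoted in comment 4. */
#define SUBOPT_LSN						0x00010000
#define SUBOPT_ORIGIN					0x00020000
#define SUBOPT_CONFLICT_TABLE_V4		0x00030000	/* as posted in v4 */
#define SUBOPT_CONFLICT_TABLE_FIXED		0x00040000	/* value suggested in review */

/* Hypothetical helper: true if exactly one bit is set in 'flag'. */
static bool
demo_is_single_bit(uint32_t flag)
{
	return flag != 0 && (flag & (flag - 1)) == 0;
}

/* Hypothetical helper: true if 'flag' shares any bit with 'others'. */
static bool
demo_overlaps(uint32_t flag, uint32_t others)
{
	return (flag & others) != 0;
}
```

The v4 value fails demo_is_single_bit() and overlaps both neighbouring options, while 0x00040000 passes and overlaps neither.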
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-19T06:29:45Z
Here are some comments for the patch v4-0002. ====== GENERAL 1. The patch should include test cases: - to confirm an error happens when attempting to publish clt - to confirm \dt+ clt is not showing the ALL TABLES publication - to confirm that SQL function pg_relation_is_publishable gives the expected result - etc. ====== Commit Message 1. When all table option is used with publication don't publish the conflict history tables. ~ Maybe reword that using uppercase for keywords, like: SUGGESTION A conflict log table will not be published by a FOR ALL TABLES publication. ====== src/backend/catalog/pg_publication.c check_publication_add_relation: 3. + /* Can't be created as conflict log table */ + if (IsConflictLogRelid(RelationGetRelid(targetrel))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add relation \"%s\" to publication", + RelationGetRelationName(targetrel)), + errdetail("This operation is not supported for conflict log tables."))); 3a. Typo in comment. SUGGESTION Can't be a conflict log table ~ 3b. I was wondering if this check should be moved to the bottom of the function. I think IsConflictLogRelid() is the most inefficient of all these conditions, so it is better to give the other ones a chance to fail quickly before needing to check for clt. ~~~ pg_relation_is_publishable: 4. /* - * SQL-callable variant of the above + * SQL-callable variant of the above and this should not be a conflict log rel * * This returns null when the relation does not exist. This is intended to be * used for example in psql to avoid gratuitous errors when there are I felt this new comment should be in the code, instead of in the function comment. SUGGESTION /* subscription conflict log tables are not published */ result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) && !IsConflictLogRelid(relid); ~~~ 5. 
It seemed strange that function pg_relation_is_publishable(PG_FUNCTION_ARGS) is checking IsConflictLogRelid, but function is_publishable_relation(Relation rel) is not. ~~~ GetAllPublicationRelations: 6. + /* conflict history tables are not published. */ if (is_publishable_class(relid, relForm) && + !IsConflictLogRelid(relid) && !(relForm->relispartition && pubviaroot)) result = lappend_oid(result, relid); Inconsistent "history table" terminology. Maybe this comment should be identical to the other one above. e.g. /* subscription conflict log tables are not published */ ====== src/backend/commands/subscriptioncmds.c IsConflictLogRelid: 8. +/* + * Is relation used as a conflict log table + * + * Scan all the subscription and check whether the relation is used as + * conflict log table. + */ typo: "all the subscription" Also, the 2nd sentence repeats the purpose of the function; I don't think you need to say it twice. SUGGESTION Check if the specified relation is used as a conflict log table by any subscription. ~~~ 9. + if (relname == NULL) + continue; + if (relid == get_relname_relid(relname, nspid)) + { + found = true; + break; + } It seemed unnecessary to separate out the 'continue' like that. In passing, consider renaming that generic 'found' to be the proper meaning of the boolean. SUGGESTION if (relname && relid == get_relname_relid(relname, nspid)) { is_clt = true; break; } ====== Kind Regards, Peter Smith. Fujitsu Australia -
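Comments 3b and 9 above can be sketched together: fold the NULL test into the match condition, and call the subscription scan last so that cheaper checks get a chance to reject first. This is a simplified illustration, not the patch's code; DemoSubscription, the clt_relid field (standing in for the result of get_relname_relid()), and the demo_* functions are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int DemoOid;

/* Hypothetical, simplified view of one subscription. */
typedef struct DemoSubscription
{
	const char *clt_name;	/* NULL when no conflict log table is set */
	DemoOid		clt_relid;	/* stand-in for get_relname_relid(relname, nspid) */
} DemoSubscription;

/* Counts scans so the short-circuit behaviour can be observed. */
static int	demo_scan_count = 0;

/* True if 'relid' is the conflict log table of any subscription. */
static bool
demo_is_conflict_log_relid(DemoOid relid, const DemoSubscription *subs,
						   size_t nsubs)
{
	bool		is_clt = false;

	demo_scan_count++;
	for (size_t i = 0; i < nsubs; i++)
	{
		/* Single combined condition instead of a separate 'continue'. */
		if (subs[i].clt_name != NULL && relid == subs[i].clt_relid)
		{
			is_clt = true;
			break;
		}
	}
	return is_clt;
}

/* The expensive scan runs only when the cheap checks have passed. */
static bool
demo_can_add_to_publication(bool passes_cheap_checks, DemoOid relid,
							const DemoSubscription *subs, size_t nsubs)
{
	return passes_cheap_checks &&
		!demo_is_conflict_log_relid(relid, subs, nsubs);
}
```

Because && short-circuits, a relation that already fails the cheap checks never triggers the scan over subscriptions.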
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-19T07:10:05Z
Hi Dilip, FYI, patch v4-0003 (docs) needs rebasing due to ada78cd. ====== Kind Regards, Peter Smith. Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-19T10:16:20Z
On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for the patch. Some feedback about the clt: > > 1) > local_origin is always NULL in my tests for all conflict types I tried. You need to set the replication origin as shown below On subscriber side: --------------------------- SELECT pg_replication_origin_create('my_remote_source_2'); SELECT pg_replication_origin_session_setup('my_remote_source_2'); UPDATE test SET b=200 where a=1; On remote: --------------- UPDATE test SET b=300 where a=1; -- conflicting operation with local node On subscriber ------------------ postgres[1514377]=# select local_origin, remote_origin from myschema.conflict_log_history2 ; local_origin | remote_origin --------------------+--------------------- my_remote_source_2 | pg_16396 > 2) > Do we need 'key_tuple' as such or replica_identity is enough/better? > I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing > case where query was 'delete from tab1 where i=10'; here 'i' is PK; > which seems okay. > But it is '{"i":20,"j":200}' for update_origin_differ case where query > was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I > feel 'j' should not be part of the key but let me know if I have > misunderstood. IMO, 'j' being part of remote_tuple should be good > enough. Yeah we should display the replica identity only, I assumed in ReportApplyConflict() the searchslot should only have RI tuple but it is sending a remote tuple in the searchslot, so might need to extract the RI from this slot, I will work on this. > 3) > Do we need to have a timestamp column as well to say when conflict was > recorded? Or local_commit_ts, remote_commit_ts are sufficient? > Thoughts You mean we can record the timestamp now while inserting, not sure if it will add some more meaningful information than remote_commit_ts, but let's see what others think. > 4) > Also, it makes sense if we have 'conflict_type' next to 'relid'. 
I > feel relid and conflict_type are primary columns and rest are related > details. Sure > 5) > Do we need table_schema, table_name when we have relid already? If we > want to retain these, we can name them as schemaname and relname to be > consistent with all other stats tables. IMO, then the order can be: > relid, schemaname, relname, conflcit_type and then the rest of the > details. Yeah, this makes the table denormalized, since we could fetch this information by joining with pg_class, but I think it might be better for readability. Let's see what others think; for now I will reorder as suggested. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-19T11:19:41Z
On Wed, Nov 19, 2025 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Thanks for the patch. Some feedback about the clt: > > > > 1) > > local_origin is always NULL in my tests for all conflict types I tried. > > You need to set the replication origin as shown below > On subscriber side: > --------------------------- > SELECT pg_replication_origin_create('my_remote_source_2'); > SELECT pg_replication_origin_session_setup('my_remote_source_2'); > UPDATE test SET b=200 where a=1; > > On remote: > --------------- > UPDATE test SET b=300 where a=1; -- conflicting operation with local node > > On subscriber > ------------------ > postgres[1514377]=# select local_origin, remote_origin from > myschema.conflict_log_history2 ; > local_origin | remote_origin > --------------------+--------------------- > my_remote_source_2 | pg_16396 Okay, I see, thanks! > > > 2) > > Do we need 'key_tuple' as such or replica_identity is enough/better? > > I see 'key_tuple' inserted as {"i":10,"j":null} for delete_missing > > case where query was 'delete from tab1 where i=10'; here 'i' is PK; > > which seems okay. > > But it is '{"i":20,"j":200}' for update_origin_differ case where query > > was 'update tab1 set j=200 where i =20'. Here too RI is 'i' alone. I > > feel 'j' should not be part of the key but let me know if I have > > misunderstood. IMO, 'j' being part of remote_tuple should be good > > enough. > > Yeah we should display the replica identity only, I assumed in > ReportApplyConflict() the searchslot should only have RI tuple but it > is sending a remote tuple in the searchslot, so might need to extract > the RI from this slot, I will work on this. yeah, we have extracted it already in errdetail_apply_conflict()->build_tuple_value_details(). 
See it dumps it in log: LOG: conflict detected on relation "public.tab1": conflict=update_origin_differs DETAIL: Updating the row that was modified locally in transaction 768 at 2025-11-18 12:09:19.658502+05:30. Existing local row (20, 100); remote row (20, 200); replica identity (i)=(20). We somehow need to reuse it. > > > 3) > > Do we need to have a timestamp column as well to say when conflict was > > recorded? Or local_commit_ts, remote_commit_ts are sufficient? > > Thoughts > > You mean we can record the timestamp now while inserting, not sure if > it will add some more meaningful information than remote_commit_ts, > but let's see what others think. > On rethinking, we can skip it. The commit-ts of both sides are enough. > > 4) > > Also, it makes sense if we have 'conflict_type' next to 'relid'. I > > feel relid and conflict_type are primary columns and rest are related > > details. > > Sure > > > 5) > > Do we need table_schema, table_name when we have relid already? If we > > want to retain these, we can name them as schemaname and relname to be > > consistent with all other stats tables. IMO, then the order can be: > > relid, schemaname, relname, conflcit_type and then the rest of the > > details. > > Yeah this makes the table denormalized as we can fetch this > information by joining with pg_class, but I think it might be better > for readability, lets see what others think, for now I will reorder as > suggested. > Okay, works for me if we want to keep these. I see that most of the other statistics tables (pg_stat_all_indexes, pg_statio_all_tables, pg_statio_all_sequences etc) that maintain a relid also retain the names. thanks Shveta -
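The key_tuple point discussed above (log only the replica identity columns, so '{"i":20}' rather than '{"i":20,"j":200}') can be illustrated with a small standalone sketch. It does not use PostgreSQL's tuple slot or JSON APIs; DemoColumn and demo_key_to_json() are hypothetical, values are limited to integers, and column names are assumed not to need JSON escaping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified column: integer values only. */
typedef struct DemoColumn
{
	const char *name;
	int			value;
	bool		in_replica_identity;
} DemoColumn;

/*
 * Write a JSON object holding only the replica identity columns into 'buf'.
 * Returns the number of key columns written, or -1 if 'buf' is too small.
 * Column names are emitted as-is (no escaping) to keep the sketch short.
 */
static int
demo_key_to_json(const DemoColumn *cols, size_t ncols,
				 char *buf, size_t buflen)
{
	size_t		used;
	int			nkeys = 0;
	int			n;

	n = snprintf(buf, buflen, "{");
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	used = (size_t) n;

	for (size_t i = 0; i < ncols; i++)
	{
		/* Skip columns that are not part of the replica identity. */
		if (!cols[i].in_replica_identity)
			continue;

		n = snprintf(buf + used, buflen - used, "%s\"%s\":%d",
					 nkeys > 0 ? "," : "", cols[i].name, cols[i].value);
		if (n < 0 || (size_t) n >= buflen - used)
			return -1;
		used += (size_t) n;
		nkeys++;
	}

	n = snprintf(buf + used, buflen - used, "}");
	if (n < 0 || (size_t) n >= buflen - used)
		return -1;

	return nkeys;
}
```

For the update_origin_differs example, a row (i=20, j=200) with only 'i' flagged as replica identity produces {"i":20}; 'j' would then appear only in remote_tuple.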
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-20T12:08:21Z
On Wed, Nov 19, 2025 at 7:01 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Dilip. > > I started to look at this thread. Here are some comments for patch v4-0001. Thanks, Peter, for your review. I have worked on most of the comments for 0001. > > ===== > GENERAL > > 1. > There's some inconsistency in how this new table is called at different times : > a) "conflict table" > b) "conflict log table" > c) "conflict log history table" > d) "conflict history" > > My preference was (b). Making this consistent will have impacts on > many macros, variables, comments, function names, etc. Yeah, my preference is also (b), so I have used it everywhere. > ~~~ > > 2. > What about enhancements to description \dRs+ so the subscription > conflict log table is displayed? Done. I have displayed the conflict log table name; I am not sure whether we should display the complete schema-qualified name. If so, we might need to join with pg_namespace. > ~~~ > > 3. > What about enhancements to the tab-complete code? Done > ====== > src/backend/commands/subscriptioncmds.c > > 4. > #define SUBOPT_MAX_RETENTION_DURATION 0x00008000 > #define SUBOPT_LSN 0x00010000 > #define SUBOPT_ORIGIN 0x00020000 > +#define SUBOPT_CONFLICT_TABLE 0x00030000 > > Bug? Shouldn't that be 0x00040000. Yeah, fixed. > ~~~ > > 5. > + char *conflicttable; > XLogRecPtr lsn; > } SubOpts; > > IMO 'conflicttable' looks too much like 'conflictable', which may > cause some confusion on first reading. Changed to conflictlogtable > ~~~ > > 6. > +static void CreateConflictLogTable(Oid namespaceId, char *conflictrel); > +static void DropConflictLogTable(Oid namespaceId, char *conflictrel); > > AFAIK it is more conventional for the static functions to be > snake_case and the extern functions to use CamelCase. So these would > be: > - create_conflict_log_table > - drop_conflict_log_table Done > ~~~ > > CreateSubscription: > > 7. > + /* If conflict log table name is given than create the table. 
*/ > + if (opts.conflicttable) > + CreateConflictLogTable(conflict_table_nspid, conflict_table); > + > > typo: /If conflict/If a conflict/ > > typo: "than" Fixed > ~~~ > > AlterSubscription: > > 8. > - SUBOPT_ORIGIN); > + SUBOPT_ORIGIN | > + SUBOPT_CONFLICT_TABLE); > > The line wrapping doesn't seem necessary. Without wrapping, it crosses the 80-characters-per-line limit. > ~~~ > > 9. > + replaces[Anum_pg_subscription_subconflictnspid - 1] = true; > + replaces[Anum_pg_subscription_subconflicttable - 1] = true; > + > + CreateConflictLogTable(nspid, relname); > + } > + > > What are the rules regarding replacing one log table with a different > log table for the same subscription? I didn't see anything about this > scenario, nor any test cases. Added a test and updated the code as well: if a different log table is set, we drop the old table and create the new one; if the same table is set again, only a NOTICE is issued and the table is not created again. > ~~~ > > CreateConflictLogTable: > > 10. > + /* > + * Check if table with same name already present, if so report an error > + * as currently we do not support user created table as conflict log > + * table. > + */ > > Is the comment about "user-created table" strictly correct? e.g. Won't > you encounter the same problem if there are 2 subscriptions trying to > set the same-named conflict log table? > > SUGGESTION > Report an error if the specified conflict log table already exists. Done > ~~~ > > DropConflictLogTable: > > 11. > + /* > + * Drop conflict log table if exist, use if exists ensures the command > + * won't error if the table is already gone. > + */ > > The reason for EXISTS was already mentioned in the function comment. > > SUGGESTION > Drop the conflict log table if it exists. Done > ====== > src/backend/replication/logical/conflict.c > > 12. 
> +static Datum TupleTableSlotToJsonDatum(TupleTableSlot *slot); > + > +static void InsertConflictLog(Relation rel, > + TransactionId local_xid, > + TimestampTz local_ts, > + ConflictType conflict_type, > + RepOriginId origin_id, > + TupleTableSlot *searchslot, > + TupleTableSlot *localslot, > + TupleTableSlot *remoteslot); > > Same as earlier comment #6 -- isn't it conventional to use snake_case > for the static function names? Done > ~~~ > > TupleTableSlotToJsonDatum: > > 13. > + * This would be a new internal helper function for logical replication > + * Needs to handle various data types and potentially TOASTed data > > What's this comment about? Something doesn't look quite right. Hmm, that's bad, fixed. > ~~~ > > InsertConflictLog: > > 14. > + /* TODO: proper error code */ > + relid = get_relname_relid(relname, nspid); > + if (!OidIsValid(relid)) > + elog(ERROR, "conflict log history table does not exists"); > + conflictrel = table_open(relid, RowExclusiveLock); > + if (conflictrel == NULL) > + elog(ERROR, "could not open conflict log history table"); > > 14a. > What's the TODO comment for? Are you going to replace these elogs? replaced with ereport > ~ > > 14b. > Typo: "does not exists" fixed > ~ > > 14c. > An unnecessary double-blank line follows this code fragment. fixed > ~~~ > > 15. 
> + /* Populate the values and nulls arrays */ > + attno = 0; > + values[attno] = ObjectIdGetDatum(RelationGetRelid(rel)); > + attno++; > + > + if (TransactionIdIsValid(local_xid)) > + values[attno] = TransactionIdGetDatum(local_xid); > + else > + nulls[attno] = true; > + attno++; > + > + if (TransactionIdIsValid(remote_xid)) > + values[attno] = TransactionIdGetDatum(remote_xid); > + else > + nulls[attno] = true; > + attno++; > + > + values[attno] = LSNGetDatum(remote_final_lsn); > + attno++; > + > + if (local_ts > 0) > + values[attno] = TimestampTzGetDatum(local_ts); > + else > + nulls[attno] = true; > + attno++; > + > + if (remote_commit_ts > 0) > + values[attno] = TimestampTzGetDatum(remote_commit_ts); > + else > + nulls[attno] = true; > + attno++; > + > + values[attno] = > + CStringGetTextDatum(get_namespace_name(RelationGetNamespace(rel))); > + attno++; > + > + values[attno] = CStringGetTextDatum(RelationGetRelationName(rel)); > + attno++; > + > + values[attno] = CStringGetTextDatum(ConflictTypeNames[conflict_type]); > + attno++; > + > + if (origin_id != InvalidRepOriginId) > + replorigin_by_oid(origin_id, true, &origin); > + > + if (origin != NULL) > + values[attno] = CStringGetTextDatum(origin); > + else > + nulls[attno] = true; > + attno++; > + > + if (replorigin_session_origin != InvalidRepOriginId) > + replorigin_by_oid(replorigin_session_origin, true, &remote_origin); > + > + if (remote_origin != NULL) > + values[attno] = CStringGetTextDatum(remote_origin); > + else > + nulls[attno] = true; > + attno++; > + > + if (searchslot != NULL) > + values[attno] = TupleTableSlotToJsonDatum(searchslot); > + else > + nulls[attno] = true; > + attno++; > + > + if (localslot != NULL) > + values[attno] = TupleTableSlotToJsonDatum(localslot); > + else > + nulls[attno] = true; > + attno++; > + > + if (remoteslot != NULL) > + values[attno] = TupleTableSlotToJsonDatum(remoteslot); > + else > + nulls[attno] = true; > + > > 15a. 
> It might be simpler to just post-increment that 'attno' in all the > assignments and save a dozen lines of code: > e.g. values[attno++] = ... Yeah done that > ~ > > 15b. > Also, put a sanity Assert check at the end, like: > Assert(attno + 1 == MAX_CONFLICT_ATTR_NUM); Done > > ====== > src/backend/utils/cache/lsyscache.c > > 16. > + if (isnull) > + { > + ReleaseSysCache(tup); > + return NULL; > + } > + > + *nspid = subform->subconflictnspid; > + relname = pstrdup(TextDatumGetCString(datum)); > + > + ReleaseSysCache(tup); > + > + return relname; > > It would be tidier to have a single release/return by coding this > slightly differently. > > SUGGESTION: > > char *relname = NULL; > ... > if (!isnull) > { > *nspid = subform->subconflictnspid; > relname = pstrdup(TextDatumGetCString(datum)); > } > > ReleaseSysCache(tup); > return relname; Right, changed it. > ====== > src/include/catalog/pg_subscription.h > > 17. > + Oid subconflictnspid; /* Namespace Oid in which the conflict history > + * table is created. */ > > Would it be better to make these 2 new member names more alike, since > they go together. e.g. > confl_table_nspid > confl_table_name In pg_subscription.h all field follows same convention without "_" so I have changed to subconflictlognspid subconflictlogtable > ====== > src/include/replication/conflict.h > > 18. > +#define MAX_CONFLICT_ATTR_NUM 15 > > I felt this doesn't really belong here. Just define it atop/within the > function InsertConflictLog() Done > ~~~ > > 19. > extern void InitConflictIndexes(ResultRelInfo *relInfo); > + > #endif > > Spurious whitespace change not needed for this patch. Fixed > ====== > src/test/regress/sql/subscription.sql > > 20. > How about adding some more test scenarios: > e.g.1. ALTER the conflict log table of some subscription that already has one > e.g.2. 
Have multiple subscriptions that specify the same conflict log table Added Pending: 1) fixed review comments of 0002 and 0003 2) Need to add replica identity tuple instead of full tuple - reported by Shveta 3) Keeping the logs in case of outer transaction failure by moving log insertion outside the main transaction - reported by Shveta -- Regards, Dilip Kumar Google -
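Editor's sketch for review items 15a and 15b above: a minimal, self-contained model of the post-increment pattern, not the patch's code. `Datum` is a local stand-in typedef, and `DEMO_ATTR_NUM` and `fill_demo_row` are hypothetical names invented for illustration. One detail worth noting: if the final column is also written with `attno++`, the sanity check becomes `attno == N` rather than `attno + 1 == N`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column count for this sketch (the patch defines 15). */
#define DEMO_ATTR_NUM 4

typedef uintptr_t Datum;		/* local stand-in, not PostgreSQL's Datum */

/*
 * Fill values[]/nulls[] for one demo row. A zero xid or a NULL origin is
 * stored as SQL NULL. Returns the number of columns written.
 */
static int
fill_demo_row(Datum *values, bool *nulls,
			  uint32_t relid, uint32_t local_xid,
			  uint32_t remote_xid, const char *origin)
{
	int			attno = 0;

	for (int i = 0; i < DEMO_ATTR_NUM; i++)
	{
		values[i] = 0;
		nulls[i] = false;
	}

	values[attno++] = (Datum) relid;

	if (local_xid != 0)
		values[attno++] = (Datum) local_xid;
	else
		nulls[attno++] = true;

	if (remote_xid != 0)
		values[attno++] = (Datum) remote_xid;
	else
		nulls[attno++] = true;

	if (origin != NULL)
		values[attno++] = (Datum) origin;
	else
		nulls[attno++] = true;

	/* Every column post-incremented, so attno equals the column count. */
	assert(attno == DEMO_ATTR_NUM);
	return attno;
}
```

Because each branch advances `attno` exactly once, adding or dropping a column without updating the constant trips the assertion immediately.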
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-21T02:05:43Z
Thanks for addressing all my previous review comment of v4. Here are some more comments for the latest patch v5-0001. ====== GENERAL 1. There are still a couple of place remainig where this new table was not consistent called a "Conflict Log Table" (e.g. search for "history") e.g. Subject: [PATCH v5] Add configurable conflict log history table for Logical Replication e.g. + /* Insert conflict details to log history table. */ e.g. +-- CONFLICT LOG HISTORY TABLE TESTS ~~~ 2. Is automatically dropping the log tables always what the user might want to happen? Maybe someone want them lying around afterwards for later analysis -- I don't really know the answer; Just wondering if this is (a) good to be tidy or (b) bad to remove user flexibility. Or maybe the answer is leave if but make sure to add more documentation to say "if you are going to want to do some post analysis then be sure to copy this table data before it gets automatically dropped". ====== Commit message. 3. User-Defined Table: The conflict log is stored in a user-managed table rather than a system catalog. ~ I felt "User-defined" makes it sound like the user does CREATE TABLE themselves and has some control over the schema. Maybe say "User-Managed Table:" instead? ====== src/backend/commands/subscriptioncmds.c 4. #define SUBOPT_LSN 0x00010000 #define SUBOPT_ORIGIN 0x00020000 +#define SUBOPT_CONFLICT_LOG_TABLE 0x00040000 Whitespace alignment. ~~~ AlterSubscription: 5. + values[Anum_pg_subscription_subconflictlognspid - 1] = + ObjectIdGetDatum(nspid); + values[Anum_pg_subscription_subconflictlogtable - 1] = + CStringGetTextDatum(relname); + + replaces[Anum_pg_subscription_subconflictlognspid - 1] = true; + replaces[Anum_pg_subscription_subconflictlogtable - 1] = true; Something feels back-to-front, because if the same clt is being re-used (like the NOTICE part taht follows) then why do you need to reassign and say replaces[] = true here? ~~~ 6. 
+ /* + * If the subscription already has the conflict log table + * set to the exact same name and namespace currently being + * specified, and that table exists, just give notice and + * skip creation. + */ Is there a simpler way to say the same thing? SUGGESTION If the subscription already uses this conflict log table and it exists, just issue a notice. ~~~ 7. + ereport(NOTICE, + (errmsg("skipping table creation because \"%s.%s\" is already set as conflict log table", + nspname, relname))); I wasn't sure you need to say "skipping table creation because"... it seems kind of internal details. How about just: \"%s.%s\" is already in use as the conflict log table for this subscription ~~~ 8. + /* + * Drop the existing conflict log table if we are + * setting a new table. + */ The comment didn't feel right by implying there is something to drop. SUGGESTION Create the conflict log table after dropping any pre-existing one. ~~~ drop_conflict_log_table: 9. + /* Drop the conflict log table if it exist. */ typo: /exist./exists./ ====== src/backend/replication/logical/conflict.c 10. +static Datum +tuple_table_slot_to_json_datum(TupleTableSlot *slot) +{ + HeapTuple tuple = ExecCopySlotHeapTuple(slot); + Datum datum = heap_copy_tuple_as_datum(tuple, slot->tts_tupleDescriptor); + Datum json; + + if (TupIsNull(slot)) + return 0; + + json = DirectFunctionCall1(row_to_json, datum); + heap_freetuple(tuple); + + return json; +} Bug? Shouldn't that TupIsNull(slot) check *precede* using that slot for the tuple/datum assignments? ~~~ insert_conflict_log: 11. + Datum values[MAX_CONFLICT_ATTR_NUM]; + bool nulls[MAX_CONFLICT_ATTR_NUM]; + Oid nspid; + Oid relid; + Relation conflictrel = NULL; + int attno; + int options = HEAP_INSERT_NO_LOGICAL; + char *relname; + char *origin = NULL; + char *remote_origin = NULL; + HeapTuple tup; I felt some of these var names can be confusing: 11A. e.g. 
"conflictlogrel" (instead of 'conflictrel') would emphasise this is the rel of the log file, not the rel that encountered a conflict. ~ 11B. Similarly, maybe 'relname' could be 'conflictlogtable', which is also what it was called elsewhere. ~ 11C. AFAICT, the 'relid' is really the relid of the conflict log. So, maybe name it as it 'confliglogreid', otherwise it seems confusing when there is already parameter called 'rel' that is unrelated to thia 'relid'. ~~~ 12. + if (searchslot != NULL) + values[attno++] = tuple_table_slot_to_json_datum(searchslot); + else + nulls[attno++] = true; + + if (localslot != NULL) + values[attno++] = tuple_table_slot_to_json_datum(localslot); + else + nulls[attno++] = true; + + if (remoteslot != NULL) + values[attno++] = tuple_table_slot_to_json_datum(remoteslot); + else + nulls[attno++] = true; That function tuple_table_slot_to_json_datum() has potential to return 0. Is that something that needs checking, so you can assign nulls[] = true? ====== src/backend/replication/logical/worker.c 13. +char * +get_subscription_conflict_log_table(Oid subid, Oid *nspid) +{ + HeapTuple tup; + Datum datum; + bool isnull; + char *relname = NULL; + Form_pg_subscription subform; + + tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid)); + + if (!HeapTupleIsValid(tup)) + return NULL; + + subform = (Form_pg_subscription) GETSTRUCT(tup); + + /* Get conflict log table name. */ + datum = SysCacheGetAttr(SUBSCRIPTIONOID, + tup, + Anum_pg_subscription_subconflictlogtable, + &isnull); + if (!isnull) + { + *nspid = subform->subconflictlognspid; + relname = pstrdup(TextDatumGetCString(datum)); + } + + ReleaseSysCache(tup); + return relname; +} You could consider assigning *nspid = InvalidOid when 'isnull' is true, so then you don't have to rely on the caller pre-assigning a default sane value. YMMV. ====== src/bin/psql/tab-complete.in.c 14. 
- COMPLETE_WITH("binary", "connect", "copy_data", "create_slot", + COMPLETE_WITH("binary", "connect", "conflict_log_table", "copy_data", "create_slot", 'conflict_log_table' comes before 'connect' alphabetically. ====== src/test/regress/sql/subscription.sql 15. +-- ok - change the conlfict log table name for existing subscription already had old table +ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table = 'public.regress_conflict_log3'); +SELECT subname, subconflictlogtable, subconflictlognspid = (SELECT oid FROM pg_namespace WHERE nspname = 'public') AS is_public_schema +FROM pg_subscription WHERE subname = 'regress_conflict_test2'; + Typos in comment: - /conlfict/conflict/ - /for existing subscription already had old table/for an existing subscription that already had one/ ~~~ 16. +-- check new table should be created and old should be dropped SUGGESTION check the new table was created and the old table was dropped ~~~ 17. +-- ok (NOTICE) - try to set the conflict log table which is used by same subscription +ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table = 'public.regress_conflict_log3'); + +-- fail - try to use the conflict log table being used by some other subscription +ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table = 'public.regress_conflict_log1'); Make those 2 comments more alike: SUGGESTIONS -- ok (NOTICE) - set conflict_log_table to one already used by this subscription ... -- fail - set conflict_log_table to one already used by a different subscription ~~~ 18. Missing tests for describe \dRs+. e.g. there are already dozens of \dRs+ examples where there is no clt assigned, but I did not see any tests where the clt *is* assigned. ====== Kind Regards, Peter Smith. Fujitsu Australia -
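Editor's sketch for review items 10 and 12 above: a minimal model of testing the slot before touching it, and of reporting "no value" through an explicit flag so the caller can set its nulls[] entry. `DemoSlot`, `demo_slot_is_null` and `demo_slot_to_datum` are hypothetical stand-ins, not PostgreSQL APIs, and the real conversion to JSON is replaced by handing back a pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;		/* local stand-in, not PostgreSQL's Datum */

/* Hypothetical stand-in for a tuple slot: an "empty" flag plus a payload. */
typedef struct DemoSlot
{
	bool		empty;
	const char *json;
} DemoSlot;

/* True for a NULL pointer or an empty slot. */
static bool
demo_slot_is_null(const DemoSlot *slot)
{
	return slot == NULL || slot->empty;
}

/*
 * Convert a slot to a datum. The emptiness test runs before the slot is
 * dereferenced, and *isnull tells the caller whether to mark the column
 * as NULL instead of storing the returned value.
 */
static Datum
demo_slot_to_datum(const DemoSlot *slot, bool *isnull)
{
	if (demo_slot_is_null(slot))
	{
		*isnull = true;
		return (Datum) 0;
	}

	*isnull = false;
	return (Datum) slot->json;
}
```

With an out-parameter the caller never has to guess whether a returned 0 means "no value" or is a legitimate datum.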
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-24T06:21:40Z
On Thu, Nov 20, 2025 at 5:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > I was working on these pending items, there is something where I got stuck, I am exploring this more but would like to share the problem. > 2) Need to add replica identity tuple instead of full tuple - reported by Shveta I have worked on fixing this along with other comments by Peter. Now only the RI tuple is inserted as the key_tuple. IMHO let's keep the name as key tuple, since it will use the primary key or unique key if no explicit replica identity is set. Thoughts?

postgres[3048044]=# select * from myschema.conflict_log_history2;
 relid | schemaname | relname |     conflict_type     | local_xid | remote_xid | remote_commit_lsn |        local_commit_ts        |       remote_commit_ts        | local_origin | remote_origin | key_tuple |  local_tuple   |  remote_tuple
-------+------------+---------+-----------------------+-----------+------------+-------------------+-------------------------------+-------------------------------+--------------+---------------+-----------+----------------+----------------
 16385 | public     | test    | update_origin_differs |       765 |        759 | 0/0174F2E8        | 2025-11-24 06:16:50.468263+00 | 2025-11-24 06:16:55.483507+00 |              | pg_16396      | {"a":1}   | {"a":1,"b":10} | {"a":1,"b":20}

Now pending work status:
1) Fix review comments of 0002 and 0003 - Pending
2) Need to add replica identity tuple instead of full tuple -- Done
3) Keeping the logs in case of outer transaction failure by moving log insertion outside the main transaction - reported by Shveta - Pending
4) Run pgindent -- planning to do it after we complete the first level of review - Pending
5) Subscription test cases for logging the actual conflicts - Pending

-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-25T03:33:18Z
Hi Dilip. Here are a couple of review comments for v6-0001. ====== GENERAL. 1. Firstly, here is one of my "what if" ideas... The current patch is described as making a "structured, queryable record of all logical replication conflicts". What if we go bigger than that? What if this were made a more generic "structured, queryable record of logical replication activity"? AFAIK, there don't have to be too many logic changes to achieve this. e.g. I'm imagining it is mostly: * Rename the subscription parameter "conflict_log_table" to "log_table" or similar. * Remove/modify the "conflict_" name part from many of the variables and function names. * Add another 'type' column to the log table -- e.g. everything this patch writes can be type="CONFL", or type='c', or whatever. * Maybe tweak/add some of the other columns for more generic future use Anyway, it might be worth considering this now, before everything becomes set in stone with a conflict-only focus, making it too difficult to add more potential/unknown log table enhancements later. Thoughts? ====== src/backend/replication/logical/conflict.c 2. +#include "funcapi.h" +#include "funcapi.h" double include of the same header. ~~~ 3. +static Datum tuple_table_slot_to_ri_json_datum(EState *estate, + Relation localrel, + Oid replica_index, + TupleTableSlot *slot); + +static void insert_conflict_log(EState *estate, Relation rel, + TransactionId local_xid, + TimestampTz local_ts, + ConflictType conflict_type, + RepOriginId origin_id, + TupleTableSlot *searchslot, + TupleTableSlot *localslot, + TupleTableSlot *remoteslot); There were no spaces between any of the other static declarations, so why is this one different? ~~~ insert_conflict_log: insert_conflict_log: 4. 
+#define MAX_CONFLICT_ATTR_NUM 15 + Datum values[MAX_CONFLICT_ATTR_NUM]; + bool nulls[MAX_CONFLICT_ATTR_NUM]; + Oid nspid; + Oid confliglogreid; + Relation conflictlogrel = NULL; + int attno; + int options = HEAP_INSERT_NO_LOGICAL; + char *conflictlogtable; + char *origin = NULL; + char *remote_origin = NULL; + HeapTuple tup; Typo: Oops. Looks like that typo originated from my previous review comment, and you took it as-is. /confliglogreid/confliglogrelid/ ~~~ 5. + if (searchslot != NULL && !TupIsNull(searchslot)) { - tableslot = table_slot_create(localrel, &estate->es_tupleTable); - tableslot = ExecCopySlot(tableslot, slot); + Oid replica_index = GetRelationIdentityOrPK(rel); + + /* + * If the table has a valid replica identity index, build the index + * json datum from key value. Otherwise, construct it from the complete + * tuple in REPLICA IDENTITY FULL cases. + */ + if (OidIsValid(replica_index)) + values[attno++] = tuple_table_slot_to_ri_json_datum(estate, rel, + replica_index, + searchslot); + else + values[attno++] = tuple_table_slot_to_json_datum(searchslot); } + else + nulls[attno++] = true; - /* - * Initialize ecxt_scantuple for potential use in FormIndexDatum when - * index expressions are present. - */ - GetPerTupleExprContext(estate)->ecxt_scantuple = tableslot; + if (localslot != NULL && !TupIsNull(localslot)) + values[attno++] = tuple_table_slot_to_json_datum(localslot); + else + nulls[attno++] = true; - /* - * The values/nulls arrays passed to BuildIndexValueDescription should be - * the results of FormIndexDatum, which are the "raw" input to the index - * AM. - */ - FormIndexDatum(BuildIndexInfo(indexDesc), tableslot, estate, values, isnull); + if (remoteslot != NULL && !TupIsNull(remoteslot)) + values[attno++] = tuple_table_slot_to_json_datum(remoteslot); + else + nulls[attno++] = true; AFAIK, the TupIsNull() already includes the NULL check anyway, so you don't need to double up those. 
I saw at least 3 conditions above where the code could be simpler. e.g. BEFORE + if (remoteslot != NULL && !TupIsNull(remoteslot)) SUGGESTION if (!TupIsNull(remoteslot)) ====== Kind Regards, Peter Smith. Fujitsu Australia -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T08:29:08Z
On Tue, Nov 25, 2025 at 9:03 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Dilip. > > Here are a couple of review comments for v6-0001. > > ====== > GENERAL. > > 1. > Firstly, here is one of my "what if" ideas... > > The current patch is described as making a "structured, queryable > record of all logical replication conflicts". > > What if we go bigger than that? What if this were made a more generic > "structured, queryable record of logical replication activity"? > > AFAIK, there don't have to be too many logic changes to achieve this. > e.g. I'm imagining it is mostly: > > * Rename the subscription parameter "conflict_log_table" to > "log_table" or similar. > * Remove/modify the "conflict_" name part from many of the variables > and function names. > * Add another 'type' column to the log table -- e.g. everything this > patch writes can be type="CONFL", or type='c', or whatever. > * Maybe tweak/add some of the other columns for more generic future use > > Anyway, it might be worth considering this now, before everything > becomes set in stone with a conflict-only focus, making it too > difficult to add more potential/unknown log table enhancements later. > > Thoughts? Yeah, that's an interesting thought for sure, but honestly I believe that a conflict log table storing only conflict and conflict-resolution data is the standard followed across databases that provide an active-active setup, e.g. Oracle GoldenGate, BDR, pgactive. So IMHO, to keep the feature clean and focused, we should follow the same. I will work on the other review comments and post the patch soon. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T10:36:22Z
On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > On a separate note, I've been considering how to manage conflict log insertions when an error causes the outer transaction to abort, which seems to be non-trivial. Here is what I have in mind: ====================== First, prepare_conflict_log() would be executed from ReportApplyConflict(). This function would handle all preliminary work, such as preparing the tuple for the conflict log table. Second, insert_conflict_log() would be executed. If the error level in ReportApplyConflict() is LOG, the insertion would occur directly. Otherwise, the log information would be stored in a global variable and inserted in a separate transaction once we exit start_apply() due to the error. @shveta malik @Amit Kapila let me know what you think? Or do you think it can be simplified? -- Regards, Dilip Kumar Google
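Editor's sketch of the two-step idea described above, as a toy model rather than PostgreSQL code: a row is written immediately when the severity is below error level, and otherwise parked in a global and written later, after the failed transaction is gone. All names here (`DEMO_LOG`, `DEMO_ERROR`, `demo_report_conflict`, `demo_flush_pending`) are hypothetical, the "table" is an in-memory array, and the severity values are arbitrary; only their ordering matters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical severity levels; only their relative order is used. */
enum { DEMO_LOG = 1, DEMO_ERROR = 2 };

#define DEMO_MAX_ROWS 8
#define DEMO_ROW_LEN  64

/* In-memory stand-in for the conflict log table. */
static char demo_table[DEMO_MAX_ROWS][DEMO_ROW_LEN];
static int	demo_nrows = 0;

/* Stand-in for the global holding a prepared but unwritten log row. */
static char demo_pending[DEMO_ROW_LEN];
static bool demo_has_pending = false;

/* Append one row to the stand-in table (silently drops rows when full). */
static void
demo_insert(const char *row)
{
	if (demo_nrows < DEMO_MAX_ROWS)
	{
		strncpy(demo_table[demo_nrows], row, DEMO_ROW_LEN - 1);
		demo_table[demo_nrows][DEMO_ROW_LEN - 1] = '\0';
		demo_nrows++;
	}
}

/* Called where the conflict is detected. */
static void
demo_report_conflict(int elevel, const char *row)
{
	if (elevel < DEMO_ERROR)
		demo_insert(row);		/* surrounding transaction will commit it */
	else
	{
		/* The transaction is about to abort, so park the row. */
		strncpy(demo_pending, row, DEMO_ROW_LEN - 1);
		demo_pending[DEMO_ROW_LEN - 1] = '\0';
		demo_has_pending = true;
	}
}

/* Called after the failed transaction has been cleaned up. */
static void
demo_flush_pending(void)
{
	if (!demo_has_pending)
		return;

	demo_insert(demo_pending);
	demo_has_pending = false;
}
```

The model leaves out everything that makes the real case hard (memory contexts, starting a fresh transaction, reopening the table); it only shows the control flow of "insert now" versus "insert after abort".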
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-25T14:34:50Z
On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On a separate note, I've been considering how to manage conflict log > insertions when an error causes the outer transaction to abort, which > seems to be a non-trivial. > > Here is what I have in mind: > ====================== > First, prepare_conflict_log() would be executed from > ReportApplyConflict(). This function would handle all preliminary > work, such as preparing the tuple for the conflict log table. Second, > insert_conflict_log() would be executed. If the error level in > ReportApplyConflict() is LOG, the insertion would occur directly. > Otherwise, the log information would be stored in a global variable > and inserted in a separate transaction once we exit start_apply() due > to the error. > > @shveta malik @Amit Kapila let me know what you think? Or do you > think it can be simplified? While digging more into this I am wondering why CT_MULTIPLE_UNIQUE_CONFLICTS is reported as an error and all other conflicts as LOG? -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-26T08:35:48Z
On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On a separate note, I've been considering how to manage conflict log > insertions when an error causes the outer transaction to abort, which > seems to be a non-trivial. > > Here is what I have in mind: > ====================== > First, prepare_conflict_log() would be executed from > ReportApplyConflict(). This function would handle all preliminary > work, such as preparing the tuple for the conflict log table. Second, > insert_conflict_log() would be executed. If the error level in > ReportApplyConflict() is LOG, the insertion would occur directly. > Otherwise, the log information would be stored in a global variable > and inserted in a separate transaction once we exit start_apply() due > to the error. > > @shveta malik @Amit Kapila let me know what you think? Or do you > think it can be simplified? > I could not think of a better way. This idea works for me. I had doubts about whether it would be okay to start a new transaction in a catch block (if we plan to do it in start_apply's), but then I found a few other functions doing it (see do_autovacuum, perform_work_item, _SPI_commit). So IMO, we should be good. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-26T10:45:27Z
On Wed, Nov 26, 2025 at 2:05 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > On a separate note, I've been considering how to manage conflict log > > insertions when an error causes the outer transaction to abort, which > > seems to be a non-trivial. > > > > Here is what I have in mind: > > ====================== > > First, prepare_conflict_log() would be executed from > > ReportApplyConflict(). This function would handle all preliminary > > work, such as preparing the tuple for the conflict log table. Second, > > insert_conflict_log() would be executed. If the error level in > > ReportApplyConflict() is LOG, the insertion would occur directly. > > Otherwise, the log information would be stored in a global variable > > and inserted in a separate transaction once we exit start_apply() due > > to the error. > > > > @shveta malik @Amit Kapila let me know what you think? Or do you > > think it can be simplified? > > > > I could not think of a better way. This idea works for me. I had > doubts if it will be okay to start a new transaction in catch-block > (if we plan to do it in start_apply's), but then I found few other > functions doing it (see do_autovacuum, perform_work_item, > _SPI_commit). So IMO, we should be good. > On re-reading, I think you were not suggesting handling it in the CATCH block. Where exactly would the insertion happen once we exit start_apply()? Since the situation arises only in case of ERROR, I thought handling it in the catch block could be one option. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-26T11:18:50Z
On Wed, Nov 26, 2025 at 4:15 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Nov 26, 2025 at 2:05 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Nov 25, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Nov 25, 2025 at 1:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On a separate note, I've been considering how to manage conflict log > > > insertions when an error causes the outer transaction to abort, which > > > seems to be a non-trivial. > > > > > > Here is what I have in mind: > > > ====================== > > > First, prepare_conflict_log() would be executed from > > > ReportApplyConflict(). This function would handle all preliminary > > > work, such as preparing the tuple for the conflict log table. Second, > > > insert_conflict_log() would be executed. If the error level in > > > ReportApplyConflict() is LOG, the insertion would occur directly. > > > Otherwise, the log information would be stored in a global variable > > > and inserted in a separate transaction once we exit start_apply() due > > > to the error. > > > > > > @shveta malik @Amit Kapila let me know what you think? Or do you > > > think it can be simplified? > > > > > > > I could not think of a better way. This idea works for me. I had > > doubts if it will be okay to start a new transaction in catch-block > > (if we plan to do it in start_apply's), but then I found few other > > functions doing it (see do_autovacuum, perform_work_item, > > _SPI_commit). So IMO, we should be good. > > > > On re-reading, I think you were not suggesting to handle it in the > CATCH block. Where exactly once we exit start_apply? > But since the situation will arise only in case of ERROR, I thought > handling in catch-block could be one option. Yeah it makes sense to handle in catch block, I have done that in the attached patch and also handled other comments by Peter. 
Now pending work status:
1) Fix review comments of 0002 and 0003 - Pending
2) Need to add replica identity tuple instead of full tuple -- Done
3) Keeping the logs in case of outer transaction failure by moving log insertion outside the main transaction - reported by Shveta - Done (might need more validation and testing)
4) Run pgindent -- planning to do it after we complete the first level of review - Pending
5) Subscription test cases for logging the actual conflicts - Pending

-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-27T00:59:59Z
Hi Dilip. Some review comments for v7-0001. ====== src/backend/replication/logical/conflict.c 1. + /* Insert conflict details to conflict log table. */ + if (conflictlogrel) + { + /* + * Prepare the conflict log tuple. If the error level is below + * ERROR, insert it immediately. Otherwise, defer the insertion to + * a new transaction after the current one aborts, ensuring the log + * tuple is not rolled back. + */ + conflictlogtuple = prepare_conflict_log_tuple(estate, + relinfo->ri_RelationDesc, + conflictlogrel, + conflicttuple->xmin, + conflicttuple->ts, type, + conflicttuple->origin, + searchslot, conflicttuple->slot, + remoteslot); + if (elevel < ERROR) + { + InsertConflictLogTuple(conflictlogrel, conflictlogtuple); + heap_freetuple(conflictlogtuple); + } + else + MyLogicalRepWorker->conflict_log_tuple = conflictlogtuple; + + table_close(conflictlogrel, AccessExclusiveLock); + } + } + IMO, some refactoring would help simplify conflictlogtuple processing. e.g. i) You don't need any separate 'conflictlogtuple' var - Use MyLogicalRepWorker->conflict_log_tuple always for this purpose ii) prepare_conflict_log_tuple() - Change this to a void; it will always side-effect MyLogicalRepWorker->conflict_log_tuple - Assert MyLogicalRepWorker->conflict_log_tuple must be NULL on entry iii) InsertConflictLogTuple() - The 2nd param it not needed if you always use MyLogicalRepWorker->conflict_log_tuple - Asserts MyLogicalRepWorker->conflict_log_tuple is not NULL, then writes it - BTW, I felt that heap_freetuple could also be done here too - Finally, sets to MyLogicalRepWorker->conflict_log_tuple to NULL (ready for the next conflict) ~~~ InsertConflictLogTuple: 2. +/* + * InsertConflictLogTuple + * + * Persistently records the input conflict log tuple into the conflict log + * table. It uses HEAP_INSERT_NO_LOGICAL to explicitly block logical decoding + * of the tuple inserted into the conflict log table. 
+ */ +void +InsertConflictLogTuple(Relation conflictlogrel, HeapTuple tup) +{ + int options = HEAP_INSERT_NO_LOGICAL; + + heap_insert(conflictlogrel, tup, GetCurrentCommandId(true), options, NULL); +} See the above review comment (iii), for some suggested changes to this function. ~~~ prepare_conflict_log_tuple: 3. + * The caller is responsible for explicitly freeing the returned heap tuple + * after inserting. + */ +static HeapTuple +prepare_conflict_log_tuple(EState *estate, Relation rel, As per the above review comment (iii), I thought the Insert function could handle the freeing. ~~~ 4. + oldctx = MemoryContextSwitchTo(ApplyContext); + tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls); + MemoryContextSwitchTo(oldctx); - return index_value; + return tup; Per the above comment (ii), change this to assign to MyLogicalRepWorker->conflict_log_tuple. ====== src/backend/replication/logical/worker.c start_apply: 5. + /* + * Insert the pending conflict log tuple under a new transaction. + */ /Insert the/Insert any/ ~~~ 6. + InsertConflictLogTuple(conflictlogrel, + MyLogicalRepWorker->conflict_log_tuple); + heap_freetuple(MyLogicalRepWorker->conflict_log_tuple); + MyLogicalRepWorker->conflict_log_tuple = NULL; Per earlier review comment (iii), remove the 2nd param to InsertConflictLogTuple, and those other 2 statements can also be handled within InsertConflictLogTuple. ====== src/include/replication/worker_internal.h 7. + /* Store conflict log tuple to be inserted before worker exit. */ + HeapTuple conflict_log_tuple; + Per my above suggestions, this member comment becomes something more like "A conflict log tuple which is prepared but not yet written." ====== Kind Regards, Peter Smith. Fujitsu Australia -
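Editor's sketch of the ownership contract suggested in review comment 1 (i)-(iii) above, as a toy model and not the patch's code: one pending field holds the prepared tuple, the prepare step asserts the field is empty and fills it, and the insert step takes no tuple argument, asserts the field is set, writes, frees and resets it. `demo_pending_tuple`, `demo_prepare_tuple` and `demo_insert_tuple` are hypothetical names, the tuple is a plain heap-allocated string, and the write is reduced to a counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the worker field that holds the prepared tuple. */
static char *demo_pending_tuple = NULL;

/* Counts "rows written"; stands in for the actual table insert. */
static int	demo_rows_written = 0;

/*
 * (ii) Prepare: returns nothing and always fills the pending field.
 * The field must be empty on entry.
 */
static void
demo_prepare_tuple(const char *payload)
{
	size_t		len = strlen(payload) + 1;

	assert(demo_pending_tuple == NULL);

	demo_pending_tuple = malloc(len);
	if (demo_pending_tuple != NULL)
		memcpy(demo_pending_tuple, payload, len);
}

/*
 * (iii) Insert: no tuple parameter. Requires a prepared tuple, writes it,
 * frees it, and resets the field so the next conflict starts clean.
 */
static void
demo_insert_tuple(void)
{
	assert(demo_pending_tuple != NULL);

	demo_rows_written++;		/* the real code would insert here */

	free(demo_pending_tuple);
	demo_pending_tuple = NULL;
}
```

Keeping allocation, write, free and reset behind two functions means both call sites (the immediate path and the deferred path) reduce to a single call, and a forgotten reset shows up as an assertion failure on the next prepare.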
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-11-27T12:20:00Z
On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Dilip. Some review comments for v7-0001. > > ====== > src/backend/replication/logical/conflict.c > > 1. > + /* Insert conflict details to conflict log table. */ > + if (conflictlogrel) > + { > + /* > + * Prepare the conflict log tuple. If the error level is below > + * ERROR, insert it immediately. Otherwise, defer the insertion to > + * a new transaction after the current one aborts, ensuring the log > + * tuple is not rolled back. > + */ > + conflictlogtuple = prepare_conflict_log_tuple(estate, > + relinfo->ri_RelationDesc, > + conflictlogrel, > + conflicttuple->xmin, > + conflicttuple->ts, type, > + conflicttuple->origin, > + searchslot, conflicttuple->slot, > + remoteslot); > + if (elevel < ERROR) > + { > + InsertConflictLogTuple(conflictlogrel, conflictlogtuple); > + heap_freetuple(conflictlogtuple); > + } > + else > + MyLogicalRepWorker->conflict_log_tuple = conflictlogtuple; > + > + table_close(conflictlogrel, AccessExclusiveLock); > + } > + } > + > > IMO, some refactoring would help simplify conflictlogtuple processing. e.g. > > i) You don't need any separate 'conflictlogtuple' var > - Use MyLogicalRepWorker->conflict_log_tuple always for this purpose > ii) prepare_conflict_log_tuple() > - Change this to a void; it will always side-effect > MyLogicalRepWorker->conflict_log_tuple > - Assert MyLogicalRepWorker->conflict_log_tuple must be NULL on entry > iii) InsertConflictLogTuple() > - The 2nd param it not needed if you always use > MyLogicalRepWorker->conflict_log_tuple > - Asserts MyLogicalRepWorker->conflict_log_tuple is not NULL, then writes it > - BTW, I felt that heap_freetuple could also be done here too > - Finally, sets to MyLogicalRepWorker->conflict_log_tuple to NULL > (ready for the next conflict) > > ~~~ > > InsertConflictLogTuple: > > 2. 
> +/* > + * InsertConflictLogTuple > + * > + * Persistently records the input conflict log tuple into the conflict log > + * table. It uses HEAP_INSERT_NO_LOGICAL to explicitly block logical decoding > + * of the tuple inserted into the conflict log table. > + */ > +void > +InsertConflictLogTuple(Relation conflictlogrel, HeapTuple tup) > +{ > + int options = HEAP_INSERT_NO_LOGICAL; > + > + heap_insert(conflictlogrel, tup, GetCurrentCommandId(true), options, NULL); > +} > > See the above review comment (iii), for some suggested changes to this function. > > ~~~ > > prepare_conflict_log_tuple: > > 3. > + * The caller is responsible for explicitly freeing the returned heap tuple > + * after inserting. > + */ > +static HeapTuple > +prepare_conflict_log_tuple(EState *estate, Relation rel, > > As per the above review comment (iii), I thought the Insert function > could handle the freeing. > > ~~~ > > 4. > + oldctx = MemoryContextSwitchTo(ApplyContext); > + tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls); > + MemoryContextSwitchTo(oldctx); > > - return index_value; > + return tup; > > Per the above comment (ii), change this to assign to > MyLogicalRepWorker->conflict_log_tuple. > > ====== > src/backend/replication/logical/worker.c > > start_apply: > > 5. > + /* > + * Insert the pending conflict log tuple under a new transaction. > + */ > > /Insert the/Insert any/ > > ~~~ > > 6. > + InsertConflictLogTuple(conflictlogrel, > + MyLogicalRepWorker->conflict_log_tuple); > + heap_freetuple(MyLogicalRepWorker->conflict_log_tuple); > + MyLogicalRepWorker->conflict_log_tuple = NULL; > > Per earlier reqview comment (iii), remove the 2nd parm to > InsertConflictLogTuple, and those other 2 statements can also be > handled within InsertConflictLogTuple. > > ====== > src/include/replication/worker_internal.h > > 7. > + /* Store conflict log tuple to be inserted before worker exit. 
*/ > + HeapTuple conflict_log_tuple; > + > > Per my above suggestions, this member comment becomes something more > like "A conflict log tuple which is prepared but not yet written. */ > I have fixed all these comments and also the comments of 0002, now I feel we can actually merge 0001 and 0002, so I have merged both of them. Now pending work status 1) fixed review comments of 0003 2) Run pgindent -- planning to do it after we complete the first level of review 3) Subscription TAP test for logging the actual conflicts -- Regards, Dilip Kumar Google -
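A minimal sketch of the refactoring proposed in review comments (i) to (iii) above, assuming mock types in place of HeapTuple and MyLogicalRepWorker. The names MockTuple, MockWorker and the mock_* functions are hypothetical and are not taken from the patch; the sketch only shows the ownership rule that the prepare step fills the worker field and the insert step writes, frees and resets it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a HeapTuple. */
typedef struct MockTuple
{
    char payload[64];
} MockTuple;

/* Hypothetical stand-in for the worker state that owns the pending tuple. */
typedef struct MockWorker
{
    MockTuple *conflict_log_tuple;  /* prepared but not yet written */
    int        rows_written;        /* stands in for the conflict log table */
} MockWorker;

/* (ii) Returns nothing; the result always lands in the worker field. */
static void
mock_prepare_conflict_log_tuple(MockWorker *worker, const char *details)
{
    MockTuple *tup;

    /* No earlier tuple may still be pending on entry. */
    assert(worker->conflict_log_tuple == NULL);

    tup = calloc(1, sizeof(MockTuple));
    assert(tup != NULL);
    strncpy(tup->payload, details, sizeof(tup->payload) - 1);
    worker->conflict_log_tuple = tup;
}

/* (iii) No tuple parameter: write the pending tuple, free it, reset the field. */
static void
mock_insert_conflict_log_tuple(MockWorker *worker)
{
    assert(worker->conflict_log_tuple != NULL);

    worker->rows_written++;             /* the real code would insert into the table */
    free(worker->conflict_log_tuple);   /* the real code would free the heap tuple */
    worker->conflict_log_tuple = NULL;  /* ready for the next conflict */
}

/* Caller: insert immediately below ERROR, otherwise leave the tuple pending. */
static void
mock_report_conflict(MockWorker *worker, const char *details, int is_error)
{
    mock_prepare_conflict_log_tuple(worker, details);
    if (!is_error)
        mock_insert_conflict_log_tuple(worker);
}
```

With this shape the deferred path only has to call the insert function once the new transaction has started, and the entry assertion in the prepare step catches a tuple that was left pending.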
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-11-28T00:35:59Z
Hi Dilip. Some review comments for v8-0001. ====== Commit message 1. When the patches 0001 and 0002 got merged, I think the commit message should have been updated also to say something along the lines of: When ALL TABLES or ALL TABLES IN SCHEMA is used with publication won't publish the clt. ====== src/backend/catalog/pg_publication.c check_publication_add_relation: 2. + /* Can't be conflict log table */ + if (IsConflictLogRelid(RelationGetRelid(targetrel))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add relation \"%s\" to publication", + RelationGetRelationName(targetrel)), + errdetail("This operation is not supported for conflict log tables."))); Should it also show the schema name of the clt in the message? ====== src/backend/commands/subscriptioncmds.c 3. +/* + * Check if the specified relation is used as a conflict log table by any + * subscription. + */ +bool +IsConflictLogRelid(Oid relid) Most places refer to the clt. Wondering if this function ought to be called 'IsConflictLogTable'. ====== src/backend/replication/logical/conflict.c InsertConflictLogTuple: 4. + /* A valid tuple must be prepared and store into MyLogicalRepWorker. */ typo: /store into/stored in/ ~~~ prepare_conflict_log_tuple: 5. - index_close(indexDesc, NoLock); + oldctx = MemoryContextSwitchTo(ApplyContext); + tup = heap_form_tuple(RelationGetDescr(conflictlogrel), values, nulls); + MemoryContextSwitchTo(oldctx); - return index_value; + /* Store conflict_log_tuple into the worker slot for inserting it later. */ + MyLogicalRepWorker->conflict_log_tuple = tup; 5a. I don't think you need the 'tup' variable. Just assign directly to MyLogicalRepWorker->conflict_log_tuple. ~ 5b. "worker slot" -- I don't think this is a "slot". ====== src/backend/replication/logical/worker.c 6. + /* Open conflict log table. 
*/ + conflictlogrel = GetConflictLogTableRel(); + InsertConflictLogTuple(conflictlogrel); + MyLogicalRepWorker->conflict_log_tuple = NULL; + table_close(conflictlogrel, AccessExclusiveLock); Maybe that comment should say: /* Open conflict log table and write the tuple. */ ====== src/include/replication/conflict.h 7. + /* A conflict log tuple which is prepared but not yet inserted. */ + HeapTuple conflict_log_tuple; + typo: /which/that/ (sorry, this one is my bad from a previous review comment) ====== src/test/regress/expected/subscription.out 8. +-- ok - change the conflict log table name for an existing subscription that already had one +CREATE SCHEMA clt; +ALTER SUBSCRIPTION regress_conflict_test2 SET (conflict_log_table = 'clt.regress_conflict_log3'); +SELECT subname, subconflictlogtable, subconflictlognspid = (SELECT oid FROM pg_namespace WHERE nspname = 'public') AS is_public_schema +FROM pg_subscription WHERE subname = 'regress_conflict_test2'; + subname | subconflictlogtable | is_public_schema +------------------------+-----------------------+------------------ + regress_conflict_test2 | regress_conflict_log3 | f +(1 row) + +\dRs+ + List of subscriptions + Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? 
| Failover | Retain dead tuples | Max retention duration | Retention active | Synchronous commit | Conninfo | Skip LSN | Conflict log table +------------------------+---------------------------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------------------+------------------------+------------------+--------------------+-----------------------------+------------+----------------------- + regress_conflict_test1 | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | f | 0 | f | off | dbname=regress_doesnotexist | 0/00000000 | regress_conflict_log1 + regress_conflict_test2 | regress_subscription_user | f | {testpub} | f | parallel | d | f | any | t | f | f | f | 0 | f | off | dbname=regress_doesnotexist | 0/00000000 | regress_conflict_log3 +(2 rows) ~ After going to the trouble of specifying the CLT on a different schema, that information is lost by the \dRs+. How about also showing the CLT schema name (at least when it is not "public") in the \dRs+ output. ~~~ 9. +-- ok - conflict_log_table should not be published with ALL TABLE +CREATE PUBLICATION pub FOR TABLES IN SCHEMA clt; +SELECT * FROM pg_publication_tables WHERE pubname = 'pub'; + pubname | schemaname | tablename | attnames | rowfilter +---------+------------+-----------+----------+----------- +(0 rows) Perhaps you should repeat this same test but using FOR ALL TABLES, instead of only FOR TABLES IN SCHEMA ====== src/test/regress/sql/subscription.sql 10. In one of the tests, you could call the function pg_relation_is_publishable(clt) to verify that it returns false. ====== Kind Regards, Peter Smith. Fujitsu Australia -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-11-28T06:54:27Z
On Thu, 27 Nov 2025 at 17:50, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote: > > I have fixed all these comments and also the comments of 0002, now I > feel we can actually merge 0001 and 0002, so I have merged both of > them. I just started to have a look at the patch, while using I found lock level used is not correct: I felt the reason is that table is opened with RowExclusiveLock but closed in AccessExclusiveLock: + /* If conflict log table is not set for the subscription just return. */ + conflictlogtable = get_subscription_conflict_log_table( + MyLogicalRepWorker->subid, &nspid); + if (conflictlogtable == NULL) + { + pfree(conflictlogtable); + return NULL; + } + + conflictlogrelid = get_relname_relid(conflictlogtable, nspid); + if (OidIsValid(conflictlogrelid)) + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); .... + if (elevel < ERROR) + InsertConflictLogTuple(conflictlogrel); + + table_close(conflictlogrel, AccessExclusiveLock); .... 2025-11-28 12:17:55.631 IST [504133] WARNING: you don't own a lock of type AccessExclusiveLock 2025-11-28 12:17:55.631 IST [504133] CONTEXT: processing remote data for replication origin "pg_16402" during message type "INSERT" for replication target relation "public.t1" in transaction 761, finished at 0/01789AB8 2025-11-28 12:17:58.033 IST [504133] WARNING: you don't own a lock of type AccessExclusiveLock 2025-11-28 12:17:58.033 IST [504133] ERROR: conflict detected on relation "public.t1": conflict=insert_exists 2025-11-28 12:17:58.033 IST [504133] DETAIL: Key already exists in unique index "t1_pkey", modified in transaction 766. Key (c1)=(1); existing local row (1, 1); remote row (1, 1). 2025-11-28 12:17:58.033 IST [504133] CONTEXT: processing remote data for replication origin "pg_16402" during message type "INSERT" for replication target relation "public.t1" in transaction 761, finished at 0/01789AB8 Regards, Vignesh -
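The mismatch reported above can be modelled with a small sketch, assuming a per-mode hold counter in place of the real lock manager. MockLockMode, MockRelation, mock_table_open and mock_table_close are hypothetical names for illustration only. Releasing a mode that was never acquired fails, which corresponds to the "you don't own a lock" warning in the log, and the lock that was actually taken stays held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of lock modes; the values are illustrative only. */
typedef enum MockLockMode
{
    MOCK_NO_LOCK = 0,
    MOCK_ROW_EXCLUSIVE,
    MOCK_ACCESS_EXCLUSIVE,
    MOCK_NUM_MODES
} MockLockMode;

/* Hypothetical relation handle that tracks how often each mode is held. */
typedef struct MockRelation
{
    int held[MOCK_NUM_MODES];
} MockRelation;

/* Opening with a mode acquires one hold of that mode. */
static void
mock_table_open(MockRelation *rel, MockLockMode mode)
{
    if (mode != MOCK_NO_LOCK)
        rel->held[mode]++;
}

/*
 * Closing releases one hold of the given mode. Returns false when that mode
 * was never acquired, which models the warning case. Passing MOCK_NO_LOCK
 * releases nothing and keeps whatever is held.
 */
static bool
mock_table_close(MockRelation *rel, MockLockMode mode)
{
    if (mode == MOCK_NO_LOCK)
        return true;
    if (rel->held[mode] == 0)
        return false;
    rel->held[mode]--;
    return true;
}
```

In this model the fix is to close with the mode used at open time, or with the no-lock mode when the lock should be kept until the end of the transaction.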
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-11-28T09:01:59Z
On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > I have fixed all these comments and also the comments of 0002, now I > feel we can actually merge 0001 and 0002, so I have merged both of > them. > > Now pending work status > 1) fixed review comments of 0003 > 2) Run pgindent -- planning to do it after we complete the first level > of review > 3) Subscription TAP test for logging the actual conflicts > Thanks for the patch. A few observations: 1) It seems, as per LOG, 'key' and 'replica-identity' are different when it comes to insert_exists, update_exists and multiple_unique_conflicts, while I believe in CLT, key is replica-identity i.e. there are no 2 separate terms. Please see below: a) Update_Exists: 2025-11-28 14:08:56.179 IST [60383] ERROR: conflict detected on relation "public.tab1": conflict=update_exists 2025-11-28 14:08:56.179 IST [60383] DETAIL: Key already exists in unique index "tab1_pkey", modified locally in transaction 790 at 2025-11-28 14:07:17.578887+05:30. Key (i)=(40); existing local row (40, 10); remote row (40, 200); replica identity (i)=(20). postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt where conflict_type='update_exists'; conflict_type | key_tuple | local_tuple | remote_tuple ---------------+-----------+-----------------+------------------ update_exists | {"i":20} | {"i":40,"j":10} | {"i":40,"j":200} b) insert_Exists: ERROR: conflict detected on relation "public.tab1": conflict=insert_exists DETAIL: Key already exists in unique index "tab1_pkey", modified locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30. Key (i)=(30); existing local row (30, 10); remote row (30, 10). 
postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt; conflict_type | key_tuple | local_tuple | remote_tuple ----------------+-----------+-----------------+----------------- insert_exists | | {"i":30,"j":10} | {"i":30,"j":10} case a) has key_tuple same as replica-identity of LOG case b) does not have replica-identity and thus key_tuple is NULL. Does that mean we need to maintain both key_tuple and RI separately in CLT? Thoughts? 2) For multiple_unique_conflict (testcase is same as I shared earlier), it asserts here: CONTEXT: processing remote data for replication origin "pg_16390" during message type "INSERT" for replication target relation "public.conf_tab" in transaction 778, finished at 0/017E6DE8 TRAP: failed Assert("MyLogicalRepWorker->conflict_log_tuple == NULL"), File: "conflict.c", Line: 749, PID: 60627 I have not checked it, but maybe 'MyLogicalRepWorker->conflict_log_tuple' is left over from the previous few tests I tried? thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-11-28T12:20:15Z
On Tue, Nov 18, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > 3) > > > We also need to think how we are going to display the info in case of > > > multiple_unique_conflicts as there could be multiple local and remote > > > tuples conflicting for one single operation. Example: > > > > > > create table conf_tab (a int primary key, b int unique, c int unique); > > > > > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4); > > > > > > pub: insert into conf_tab values (2,3,4); > > > > > > ERROR: conflict detected on relation "public.conf_tab": > > > conflict=multiple_unique_conflicts > > > DETAIL: Key already exists in unique index "conf_tab_pkey", modified > > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4). > > > Key already exists in unique index "conf_tab_b_key", modified locally > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4). > > > Key already exists in unique index "conf_tab_c_key", modified locally > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4). > > > CONTEXT: processing remote data for replication origin "pg_16392" > > > during message type "INSERT" for replication target relation > > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0 > > > > > > Currently in clt, we have singular terms such as 'key_tuple', > > > 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? > > > But it does not look reasonable to have multiple rows inserted for a > > > single conflict raised. I will think more about this. 
> > > > Currently I am inserting multiple records in the conflict history > > table, the same as each tuple is logged, but couldn't find any better > > way for this. > > The biggest drawback of this approach is data bloat. The incoming data row will be stored multiple times. > > Another option is to use an array of tuples instead of a > > single tuple but not sure this might make things more complicated to > > process by any external tool. > > It’s arguable and hard to say what the correct behaviour should be. > I’m slightly leaning toward having a single row per conflict. > Yeah, it is better to either have a single row per conflict or have two tables conflict_history and conflict_history_details to avoid data bloat as pointed above. For example, two-table approach could be: 1. The Header Table (Incoming Data) This stores the data that tried to be applied. SQL CREATE TABLE conflict_header ( conflict_id SERIAL PRIMARY KEY, source_tx_id VARCHAR(100), -- Transaction ID from source table_name VARCHAR(100), operation CHAR(1), -- 'I' for Insert incoming_data JSONB, -- Store the incoming row as JSON ... ); 2. The Detail Table (Existing Conflicting Data) This stores the actual rows currently in the database that caused the violations. CREATE TABLE conflict_details ( detail_id SERIAL PRIMARY KEY, conflict_id INT REFERENCES conflict_header(conflict_id), constraint_name/key_tuple VARCHAR(100), conflicting_row_data JSONB -- The existing row in the DB that blocked the insert ); Please don't consider these exact columns; you can use something on the lines of what is proposed in the patch. This is just to show how the conflict data can be rearranged. Now, one argument against this is that users need to use JOIN to query data but still better than bloating the table. The idea to store in a single table could be changed to have columns like violated_constraints TEXT[], -- e.g., ['uk_email', 'uk_phone'], error_details JSONB -- e.g., [{"const": "uk_email", "val": "a@b.com"}, ...]. 
If we want to store multiple conflicting tuples in a single column, we need to ensure it is queryable via a JSONB column. The point in favour of a single JSONB column to combine multiple conflicting tuples is that we need this combination only for one kind of conflict. Both the approaches have their pros and cons. I feel we should dig a bit deeper for both by laying out details for each method and see what others think. -- With Regards, Amit Kapila. -
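The bloat argument above comes down to simple arithmetic, sketched here under stated assumptions: the byte counts are illustrative inputs, per-row overhead and compression are ignored, and both function names are hypothetical. With one row per violated unique key the incoming row is repeated for every violation, while a header/detail split or a single row holding an array stores it once.

```c
#include <assert.h>
#include <stddef.h>

/* One log row per violated unique key: the remote row is repeated each time. */
static size_t
bytes_row_per_violation(size_t nviolations, size_t remote_bytes,
                        size_t local_bytes)
{
    return nviolations * (remote_bytes + local_bytes);
}

/* Header/detail tables, or one row with an array: the remote row is stored once. */
static size_t
bytes_remote_stored_once(size_t nviolations, size_t remote_bytes,
                         size_t local_bytes)
{
    return remote_bytes + nviolations * local_bytes;
}
```

For the conf_tab example with three violated unique indexes and, say, 100 bytes per stored row, the first layout stores 600 bytes of row data and the second 400. For a conflict with a single violation the two layouts store the same amount.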
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T04:41:08Z
On Fri, Nov 28, 2025 at 5:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 18, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > 3) > > > > We also need to think how we are going to display the info in case of > > > > multiple_unique_conflicts as there could be multiple local and remote > > > > tuples conflicting for one single operation. Example: > > > > > > > > create table conf_tab (a int primary key, b int unique, c int unique); > > > > > > > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4); > > > > > > > > pub: insert into conf_tab values (2,3,4); > > > > > > > > ERROR: conflict detected on relation "public.conf_tab": > > > > conflict=multiple_unique_conflicts > > > > DETAIL: Key already exists in unique index "conf_tab_pkey", modified > > > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4). > > > > Key already exists in unique index "conf_tab_b_key", modified locally > > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4). > > > > Key already exists in unique index "conf_tab_c_key", modified locally > > > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4). > > > > CONTEXT: processing remote data for replication origin "pg_16392" > > > > during message type "INSERT" for replication target relation > > > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0 > > > > > > > > Currently in clt, we have singular terms such as 'key_tuple', > > > > 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? 
> > > > But it does not look reasonable to have multiple rows inserted for a > > > > single conflict raised. I will think more about this. > > > > > > Currently I am inserting multiple records in the conflict history > > > table, the same as each tuple is logged, but couldn't find any better > > > way for this. > > > > > The biggest drawback of this approach is data bloat. The incoming data > row will be stored multiple times. > > > > Another option is to use an array of tuples instead of a > > > single tuple but not sure this might make things more complicated to > > > process by any external tool. > > > > It’s arguable and hard to say what the correct behaviour should be. > > I’m slightly leaning toward having a single row per conflict. > > > > Yeah, it is better to either have a single row per conflict or have > two tables conflict_history and conflict_history_details to avoid data > bloat as pointed above. For example, two-table approach could be: > > 1. The Header Table (Incoming Data) > This stores the data that tried to be applied. > SQL > CREATE TABLE conflict_header ( > conflict_id SERIAL PRIMARY KEY, > source_tx_id VARCHAR(100), -- Transaction ID from source > table_name VARCHAR(100), > operation CHAR(1), -- 'I' for Insert > incoming_data JSONB, -- Store the incoming row as JSON > ... > ); > > 2. The Detail Table (Existing Conflicting Data) > This stores the actual rows currently in the database that caused the > violations. > CREATE TABLE conflict_details ( > detail_id SERIAL PRIMARY KEY, > conflict_id INT REFERENCES conflict_header(conflict_id), > constraint_name/key_tuple VARCHAR(100), > conflicting_row_data JSONB -- The existing row in the DB > that blocked the insert > ); > > Please don't consider these exact columns; you can use something on > the lines of what is proposed in the patch. This is just to show how > the conflict data can be rearranged. 
Now, one argument against this is > that users need to use JOIN to query data but still better than > bloating the table. The idea to store in a single table could be > changed to have columns like violated_constraints TEXT[], -- > e.g., ['uk_email', 'uk_phone'], error_details JSONB -- e.g., > [{"const": "uk_email", "val": "a@b.com"}, ...]. If we want to store > multiple conflicting tuples in a single column, we need to ensure it > is queryable via a JSONB column. The point in favour of a single JSONB > column to combine multiple conflicting tuples is that we need this > combination only for one kind of conflict. > > Both the approaches have their pros and cons. I feel we should dig a > bit deeper for both by laying out details for each method and see what > others think. The specific scenario we are discussing, in which a single row from the publisher causes a conflict across multiple unique keys and each of those unique key violations conflicts with a different local row on the subscriber, is very rare. IMHO this low-frequency scenario does not justify overcomplicating the design with an array field or a multi-level table. Consider the infrequency of the root causes: - How often does a table have more than 3 to 4 unique keys? - How frequently would each of these keys conflict with a distinct row on the subscriber side? If resolving this occasional, synthetic conflict requires inserting two or three rows instead of a single one, that is an acceptable trade-off considering how rarely it can occur. Anyway, this is my opinion and I am open to opinions from others. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:18:14Z
On Fri, Nov 28, 2025 at 12:24 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 27 Nov 2025 at 17:50, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Nov 27, 2025 at 6:30 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > I have fixed all these comments and also the comments of 0002, now I > > feel we can actually merge 0001 and 0002, so I have merged both of > > them. > > I just started to have a look at the patch, while using I found lock > level used is not correct: > I felt the reason is that table is opened with RowExclusiveLock but > closed in AccessExclusiveLock: > > + /* If conflict log table is not set for the subscription just return. */ > + conflictlogtable = get_subscription_conflict_log_table( > + > MyLogicalRepWorker->subid, &nspid); > + if (conflictlogtable == NULL) > + { > + pfree(conflictlogtable); > + return NULL; > + } > + > + conflictlogrelid = get_relname_relid(conflictlogtable, nspid); > + if (OidIsValid(conflictlogrelid)) > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > .... > + if (elevel < ERROR) > + InsertConflictLogTuple(conflictlogrel); > + > + table_close(conflictlogrel, AccessExclusiveLock); > .... > > 2025-11-28 12:17:55.631 IST [504133] WARNING: you don't own a lock of > type AccessExclusiveLock > 2025-11-28 12:17:55.631 IST [504133] CONTEXT: processing remote data > for replication origin "pg_16402" during message type "INSERT" for > replication target relation "public.t1" in transaction 761, finished > at 0/01789AB8 > 2025-11-28 12:17:58.033 IST [504133] WARNING: you don't own a lock of > type AccessExclusiveLock > 2025-11-28 12:17:58.033 IST [504133] ERROR: conflict detected on > relation "public.t1": conflict=insert_exists > 2025-11-28 12:17:58.033 IST [504133] DETAIL: Key already exists in > unique index "t1_pkey", modified in transaction 766. > Key (c1)=(1); existing local row (1, 1); remote row (1, 1). 
> 2025-11-28 12:17:58.033 IST [504133] CONTEXT: processing remote data > for replication origin "pg_16402" during message type "INSERT" for > replication target relation "public.t1" in transaction 761, finished > at 0/01789AB8 Thanks, I will fix this. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:23:51Z
On Fri, Nov 28, 2025 at 2:32 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I have fixed all these comments and also the comments of 0002, now I > > feel we can actually merge 0001 and 0002, so I have merged both of > > them. > > > > Now pending work status > > 1) fixed review comments of 0003 > > 2) Run pgindent -- planning to do it after we complete the first level > > of review > > 3) Subscription TAP test for logging the actual conflicts > > > > Thanks for the patch. A few observations: > > 1) > It seems, as per LOG, 'key' and 'replica-identity' are different when > it comes to insert_exists, update_exists and > multiple_unique_conflicts, while I believe in CLT, key is > replica-identity i.e. there are no 2 separate terms. Please see below: > > a) > Update_Exists: > 2025-11-28 14:08:56.179 IST [60383] ERROR: conflict detected on > relation "public.tab1": conflict=update_exists > 2025-11-28 14:08:56.179 IST [60383] DETAIL: Key already exists in > unique index "tab1_pkey", modified locally in transaction 790 at > 2025-11-28 14:07:17.578887+05:30. > Key (i)=(40); existing local row (40, 10); remote row (40, 200); > replica identity (i)=(20). > > postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple > from clt where conflict_type='update_exists'; > conflict_type | key_tuple | local_tuple | remote_tuple > ---------------+-----------+-----------------+------------------ > update_exists | {"i":20} | {"i":40,"j":10} | {"i":40,"j":200} > > b) > insert_Exists: > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > DETAIL: Key already exists in unique index "tab1_pkey", modified > locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30. > Key (i)=(30); existing local row (30, 10); remote row (30, 10). 
> > postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt; > conflict_type | key_tuple | local_tuple | remote_tuple > ----------------+-----------+-----------------+----------------- > insert_exists | | {"i":30,"j":10} | {"i":30,"j":10} > > case a) has key_tuple same as replica-identity of LOG > case b) does not have replica-identity and thus key_tuple is NULL. > > Does that mean we need to maintain both key_tuple and RI separately in > CLT? Thoughts? Maybe we should then have a place for both key_tuple and replica identity, as the server log does. What do others think about this case? > 2) > For multiple_unique_conflict (testcase is same as I shared earlier), > it asserts here: > CONTEXT: processing remote data for replication origin "pg_16390" > during message type "INSERT" for replication target relation > "public.conf_tab" in transaction 778, finished at 0/017E6DE8 > TRAP: failed Assert("MyLogicalRepWorker->conflict_log_tuple == NULL"), > File: "conflict.c", Line: 749, PID: 60627 > > I have not checked it, but maybe > 'MyLogicalRepWorker->conflict_log_tuple' is left over from the > previous few tests I tried? Yeah, prepare_conflict_log_tuple() is called in a loop, and when there are multiple tuples we need to collect all of them before inserting them at worker exit, so the current code has a bug. I will see how we can fix it; I think this also depends on the other discussion we are having about how to insert multiple unique conflicts. -- Regards, Dilip Kumar Google -
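One way to picture the fix discussed above is a small queue of pending tuples that is flushed in one go, sketched here with integers standing in for prepared tuples. MockPending, MOCK_MAX_PENDING and the mock_* functions are hypothetical names; the fixed-size array is only a simplification and does not reflect how the patch stores tuples.

```c
#include <assert.h>

#define MOCK_MAX_PENDING 8  /* arbitrary bound chosen for the sketch */

/* Hypothetical holder for tuples prepared during one conflict report. */
typedef struct MockPending
{
    int tuples[MOCK_MAX_PENDING];  /* stand-ins for prepared tuples */
    int ntuples;                   /* how many are waiting to be written */
    int rows_written;              /* stands in for the conflict log table */
} MockPending;

/* Called once per conflicting local row; appends instead of overwriting. */
static void
mock_queue_conflict_tuple(MockPending *pending, int tuple)
{
    assert(pending->ntuples < MOCK_MAX_PENDING);
    pending->tuples[pending->ntuples++] = tuple;
}

/* Called later, in the new transaction; writes everything that is queued. */
static int
mock_flush_conflict_tuples(MockPending *pending)
{
    int flushed = pending->ntuples;

    pending->rows_written += flushed;
    pending->ntuples = 0;
    return flushed;
}
```

Appending in the loop removes the need for the single-tuple "must be NULL on entry" assertion, and a flush with nothing queued is a harmless no-op.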
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-01T08:27:40Z
On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > Few observations related to publication. > > > ------------------------------ > > Thanks Shveta, for testing and sharing your thoughts. IMHO for > conflict log tables it should be good enough if we restrict it when > ALL TABLE options are used, I don't think we need to put extra effort > to completely restrict it even if users want to explicitly list it > into the publication. > > > > > > > (In the below comments, clt/CLT implies Conflict Log Table) > > > > > > 1) > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table. > > This function is used while publishing every single change and I don't > think we want to add a cost to check each subscription to identify > whether the table is listed as CLT. > > > > 2) > > > '\d+ clt' shows all-tables publication name. I feel we should not > > > show that for clt. > > I think we should fix this. > > > > 3) > > > I am able to create a publication for clt table, should it be allowed? > > I believe we should not do any specific handling to restrict this but > I am open for the opinions. > > > > create subscription sub1 connection '...' publication pub1 > > > WITH(conflict_log_table='clt'); > > > create publication pub3 for table clt; > > > > > > 4) > > > Is there a reason we have not made '!IsConflictHistoryRelid' check as > > > part of is_publishable_class() itself? If we do so, other code-logics > > > will also get clt as non-publishable always (and will solve a few of > > > the above issues I think). IIUC, there is no place where we want to > > > mark CLT as publishable or is there any? > > IMHO the main reason is performance. > > > > 5) Also, I feel we can add some documentation now to help others to > > > understand/review the patch better without going through the long > > > thread. 
> > Make sense, I will do that in the next version. > > > > > > > Few observations related to conflict-logging: > > > ------------------------------ > > > 1) > > > I found that for the conflicts which ultimately result in Error, we do > > > not insert any conflict-record in clt. > > > > > > a) > > > Example: insert_exists, update_Exists > > > create table tab1 (i int primary key, j int); > > > sub: insert into tab1 values(30,10); > > > pub: insert into tab1 values(30,10); > > > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > > > No record in clt. > > > > > > sub: > > > <some pre-data needed> > > > update tab1 set i=40 where i = 30; > > > pub: update tab1 set i=40 where i = 20; > > > ERROR: conflict detected on relation "public.tab1": conflict=update_exists > > > No record in clt. > > Yeah that interesting need to put thought on how to commit this record > when an outer transaction is aborted as we do not have autonomous > transactions which are generally used for this kind of logging. But > we can explore more options like inserting into conflict log tables > outside the outer transaction. > > > > b) > > > Another question related to this is, since these conflicts (which > > > results in error) keep on happening until user resolves these or skips > > > these or 'disable_on_error' is set. Then are we going to insert these > > > multiple times? We do count these in 'confl_insert_exists' and > > > 'confl_update_exists' everytime, so it makes sense to log those each > > > time in clt as well. Thoughts? > > I think it make sense to insert every time we see the conflict, but it > would be good to have opinion from others as well. Since there is a concern that multiple rows for multiple_unique_conflicts can cause data bloat, it made me rethink this point: logging the same conflict again on every retry is actually more prone to causing data bloat if the conflict is not resolved in time, and it seems a far more frequent scenario.
So shall we keep inserting the record or insert it once and avoid inserting it again based on lsn? Thoughts? > > > > 2) > > > Conflicts where row on sub is missing, local_ts incorrectly inserted. > > > It is '2000-01-01 05:30:00+05:30'. Should it be Null or something > > > indicating that it is not applicable for this conflict-type? > > > > > > Example: delete_missing, update_missing > > > pub: > > > insert into tab1 values(10,10); > > > insert into tab1 values(20,10); > > > sub: delete from tab1 where i=10; > > > pub: delete from tab1 where i=10; > > Sure I will test this. > > > > > 3) > > We also need to think how we are going to display the info in case of > > multiple_unique_conflicts as there could be multiple local and remote > > tuples conflicting for one single operation. Example: > > > > create table conf_tab (a int primary key, b int unique, c int unique); > > > > sub: insert into conf_tab values (2,2,2), (3,3,3), (4,4,4); > > > > pub: insert into conf_tab values (2,3,4); > > > > ERROR: conflict detected on relation "public.conf_tab": > > conflict=multiple_unique_conflicts > > DETAIL: Key already exists in unique index "conf_tab_pkey", modified > > locally in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (a)=(2); existing local row (2, 2, 2); remote row (2, 3, 4). > > Key already exists in unique index "conf_tab_b_key", modified locally > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (b)=(3); existing local row (3, 3, 3); remote row (2, 3, 4). > > Key already exists in unique index "conf_tab_c_key", modified locally > > in transaction 874 at 2025-11-12 14:35:13.452143+05:30. > > Key (c)=(4); existing local row (4, 4, 4); remote row (2, 3, 4). 
> > CONTEXT: processing remote data for replication origin "pg_16392" > > during message type "INSERT" for replication target relation > > "public.conf_tab" in transaction 781, finished at 0/017FDDA0 > > > > Currently in clt, we have singular terms such as 'key_tuple', > > 'local_tuple', 'remote_tuple'. Shall we have multiple rows inserted? > > But it does not look reasonable to have multiple rows inserted for a > > single conflict raised. I will think more about this. > > Currently I am inserting multiple records in the conflict history > table, the same as each tuple is logged, but couldn't find any better > way for this. Another option is to use an array of tuples instead of a > single tuple but not sure this might make things more complicated to > process by any external tool. But you are right, this needs more > discussion. > > -- > Regards, > Dilip Kumar > Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T08:34:08Z
On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > Few observations related to publication. > > > > ------------------------------ > > > > Thanks Shveta, for testing and sharing your thoughts. IMHO for > > conflict log tables it should be good enough if we restrict it when > > ALL TABLE options are used, I don't think we need to put extra effort > > to completely restrict it even if users want to explicitly list it > > into the publication. > > > > > > > > > > (In the below comments, clt/CLT implies Conflict Log Table) > > > > > > > > 1) > > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table. > > > > This function is used while publishing every single change and I don't > > think we want to add a cost to check each subscription to identify > > whether the table is listed as CLT. > > > > > > 2) > > > > '\d+ clt' shows all-tables publication name. I feel we should not > > > > show that for clt. > > > > I think we should fix this. > > > > > > 3) > > > > I am able to create a publication for clt table, should it be allowed? > > > > I believe we should not do any specific handling to restrict this but > > I am open for the opinions. > > > > > > create subscription sub1 connection '...' publication pub1 > > > > WITH(conflict_log_table='clt'); > > > > create publication pub3 for table clt; > > > > > > > > 4) > > > > Is there a reason we have not made '!IsConflictHistoryRelid' check as > > > > part of is_publishable_class() itself? If we do so, other code-logics > > > > will also get clt as non-publishable always (and will solve a few of > > > > the above issues I think). IIUC, there is no place where we want to > > > > mark CLT as publishable or is there any? > > > > IMHO the main reason is performance. 
> > > > > > 5) Also, I feel we can add some documentation now to help others to > > > > understand/review the patch better without going through the long > > > > thread. > > > > Make sense, I will do that in the next version. > > > > > > > > > > Few observations related to conflict-logging: > > > > ------------------------------ > > > > 1) > > > > I found that for the conflicts which ultimately result in Error, we do > > > > not insert any conflict-record in clt. > > > > > > > > a) > > > > Example: insert_exists, update_Exists > > > > create table tab1 (i int primary key, j int); > > > > sub: insert into tab1 values(30,10); > > > > pub: insert into tab1 values(30,10); > > > > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > > > > No record in clt. > > > > > > > > sub: > > > > <some pre-data needed> > > > > update tab1 set i=40 where i = 30; > > > > pub: update tab1 set i=40 where i = 20; > > > > ERROR: conflict detected on relation "public.tab1": conflict=update_exists > > > > No record in clt. > > > > Yeah that interesting need to put thought on how to commit this record > > when an outer transaction is aborted as we do not have autonomous > > transactions which are generally used for this kind of logging. But > > we can explore more options like inserting into conflict log tables > > outside the outer transaction. > > > > > > b) > > > > Another question related to this is, since these conflicts (which > > > > results in error) keep on happening until user resolves these or skips > > > > these or 'disable_on_error' is set. Then are we going to insert these > > > > multiple times? We do count these in 'confl_insert_exists' and > > > > 'confl_update_exists' everytime, so it makes sense to log those each > > > > time in clt as well. Thoughts? > > > > I think it make sense to insert every time we see the conflict, but it > > would be good to have opinion from others as well. 
> > Since there is a concern that multiple rows for > multiple_unique_conflicts can cause data-bloat, it made me rethink > that this is actually more prone to causing data-bloat if it is not > resolved on time, as it seems a far more frequent scenario. So shall > we keep inserting the record or insert it once and avoid inserting it > again based on lsn? Thoughts? I agree, this is the real bloat problem, so maybe we can check whether the same tuple already exists and avoid inserting it again, although I haven't yet thought about how to distinguish a new conflict on the same row from the same conflict being inserted multiple times due to a worker restart. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-01T09:27:53Z
On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Thu, Nov 13, 2025 at 9:17 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Nov 13, 2025 at 2:39 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > Few observations related to publication. > > > > > ------------------------------ > > > > > > Thanks Shveta, for testing and sharing your thoughts. IMHO for > > > conflict log tables it should be good enough if we restrict it when > > > ALL TABLE options are used, I don't think we need to put extra effort > > > to completely restrict it even if users want to explicitly list it > > > into the publication. > > > > > > > > > > > > > (In the below comments, clt/CLT implies Conflict Log Table) > > > > > > > > > > 1) > > > > > 'select pg_relation_is_publishable(clt)' returns true for conflict-log table. > > > > > > This function is used while publishing every single change and I don't > > > think we want to add a cost to check each subscription to identify > > > whether the table is listed as CLT. > > > > > > > > 2) > > > > > '\d+ clt' shows all-tables publication name. I feel we should not > > > > > show that for clt. > > > > > > I think we should fix this. > > > > > > > > 3) > > > > > I am able to create a publication for clt table, should it be allowed? > > > > > > I believe we should not do any specific handling to restrict this but > > > I am open for the opinions. > > > > > > > > create subscription sub1 connection '...' publication pub1 > > > > > WITH(conflict_log_table='clt'); > > > > > create publication pub3 for table clt; > > > > > > > > > > 4) > > > > > Is there a reason we have not made '!IsConflictHistoryRelid' check as > > > > > part of is_publishable_class() itself? If we do so, other code-logics > > > > > will also get clt as non-publishable always (and will solve a few of > > > > > the above issues I think). 
IIUC, there is no place where we want to > > > > > mark CLT as publishable or is there any? > > > > > > IMHO the main reason is performance. > > > > > > > > 5) Also, I feel we can add some documentation now to help others to > > > > > understand/review the patch better without going through the long > > > > > thread. > > > > > > Make sense, I will do that in the next version. > > > > > > > > > > > > > Few observations related to conflict-logging: > > > > > ------------------------------ > > > > > 1) > > > > > I found that for the conflicts which ultimately result in Error, we do > > > > > not insert any conflict-record in clt. > > > > > > > > > > a) > > > > > Example: insert_exists, update_Exists > > > > > create table tab1 (i int primary key, j int); > > > > > sub: insert into tab1 values(30,10); > > > > > pub: insert into tab1 values(30,10); > > > > > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > > > > > No record in clt. > > > > > > > > > > sub: > > > > > <some pre-data needed> > > > > > update tab1 set i=40 where i = 30; > > > > > pub: update tab1 set i=40 where i = 20; > > > > > ERROR: conflict detected on relation "public.tab1": conflict=update_exists > > > > > No record in clt. > > > > > > Yeah that interesting need to put thought on how to commit this record > > > when an outer transaction is aborted as we do not have autonomous > > > transactions which are generally used for this kind of logging. But > > > we can explore more options like inserting into conflict log tables > > > outside the outer transaction. > > > > > > > > b) > > > > > Another question related to this is, since these conflicts (which > > > > > results in error) keep on happening until user resolves these or skips > > > > > these or 'disable_on_error' is set. Then are we going to insert these > > > > > multiple times? 
We do count these in 'confl_insert_exists' and > > > > > 'confl_update_exists' everytime, so it makes sense to log those each > > > > > time in clt as well. Thoughts? > > > > > > I think it make sense to insert every time we see the conflict, but it > > > would be good to have opinion from others as well. > > > > Since there is a concern that multiple rows for > > multiple_unique_conflicts can cause data-bloat, it made me rethink > > that this is actually more prone to causing data-bloat if it is not > > resolved on time, as it seems a far more frequent scenario. So shall > > we keep inserting the record or insert it once and avoid inserting it > > again based on lsn? Thoughts? > > I agree, this is the real problem related to bloat so maybe we can see > if the same tuple exists we can avoid inserting it again, although I > haven't put thought on how to we distinguish between the new conflict > on the same row vs the same conflict being inserted multiple times due > to worker restart. > If there is consensus on this approach, IMO, it appears safe to rely on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for the given 'conflict_type' before we insert a new record. thanks Shveta
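The comparison Shveta suggests can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory set stands in for a lookup against the conflict log table, the `should_insert` helper is hypothetical and not part of the patch, and the origin and LSN values are borrowed from the log excerpt quoted earlier in the thread.

```python
from typing import NamedTuple


class ConflictKey(NamedTuple):
    """Comparison key proposed in the thread for detecting repeats."""
    conflict_type: str
    remote_origin: str
    remote_commit_lsn: str


def should_insert(seen: set, record: dict) -> bool:
    """Return True only the first time a given key is observed.

    Hypothetical helper: `seen` stands in for a lookup in the conflict
    log table, which a real implementation would have to perform.
    """
    key = ConflictKey(
        record["conflict_type"],
        record["remote_origin"],
        record["remote_commit_lsn"],
    )
    if key in seen:
        return False
    seen.add(key)
    return True


seen_keys = set()
conflict = {
    "conflict_type": "insert_exists",
    "remote_origin": "pg_16392",
    "remote_commit_lsn": "0/017FDDA0",
}

# The first apply attempt logs the conflict; a retry of the same remote
# transaction produces the same key and is skipped.
first_attempt = should_insert(seen_keys, conflict)
retry_attempt = should_insert(seen_keys, dict(conflict))
```

The key deliberately ignores the local tuple, so it cannot tell a retry apart from a changed conflict raised by the same remote transaction.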
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-01T09:41:56Z
On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Since there is a concern that multiple rows for > > > multiple_unique_conflicts can cause data-bloat, it made me rethink > > > that this is actually more prone to causing data-bloat if it is not > > > resolved on time, as it seems a far more frequent scenario. So shall > > > we keep inserting the record or insert it once and avoid inserting it > > > again based on lsn? Thoughts? > > > > I agree, this is the real problem related to bloat so maybe we can see > > if the same tuple exists we can avoid inserting it again, although I > > haven't put thought on how to we distinguish between the new conflict > > on the same row vs the same conflict being inserted multiple times due > > to worker restart. > > > > If there is consensus on this approach, IMO, it appears safe to rely > on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for > the given 'conflict_type' before we insert a new record. > What happens if as part of multiple_unique_conflict, in the next apply round only some of the rows conflict (say in the meantime user has removed a few conflicting rows)? I think the ideal way for users to avoid such multiple occurrences is to configure subscription with disable_on_error. I think we should LOG errors again on retry and it is better to keep it consistent with what we print in LOG because we may want to give an option to users in future where to LOG (in conflict_history_table, LOG, or both) the conflicts. -- With Regards, Amit Kapila.
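The scenario Amit raises can be illustrated with a small sketch. It assumes the deduplication key discussed above (conflict type, remote origin, remote commit LSN); the record layout and the `dedup_key` helper are hypothetical, and the tuples are taken from the conf_tab example earlier in the thread.

```python
def dedup_key(record: dict) -> tuple:
    """Hypothetical key: conflict type plus remote origin and commit LSN."""
    return (
        record["conflict_type"],
        record["remote_origin"],
        record["remote_commit_lsn"],
    )


# First apply round: the remote row (2, 3, 4) conflicts with three local rows.
round_1 = {
    "conflict_type": "multiple_unique_conflicts",
    "remote_origin": "pg_16392",
    "remote_commit_lsn": "0/017FDDA0",
    "local_tuples": [
        {"a": 2, "b": 2, "c": 2},
        {"a": 3, "b": 3, "c": 3},
        {"a": 4, "b": 4, "c": 4},
    ],
}

# Second round: the user removed one conflicting local row in the meantime,
# so the same remote transaction now conflicts with only two rows.
round_2 = {**round_1, "local_tuples": round_1["local_tuples"][:2]}

same_key = dedup_key(round_1) == dedup_key(round_2)
same_rows = round_1["local_tuples"] == round_2["local_tuples"]
```

Both rounds share one key even though the set of conflicting local rows differs, so suppressing the second record would hide the changed state from the user.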
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T10:32:17Z
On Mon, Dec 1, 2025 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 1, 2025 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Dec 1, 2025 at 2:04 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Dec 1, 2025 at 1:57 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Since there is a concern that multiple rows for > > > > multiple_unique_conflicts can cause data-bloat, it made me rethink > > > > that this is actually more prone to causing data-bloat if it is not > > > > resolved on time, as it seems a far more frequent scenario. So shall > > > > we keep inserting the record or insert it once and avoid inserting it > > > > again based on lsn? Thoughts? > > > > > > I agree, this is the real problem related to bloat so maybe we can see > > > if the same tuple exists we can avoid inserting it again, although I > > > haven't put thought on how to we distinguish between the new conflict > > > on the same row vs the same conflict being inserted multiple times due > > > to worker restart. > > > > > > > If there is consensus on this approach, IMO, it appears safe to rely > > on 'remote_origin' and 'remote_commit_lsn' as the comparison keys for > > the given 'conflict_type' before we insert a new record. > > > > What happens if as part of multiple_unique_conflict, in the next apply > round only some of the rows conflict (say in the meantime user has > removed a few conflicting rows)? I think the ideal way for users to > avoid such multiple occurrences is to configure subscription with > disable_on_error. I think we should LOG errors again on retry and it > is better to keep it consistent with what we print in LOG because we > may want to give an option to users in future where to LOG (in > conflict_history_table, LOG, or both) the conflicts. 
> Yeah, that makes sense: if the user tried to fix the conflict and it still didn't get fixed, then from the next time onward the user would have no way to know that the conflict reoccurred. It also makes sense to maintain consistency with the LOGs. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-01T10:52:17Z
On Fri, Nov 28, 2025 at 6:06 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Some review comments for v8-0001.

Thanks Peter. Yes, these all make sense, and I will fix them in the next version along with the other comments by Vignesh, Shveta, and Amit, except for one comment:

> 9.
> +-- ok - conflict_log_table should not be published with ALL TABLE
> +CREATE PUBLICATION pub FOR TABLES IN SCHEMA clt;
> +SELECT * FROM pg_publication_tables WHERE pubname = 'pub';
> + pubname | schemaname | tablename | attnames | rowfilter
> +---------+------------+-----------+----------+-----------
> +(0 rows)
>
> Perhaps you should repeat this same test but using FOR ALL TABLES,
> instead of only FOR TABLES IN SCHEMA

I will have to see how we can do this safely in the tests without side effects on concurrently running tests. We generally run publication.sql and subscription.sql concurrently in the regression tests, so a FOR ALL TABLES publication in one could affect the other. One option is not to run these two tests concurrently; I think we can do that, as running them concurrently does not test any real concurrency. Any thoughts on this?

-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T05:07:41Z
On Fri, Nov 28, 2025 at 2:32 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Nov 27, 2025 at 5:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I have fixed all these comments and also the comments of 0002, now I > > feel we can actually merge 0001 and 0002, so I have merged both of > > them. > > > > Now pending work status > > 1) fixed review comments of 0003 > > 2) Run pgindent -- planning to do it after we complete the first level > > of review > > 3) Subscription TAP test for logging the actual conflicts > > > > Thanks for the patch. A few observations: > > 1) > It seems, as per LOG, 'key' and 'replica-identity' are different when > it comes to insert_exists, update_exists and > multiple_unique_conflicts, while I believe in CLT, key is > replica-identity i.e. there are no 2 separate terms. Please see below: > > a) > Update_Exists: > 2025-11-28 14:08:56.179 IST [60383] ERROR: conflict detected on > relation "public.tab1": conflict=update_exists > 2025-11-28 14:08:56.179 IST [60383] DETAIL: Key already exists in > unique index "tab1_pkey", modified locally in transaction 790 at > 2025-11-28 14:07:17.578887+05:30. > Key (i)=(40); existing local row (40, 10); remote row (40, 200); > replica identity (i)=(20). > > postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple > from clt where conflict_type='update_exists'; > conflict_type | key_tuple | local_tuple | remote_tuple > ---------------+-----------+-----------------+------------------ > update_exists | {"i":20} | {"i":40,"j":10} | {"i":40,"j":200} > > b) > insert_Exists: > ERROR: conflict detected on relation "public.tab1": conflict=insert_exists > DETAIL: Key already exists in unique index "tab1_pkey", modified > locally in transaction 767 at 2025-11-28 13:59:22.431097+05:30. > Key (i)=(30); existing local row (30, 10); remote row (30, 10). 
> > postgres=# select conflict_type, key_tuple,local_tuple,remote_tuple from clt; > conflict_type | key_tuple | local_tuple | remote_tuple > ----------------+-----------+-----------------+----------------- > insert_exists | | {"i":30,"j":10} | {"i":30,"j":10} > > case a) has key_tuple same as replica-identity of LOG > case b) does not have replica-identity and thus key_tuple is NULL. > > Does that mean we need to maintain both key_tuple and RI separately in > CLT? Thoughts? > Yeah, it could be useful to display RI values separately. What should be the column name? Few options could be: remote_val_for_ri, or remote_value_ri, or something else. I think it may also be useful to display conflicting_index but OTOH, it would be difficult to decide in the first version what other information could be required, so it is better to stick with what is being displayed in LOG. -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T05:15:42Z
On Wed, Nov 19, 2025 at 3:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Nov 18, 2025 at 4:47 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > 3) > > Do we need to have a timestamp column as well to say when conflict was > > recorded? Or local_commit_ts, remote_commit_ts are sufficient? > > Thoughts > > You mean we can record the timestamp now while inserting, not sure if > it will add some more meaningful information than remote_commit_ts, > but let's see what others think. > local_commit_ts and remote_commit_ts sound sufficient, as one can identify from those two the point in time at which the information was true. The key/schema values displayed in this table could change later, but the information about a particular row is based on the time shown by those two columns. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T06:08:23Z
On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> The specific scenario we are discussing is when a single row from the publisher attempts to apply an operation that causes a conflict across multiple unique keys, with each of those unique key violations conflicting with a different local row on the subscriber, is very rare. IMHO this low-frequency scenario does not justify overcomplicating the design with an array field or a multi-level table.
>

I did some analysis and searched the internet to answer your following two questions.

> Consider the infrequency of the root causes:
> - How often does a table have more than 3 to 4 unique keys?

It is extremely common; in fact, it is considered the industry "best practice" for modern database design. One can find this pattern in almost every enterprise system (e.g. banking apps, CRMs). It relies on distinguishing between Technical Identity (for the database) and Business Identity (for the real world).

1. The Design Pattern: Surrogate vs. Natural Keys

Primary Key (Surrogate Key): Usually a meaningless number (e.g., 10452) or a UUID. It is used strictly for the database to join tables efficiently. It never changes.
Unique Key (Natural Key): A real-world value (e.g., john@email.com or SSN-123). This is how humans or external systems identify the row. It can change (e.g., someone updates their email).

2. Common Real-World Use Cases

A. User Management (The most classic example)
Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table.
Unique Key 1: email (Varchar). Prevents two people from registering with the same email.
Unique Key 2: username (Varchar). Ensures unique display names.
Why? If a user changes their email address, you only update one field in one table. If you used email as the Primary Key, you would have to update millions of rows in the ORDERS table that reference that email.

B. Inventory / E-Commerce
Primary Key: product_id (Integer). Used internally by the code.
Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC).
Why? Companies often re-organize their SKU formats. If the SKU was the Primary Key, a format change would require a massive database migration.

C. Government / HR Systems
Primary Key: employee_id (Integer).
Unique Key: National_ID (SSN, Aadhaar, Passport Number).
Why? Privacy and security. You do not want to expose a National ID in every URL or API call (e.g., api/employee/552 is safer than api/employee/SSN-123).

> - How frequently would each of these keys conflict with a unique row
> on the subscriber side?
>

It can occur with medium-to-high probability in the following cases.

(a) In bi-directional replication systems: for example, if two users create the same "User Profile" on two different servers at the same time, the row will conflict on every unique field (ID, Email, SSN) simultaneously.
(b) The chances of bloat are high when errors are fixed by retrying, as mentioned by Shveta. Say the Ops team fixes errors by just "trying again" without checking the full row: you will hit the ID error, fix it, then immediately hit the Email error.
(c) The chances are medium during the initial data load: if a user is loading data from a legacy system with "dirty" data, rows often violate multiple rules (e.g., a duplicate user with both a reused ID and a reused Email).

> If resolving this occasional, synthetic conflict requires inserting
> two or three rows instead of a single one, this is an acceptable
> trade-off considering how rare it can occur.
>

As per the above analysis and the retry point Shveta raises, I don't think we can ignore the possibility of data bloat, especially for the multiple_unique_conflicts conflict type. We can consider logging multiple local conflicting rows as a JSON array.

-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T06:36:37Z
On Tue, Dec 2, 2025 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > The specific scenario we are discussing is when a single row from the > > publisher attempts to apply an operation that causes a conflict across > > multiple unique keys, with each of those unique key violations > > conflicting with a different local row on the subscriber, is very > > rare. IMHO this low-frequency scenario does not justify > > overcomplicating the design with an array field or a multi-level > > table. > > > > I did some analysis and search on the internet to answer your > following two questions. > > > Consider the infrequency of the root causes: > > - How often does a table have more than 3 to 4 unique keys? > > It is extremely common—in fact, it is considered the industry "best > practice" for modern database design. > > One can find this pattern in almost every enterprise system (e.g. > banking apps, CRMs). It relies on distinguishing between Technical > Identity (for the database) and Business Identity (for the real > world). > > 1. The Design Pattern: Surrogate vs. Natural Keys > Primary Key (Surrogate Key): Usually a meaningless number (e.g., > 10452) or a UUID. It is used strictly for the database to join tables > efficiently. It never changes. > Unique Key (Natural Key): A real-world value (e.g., john@email.com or > SSN-123). This is how humans or external systems identify the row. It > can change (e.g., someone updates their email). > > 2. Common Real-World Use Cases > A. User Management (The most classic example) > Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table. > Unique Key 1: email (Varchar). Prevents two people from registering > with the same email. > Unique Key 2: username (Varchar). Ensures unique display names. > Why? If a user changes their email address, you only update one field > in one table. 
If you used email as the Primary Key, you would have to > update millions of rows in the ORDERS table that reference that email. > > B. Inventory / E-Commerce > Primary Key: product_id (Integer). Used internally by the code. > Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC). > Why? Companies often re-organize their SKU formats. If the SKU was the > Primary Key, a format change would require a massive database > migration. > > C. Government / HR Systems > Primary Key: employee_id (Integer). > Unique Key: National_ID (SSN, Aadhaar, Passport Number). > Why? Privacy and security. You do not want to expose a National ID in > every URL or API call (e.g., api/employee/552 is safer than > api/employee/SSN-123). > > > - How frequently would each of these keys conflict with a unique row > > on the subscriber side? > > > > It can occur with medium-to-high probability in following cases. (a) > In Bi-Directional replication systems; for example, If two users > create the same "User Profile" on two different servers at the same > time, the row will conflict on every unique field (ID, Email, SSN) > simultaneously. (b) The chances of bloat are high, on retrying to fix > the error as mentioned by Shveta. Say, if Ops team fixes errors by > just "trying again" without checking the full row, you will hit the ID > error, fix it, then immediately hit the Email error. (c) The chances > are medium during initial data-load; If a user is loading data from a > legacy system with "dirty" data, rows often violate multiple rules > (e.g., a duplicate user with both a reused ID and a reused Email). > > > If resolving this occasional, synthetic conflict requires inserting > > two or three rows instead of a single one, this is an acceptable > > trade-off considering how rare it can occur. > > > > As per above analysis and the re-try point Shveta raises, I don't > think we can ignore the possibility of data-bloat especially for this > multiple_unique_key conflict. 
We can consider logging multiple local > conflicting rows as JSON Array. Okay, I will try to make multiple local rows as JSON Array in the next version. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T07:08:01Z
On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 2, 2025 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 1, 2025 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > The specific scenario we are discussing is when a single row from the > > > publisher attempts to apply an operation that causes a conflict across > > > multiple unique keys, with each of those unique key violations > > > conflicting with a different local row on the subscriber, is very > > > rare. IMHO this low-frequency scenario does not justify > > > overcomplicating the design with an array field or a multi-level > > > table. > > > > > > > I did some analysis and search on the internet to answer your > > following two questions. > > > > > Consider the infrequency of the root causes: > > > - How often does a table have more than 3 to 4 unique keys? > > > > It is extremely common—in fact, it is considered the industry "best > > practice" for modern database design. > > > > One can find this pattern in almost every enterprise system (e.g. > > banking apps, CRMs). It relies on distinguishing between Technical > > Identity (for the database) and Business Identity (for the real > > world). > > > > 1. The Design Pattern: Surrogate vs. Natural Keys > > Primary Key (Surrogate Key): Usually a meaningless number (e.g., > > 10452) or a UUID. It is used strictly for the database to join tables > > efficiently. It never changes. > > Unique Key (Natural Key): A real-world value (e.g., john@email.com or > > SSN-123). This is how humans or external systems identify the row. It > > can change (e.g., someone updates their email). > > > > 2. Common Real-World Use Cases > > A. User Management (The most classic example) > > Primary Key: user_id (Integer). Used for foreign keys in the ORDERS table. > > Unique Key 1: email (Varchar). Prevents two people from registering > > with the same email. > > Unique Key 2: username (Varchar). 
Ensures unique display names. > > Why? If a user changes their email address, you only update one field > > in one table. If you used email as the Primary Key, you would have to > > update millions of rows in the ORDERS table that reference that email. > > > > B. Inventory / E-Commerce > > Primary Key: product_id (Integer). Used internally by the code. > > Unique Key: SKU (Stock Keeping Unit) or Barcode (EAN/UPC). > > Why? Companies often re-organize their SKU formats. If the SKU was the > > Primary Key, a format change would require a massive database > > migration. > > > > C. Government / HR Systems > > Primary Key: employee_id (Integer). > > Unique Key: National_ID (SSN, Aadhaar, Passport Number). > > Why? Privacy and security. You do not want to expose a National ID in > > every URL or API call (e.g., api/employee/552 is safer than > > api/employee/SSN-123). > > > > > - How frequently would each of these keys conflict with a unique row > > > on the subscriber side? > > > > > > > It can occur with medium-to-high probability in following cases. (a) > > In Bi-Directional replication systems; for example, If two users > > create the same "User Profile" on two different servers at the same > > time, the row will conflict on every unique field (ID, Email, SSN) > > simultaneously. (b) The chances of bloat are high, on retrying to fix > > the error as mentioned by Shveta. Say, if Ops team fixes errors by > > just "trying again" without checking the full row, you will hit the ID > > error, fix it, then immediately hit the Email error. (c) The chances > > are medium during initial data-load; If a user is loading data from a > > legacy system with "dirty" data, rows often violate multiple rules > > (e.g., a duplicate user with both a reused ID and a reused Email). > > > > > If resolving this occasional, synthetic conflict requires inserting > > > two or three rows instead of a single one, this is an acceptable > > > trade-off considering how rare it can occur. 
> > > > > > > As per above analysis and the re-try point Shveta raises, I don't > > think we can ignore the possibility of data-bloat especially for this > > multiple_unique_key conflict. We can consider logging multiple local > > conflicting rows as JSON Array. > > Okay, I will try to make multiple local rows as JSON Array in the next version. >

Just to clarify so that we are on the same page, along with the local tuple the other local fields like local_xid, local_commit_ts, local_origin will also be converted into the array. Hope that makes sense? So we will change the table like this, not sure if this makes sense to keep all local array fields nearby in the table, or let it be near the respective remote field, like we are doing now remote_xid and local xid together etc.

      Column       |            Type            | Collation | Nullable | Default
-------------------+----------------------------+-----------+----------+---------
 relid             | oid                        |           |          |
 schemaname        | text                       |           |          |
 relname           | text                       |           |          |
 conflict_type     | text                       |           |          |
 local_xid         | xid[]                      |           |          |
 remote_xid        | xid                        |           |          |
 remote_commit_lsn | pg_lsn                     |           |          |
 local_commit_ts   | timestamp with time zone[] |           |          |
 remote_commit_ts  | timestamp with time zone   |           |          |
 local_origin      | text[]                     |           |          |
 remote_origin     | text                       |           |          |
 key_tuple         | json                       |           |          |
 local_tuple       | json[]                     |           |          |
 remote_tuple      | json                       |           |          |

-- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-02T09:17:42Z
On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > Okay, I will try to make multiple local rows as JSON Array in the next version. > > > Just to clarify so that we are on the same page, along with the local > tuple the other local fields like local_xid, local_commit_ts, > local_origin will also be converted into the array. Hope that makes > sense? >

Yes, what about key_tuple or RI?

> So we will change the table like this, not sure if this makes sense to > keep all local array fields nearby in the table, or let it be near the > respective remote field, like we are doing now remote_xid and local > xid together etc. >

It is better to keep the array fields together at the end. I think it would be better to read via CLI. Also, it may take more space due to padding/alignment if we store fixed-width and variable-width columns interleaved and similarly the access will also be slower for interleaved cases.

Having said that, can we consider an alternative way to store all local_conflict_info together as a JSONB column (that can be used to store an array of objects). For example, the multiple conflicting tuple information can be stored as:

[
  { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": "node_A", "tuple": { "id": 1, "email": "a@b.com" } },
  { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
]

To access JSON array columns, I think one needs to use the unnest function, whereas JSONB could be accessed with something like: "SELECT * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]'".

-- With Regards, Amit Kapila. -
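For readers unfamiliar with the `@>` operator in the query above, the following is a rough client-side sketch of what JSONB containment means for an array of objects. It is only an approximation written for illustration, not PostgreSQL's implementation: the function name `contains` is made up here, and the special case where an array may contain a bare scalar is deliberately left out. The sample data is the two-element array from the message above.

```python
import json

def contains(left, right):
    """Simplified approximation of jsonb 'left @> right'.

    Assumption: the special case where an array contains a bare
    scalar is not modelled here.
    """
    if isinstance(right, dict):
        # Every key/value pair on the right must be contained on the left.
        return isinstance(left, dict) and all(
            k in left and contains(left[k], v) for k, v in right.items()
        )
    if isinstance(right, list):
        # Every right element must be contained in some left element;
        # order and duplicates do not matter.
        return isinstance(left, list) and all(
            any(contains(l, r) for l in left) for r in right
        )
    # Scalars must match exactly (keep booleans distinct from numbers).
    return (
        not isinstance(left, (dict, list))
        and isinstance(left, bool) == isinstance(right, bool)
        and left == right
    )

# The example array from the message above.
local_conflicts = json.loads("""
[
  { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": "node_A",
    "tuple": { "id": 1, "email": "a@b.com" } },
  { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": "node_B",
    "tuple": { "id": 2, "phone": "555-0199" } }
]
""")

match_xid = contains(local_conflicts, json.loads('[{"xid": "1001"}]'))
match_nested = contains(local_conflicts, json.loads('[{"tuple": {"id": 2}}]'))
match_missing = contains(local_conflicts, json.loads('[{"xid": "9999"}]'))
print(match_xid, match_nested, match_missing)
```

The point of the sketch is that a single containment predicate can match on any field of any element, including fields nested inside "tuple", without first expanding the array into rows.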
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T11:15:24Z
On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > Okay, I will try to make multiple local rows as JSON Array in the next version. > > > > > Just to clarify so that we are on the same page, along with the local > > tuple the other local fields like local_xid, local_commit_ts, > > local_origin will also be converted into the array. Hope that makes > > sense? > > > > Yes, what about key_tuple or RI? > > > So we will change the table like this, not sure if this makes sense to > > keep all local array fields nearby in the table, or let it be near the > > respective remote field, like we are doing now remote_xid and local > > xid together etc. > > > > It is better to keep the array fields together at the end. I think it > would be better to read via CLI. Also, it may take more space due to > padding/alignment if we store fixed-width and variable-width columns > interleaved and similarly the access will also be slower for > interleaved cases. > > Having said that, can we consider an alternative way to store all > local_conflict_info together as a JSONB column (that can be used to > store an array of objects). For example, the multiple conflicting > tuple information can be stored as: > > [ > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": > "node_A", "tuple": { "id": 1, "email": "a@b.com" } }, > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": > "node_B", "tuple": { "id": 2, "phone": "555-0199" } } > ] > > To access JSON array columns, I think one needs to use the unnest > function, whereas JSONB could be accessed with something like: "SELECT > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]". Yeah we can do that as well, maybe that's a better idea compared to creating separate array fields for each local element. 
-- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-02T15:10:15Z
On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > Okay, I will try to make multiple local rows as JSON Array in the next version. > > > > > > > Just to clarify so that we are on the same page, along with the local > > > tuple the other local fields like local_xid, local_commit_ts, > > > local_origin will also be converted into the array. Hope that makes > > > sense? > > > > > > > Yes, what about key_tuple or RI? > > > > > So we will change the table like this, not sure if this makes sense to > > > keep all local array fields nearby in the table, or let it be near the > > > respective remote field, like we are doing now remote_xid and local > > > xid together etc. > > > > > > > It is better to keep the array fields together at the end. I think it > > would be better to read via CLI. Also, it may take more space due to > > padding/alignment if we store fixed-width and variable-width columns > > interleaved and similarly the access will also be slower for > > interleaved cases. > > > > Having said that, can we consider an alternative way to store all > > local_conflict_info together as a JSONB column (that can be used to > > store an array of objects). 
For example, the multiple conflicting > > tuple information can be stored as: > > > > [ > > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": > > "node_A", "tuple": { "id": 1, "email": "a@b.com" } }, > > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": > > "node_B", "tuple": { "id": 2, "phone": "555-0199" } } > > ] > > > > To access JSON array columns, I think one needs to use the unnest > > function, whereas JSONB could be accessed with something like: "SELECT > > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]". > > Yeah we can do that as well, maybe that's a better idea compared to > creating separate array fields for each local element.

So I tried the POC idea with this approach and tested with one of the test cases given by Shveta, and now the conflict log table entry looks like this. So we can see the local conflicts field which is an array of JSON and each entry of the array is formed using (xid, commit_ts, origin, json tuple). I will send the updated patch by tomorrow after doing some more cleanup and testing.

relid             | 16391
schemaname        | public
relname           | conf_tab
conflict_type     | multiple_unique_conflicts
remote_xid        | 761
remote_commit_lsn | 0/01761400
remote_commit_ts  | 2025-12-02 15:02:07.045935+00
remote_origin     | pg_16406
key_tuple         |
remote_tuple      | {"a":2,"b":3,"c":4}
local_conflicts   | {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}

-- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-03T04:19:10Z
On Tue, Dec 2, 2025 at 8:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > > Okay, I will try to make multiple local rows as JSON Array in the next version. > > > > > > > > > Just to clarify so that we are on the same page, along with the local > > > > tuple the other local fields like local_xid, local_commit_ts, > > > > local_origin will also be converted into the array. Hope that makes > > > > sense? > > > > > > > > > > Yes, what about key_tuple or RI? > > > > > > > So we will change the table like this, not sure if this makes sense to > > > > keep all local array fields nearby in the table, or let it be near the > > > > respective remote field, like we are doing now remote_xid and local > > > > xid together etc. > > > > > > > > > > It is better to keep the array fields together at the end. I think it > > > would be better to read via CLI. Also, it may take more space due to > > > padding/alignment if we store fixed-width and variable-width columns > > > interleaved and similarly the access will also be slower for > > > interleaved cases. > > > > > > Having said that, can we consider an alternative way to store all > > > local_conflict_info together as a JSONB column (that can be used to > > > store an array of objects). 
For example, the multiple conflicting > > > tuple information can be stored as: > > > > > > [ > > > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin": > > > "node_A", "tuple": { "id": 1, "email": "a@b.com" } }, > > > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin": > > > "node_B", "tuple": { "id": 2, "phone": "555-0199" } } > > > ] > > > > > > To access JSON array columns, I think one needs to use the unnest > > > function, whereas JSONB could be accessed with something like: "SELECT > > > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]". > > > > Yeah we can do that as well, maybe that's a better idea compared to > > creating separate array fields for each local element. > > So I tried the POC idea with this approach and tested with one of the > test cases given by Shveta, and now the conflict log table entry looks > like this. So we can see the local conflicts field which is an array > of JSON and each entry of the array is formed using (xid, commit_ts, > origin, json tuple). I will send the updated patch by tomorrow after > doing some more cleanup and testing. > > relid | 16391 > schemaname | public > relname | conf_tab > conflict_type | multiple_unique_conflicts > remote_xid | 761 > remote_commit_lsn | 0/01761400 > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > remote_origin | pg_16406 > key_tuple | > remote_tuple | {"a":2,"b":3,"c":4} > local_conflicts | > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > Thanks, it looks good. For the benefit of others, could you include a brief note, perhaps in the commit message for now, describing how to access or read this array column? 
We can remove it later. thanks Shveta -
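As a rough illustration of the kind of access being asked about here (this is not the note from the commit message, which is not reproduced in this thread), a client could decode each element of the array as its own JSON document. The sketch assumes the driver hands the json[] column back as a list of JSON strings; that representation and the helper name `summarize` are assumptions. The element values are the three entries from the sample row quoted above.

```python
import json

# Assumption: a json[] column arrives client-side as a list of JSON strings.
# Values copied from the sample conflict log row quoted above.
local_conflicts = [
    '{"xid":"773","commit_ts":"2025-12-02T15:02:00.640253+00:00","origin":"","tuple":{"a":2,"b":2,"c":2}}',
    '{"xid":"773","commit_ts":"2025-12-02T15:02:00.640253+00:00","origin":"","tuple":{"a":3,"b":3,"c":3}}',
    '{"xid":"773","commit_ts":"2025-12-02T15:02:00.640253+00:00","origin":"","tuple":{"a":4,"b":4,"c":4}}',
]

def summarize(elements):
    """Decode each array element into (xid, commit_ts, origin, tuple)."""
    rows = []
    for raw in elements:
        entry = json.loads(raw)
        rows.append(
            (entry["xid"], entry["commit_ts"], entry["origin"], entry["tuple"])
        )
    return rows

rows = summarize(local_conflicts)
for xid, commit_ts, origin, tup in rows:
    print(xid, commit_ts, repr(origin), tup)
```

Each element carries the four fields (xid, commit_ts, origin, tuple) described earlier in the thread, so one conflict log row expands to one decoded entry per conflicting local tuple.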
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-03T11:26:49Z
On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > relid | 16391 > > schemaname | public > > relname | conf_tab > > conflict_type | multiple_unique_conflicts > > remote_xid | 761 > > remote_commit_lsn | 0/01761400 > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > remote_origin | pg_16406 > > key_tuple | > > remote_tuple | {"a":2,"b":3,"c":4} > > local_conflicts | > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > Thanks, it looks good. For the benefit of others, could you include a > brief note, perhaps in the commit message for now, describing how to > access or read this array column? We can remove it later. Thanks, okay, temporarily I have added in a commit message how we can fetch the data from the JSON array field. In next version I will add a test to get the conflict stored in conflict log history table and fetch from it. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-04T02:00:47Z
On Wed, Dec 3, 2025 at 3:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > relid | 16391 > > > schemaname | public > > > relname | conf_tab > > > conflict_type | multiple_unique_conflicts > > > remote_xid | 761 > > > remote_commit_lsn | 0/01761400 > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > remote_origin | pg_16406 > > > key_tuple | > > > remote_tuple | {"a":2,"b":3,"c":4} > > > local_conflicts | > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > brief note, perhaps in the commit message for now, describing how to > > access or read this array column? We can remove it later. > > Thanks, okay, temporarily I have added in a commit message how we can > fetch the data from the JSON array field. In next version I will add > a test to get the conflict stored in conflict log history table and > fetch from it. >

I've reviewed the v9 patch and here are some comments:

The patch utilizes SPI for creating and dropping the conflict history table, but I'm really not sure if it's okay because it's actually affected by some GUC parameters such as default_tablespace and default_toast_compression etc. Also, probably some hooks and event triggers could be fired during the creation and removal. Is it intentional behavior? I'm concerned that it would make investigation harder if an issue happened in the user environment.

---
+    /* build and execute the CREATE TABLE query. */
+    appendStringInfo(&querybuf,
+                     "CREATE TABLE %s.%s ("
+                     "relid Oid,"
+                     "schemaname TEXT,"
+                     "relname TEXT,"
+                     "conflict_type TEXT,"
+                     "remote_xid xid,"
+                     "remote_commit_lsn pg_lsn,"
+                     "remote_commit_ts TIMESTAMPTZ,"
+                     "remote_origin TEXT,"
+                     "key_tuple JSON,"
+                     "remote_tuple JSON,"
+                     "local_conflicts JSON[])",
+                     quote_identifier(get_namespace_name(namespaceId)),
+                     quote_identifier(conflictrel));

If we want to use SPI for history table creation, we should use qualified names in all the places including data types.

---
The patch doesn't create the dependency between the subscription and the conflict history table. So users can entirely drop the schema (with CASCADE option) where the history table is created. And once dropping the schema along with the history table, ALTER SUBSCRIPTION ... SET (conflict_history_table = '') seems not to work (I got a SEGV).

---
We can create the history table in pg_temp namespace but it should not be allowed.

---
I think the conflict history table should not be transferred to the new cluster when pg_upgrade since the table definition could be different across major versions.

I got the following log when the publisher disables track_commit_timestamp:

local_conflicts | {"{\"xid\":\"790\",\"commit_ts\":\"1999-12-31T16:00:00-08:00\",\"origin\":\"\",\"tuple\":{\"c\":1}}"}

I think we can omit commit_ts when it's omitted.

---
I think we should keep the history table name case-sensitive:

postgres(1:351685)=# create subscription sub connection 'dbname=postgres port=5551' publication pub with (conflict_log_table = 'LOGTABLE');
CREATE SUBSCRIPTION
postgres(1:351685)=# \d
          List of relations
 Schema |   Name   | Type  |  Owner
--------+----------+-------+----------
 public | test     | table | masahiko
 public | logtable | table | masahiko
(2 rows)

Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com -
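A side note on the commit_ts value in the log above: 1999-12-31T16:00:00-08:00 is the same instant as 2000-01-01 00:00:00 UTC, which is the epoch PostgreSQL timestamps are counted from, so it is presumably what a zero (unset) timestamp prints as. The sketch below checks that equivalence and shows one way a consumer of the table could drop such a placeholder until the patch omits it. Treating the epoch as "not available" is an assumption made here for illustration, and the helper name `drop_unset_commit_ts` is hypothetical.

```python
import json
from datetime import datetime, timezone

# PostgreSQL timestamps are counted from 2000-01-01 00:00:00 UTC.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)

def drop_unset_commit_ts(entry):
    """Return a copy of a local_conflicts element without commit_ts when
    it equals the epoch (assumed here to mean 'not available')."""
    cleaned = dict(entry)
    ts = cleaned.get("commit_ts")
    if ts is not None and datetime.fromisoformat(ts) == PG_EPOCH:
        del cleaned["commit_ts"]
    return cleaned

# Element from the log above (track_commit_timestamp disabled).
unset = json.loads(
    '{"xid":"790","commit_ts":"1999-12-31T16:00:00-08:00","origin":"","tuple":{"c":1}}'
)
# Element from the earlier sample row, with a real commit timestamp.
real = json.loads(
    '{"xid":"773","commit_ts":"2025-12-02T15:02:00.640253+00:00","origin":"","tuple":{"a":2,"b":2,"c":2}}'
)

cleaned_unset = drop_unset_commit_ts(unset)
cleaned_real = drop_unset_commit_ts(real)
print(cleaned_unset)
print(cleaned_real)
```

Omitting the key on the server side, as suggested above, would make this kind of client-side filtering unnecessary.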
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-04T05:18:42Z
On Thu, Dec 4, 2025 at 7:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Dec 3, 2025 at 3:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > relid | 16391 > > > > schemaname | public > > > > relname | conf_tab > > > > conflict_type | multiple_unique_conflicts > > > > remote_xid | 761 > > > > remote_commit_lsn | 0/01761400 > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > remote_origin | pg_16406 > > > > key_tuple | > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > local_conflicts | > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > brief note, perhaps in the commit message for now, describing how to > > > access or read this array column? We can remove it later. > > > > Thanks, okay, temporarily I have added in a commit message how we can > > fetch the data from the JSON array field. In next version I will add > > a test to get the conflict stored in conflict log history table and > > fetch from it. > > > > I've reviewed the v9 patch and here are some comments: Thanks for reviewing this and your valuable comments. > The patch utilizes SPI for creating and dropping the conflict history > table, but I'm really not sure if it's okay because it's actually > affected by some GUC parameters such as default_tablespace and > default_toast_compression etc. Also, probably some hooks and event > triggers could be fired during the creation and removal. Is it > intentional behavior? 
I'm concerned that it would make investigation > harder if an issue happened in the user environment. Hmm, interesting point, well we can control the value of default parameters while creating the table using SPI, but I don't see any reason to not use heap_create_with_catalog() directly, so maybe that's a better choice than using SPI because then we don't need to bother about any event triggers/utility hooks etc. Although I don't see any specific issue with that, unless the user intentionally wants to create trouble while creating this table. What do others think about it? > --- > + /* build and execute the CREATE TABLE query. */ > + appendStringInfo(&querybuf, > + "CREATE TABLE %s.%s (" > + "relid Oid," > + "schemaname TEXT," > + "relname TEXT," > + "conflict_type TEXT," > + "remote_xid xid," > + "remote_commit_lsn pg_lsn," > + "remote_commit_ts TIMESTAMPTZ," > + "remote_origin TEXT," > + "key_tuple JSON," > + "remote_tuple JSON," > + "local_conflicts JSON[])", > + quote_identifier(get_namespace_name(namespaceId)), > + quote_identifier(conflictrel)); > > If we want to use SPI for history table creation, we should use > qualified names in all the places including data types. That's true, so that we can avoid interference of any user created types. > --- > The patch doesn't create the dependency between the subscription and > the conflict history table. So users can entirely drop the schema > (with CASCADE option) where the history table is created. I think as part of the initial discussion we thought since it is created under the subscription owner privileges so only that user can drop that table and if the user intentionally drops the table the conflict will not be recorded in the table and that's acceptable. But now I think it would be a good idea to maintain the dependency with subscription so that users can not drop it without dropping the subscription. And once > dropping the schema along with the history table, ALTER SUBSCRIPTION > ... 
SET (conflict_history_table = '') seems not to work (I got a > SEGV). I will check this, thanks > --- > We can create the history table in pg_temp namespace but it should not > be allowed. Right, will check this and also add the test for the same. > --- > I think the conflict history table should not be transferred to the > new cluster when pg_upgrade since the table definition could be > different across major versions. Let me think more on this with respect to behaviour of other factors like subscriptions etc. > I got the following log when the publisher disables track_commit_timestamp: > > local_conflicts | > {"{\"xid\":\"790\",\"commit_ts\":\"1999-12-31T16:00:00-08:00\",\"origin\":\"\",\"tuple\":{\"c\":1}}"} > > I think we can omit commit_ts when it's omitted. +1 > --- > I think we should keep the history table name case-sensitive: Yeah we can do that, it looks good to me, what do others think about it? -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-04T06:51:20Z
On Wed, Dec 3, 2025 at 4:57 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > Thanks, it looks good. For the benefit of others, could you include a > > brief note, perhaps in the commit message for now, describing how to > > access or read this array column? We can remove it later. > > Thanks, okay, temporarily I have added in a commit message how we can > fetch the data from the JSON array field. In next version I will add > a test to get the conflict stored in conflict log history table and > fetch from it. > Thanks, I have not looked at the patch in detail yet, but a few things: 1) Assert is hit here: LOG: logical replication apply worker for subscription "sub1" has started TRAP: failed Assert("slot != NULL"), File: "conflict.c", Line: 669, PID: 137604 Steps: create table tab1 (i int primary key, j int); Pub: insert into tab1 values(10,10); insert into tab1 values(20,10); Sub: delete from tab1 where i=10; Pub: delete from tab1 where i=10; 2) I see that key_tuple still points to RI and there is no RI field added. It seems that discussion at [1] is missed in this patch. [1]: https://www.postgresql.org/message-id/CAA4eK1L3umixUUik7Ef1eU%3Dx-JMb8iXD7rWWExBMP4dmOGTS9A%40mail.gmail.com thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-04T07:01:56Z
Hi. Some review comments for v9-0001.

======
Commit message.

1.
Note: A single remote tuple may conflict with multiple local conflict when conflict type is CT_MULTIPLE_UNIQUE_CONFLICTS, so for handling this case we create a single row in conflict log table with respect to each remote conflict row even if it conflicts with multiple local rows and we store the multiple conflict tuples as a single JSON array element in format as
[ { "xid": "1001", "commit_ts": "...", "origin": "...", "tuple": {...} }, ... ]
We can extract the elements from local tuple as given in below example

~

Something seems broken/confused with this description:

1a.
"A single remote tuple may conflict with multiple local conflict"
Should that say "... with multiple local tuples" ?

~

1b.
There is a mixture of terminology here, "row" vs "tuple", which doesn't seem correct.

~

1c.
"We can extract the elements from local tuple"
Should that say "... elements of the local tuples from the CLT row ..."

======
src/backend/replication/logical/conflict.c

2.
+
+#define N_LOCAL_CONFLICT_INFO_ATTRS 4

I felt it would be better to put this where it is used. e.g. IMO put it within the build_conflict_tupledesc().

~~~

InsertConflictLogTuple:

3.
+ /* A valid tuple must be prepared and store in MyLogicalRepWorker. */

Typo still here: /store in/stored in/

~~~

4.
+static TupleDesc
+build_conflict_tupledesc(void)
+{
+    TupleDesc tupdesc;
+
+    tupdesc = CreateTemplateTupleDesc(N_LOCAL_CONFLICT_INFO_ATTRS);
+
+    TupleDescInitEntry(tupdesc, (AttrNumber) 1, "xid",
+                       XIDOID, -1, 0);
+    TupleDescInitEntry(tupdesc, (AttrNumber) 2, "commit_ts",
+                       TIMESTAMPTZOID, -1, 0);
+    TupleDescInitEntry(tupdesc, (AttrNumber) 3, "origin",
+                       TEXTOID, -1, 0);
+    TupleDescInitEntry(tupdesc, (AttrNumber) 4, "tuple",
+                       JSONOID, -1, 0);

If you had some incrementing attno instead of hard-wiring the (1,2,3,4) then you'd be able to add a sanity check like Assert(attno + 1 == N_LOCAL_CONFLICT_INFO_ATTRS); that can safeguard against future mistakes in case something changes without updating the constant.

~~~

build_local_conflicts_json_array:

5.
+    /* Process local conflict tuple list and prepare a array of JSON. */
+    foreach(lc, conflicttuples)
     {
-        tableslot = table_slot_create(localrel, &estate->es_tupleTable);
-        tableslot = ExecCopySlot(tableslot, slot);
+        ConflictTupleInfo *conflicttuple = (ConflictTupleInfo *) lfirst(lc);

5a.
typo in comment: /a array/an array/

~

5b.
SUGGESTION
foreach_ptr(ConflictTupleInfo, conflicttuple, conflicttuples)
{

~~~

6.
+    i = 0;
+    foreach(lc, json_datums)
+    {
+        json_datum_array[i] = (Datum) lfirst(lc);
+        json_null_array[i] = false;
+        i++;
+    }

6a.
The loop seemed to be unnecessarily complicated since you already know the size. Isn't it the same as below?

SUGGESTION
for (int i = 0; i < num_conflicts; i++)
{
    json_datum_array[i] = (Datum) list_nth(json_datums, i);
    json_null_array[i] = false;
}

6b.
Also, there is probably no need to do json_null_array[i] = false; at every iteration here, because you could have just used palloc0 for the whole array in the first place.

======
src/test/regress/expected/subscription.out

7.
+-- check if the table exists and has the correct schema (15 columns)
+SELECT count(*) FROM pg_attribute WHERE attrelid = 'public.regress_conflict_log1'::regclass AND attnum > 0;
+ count
+-------
+    11
+(1 row)
+

That comment is wrong; there aren't 15 columns anymore.

~~~

8.
(mentioned in a previous review)
I felt that \dRs should display the CLT's schema name in the "Conflict log table" field -- at least when it's not "public". Otherwise, it won't be easy for the user to know it. I did not see a test case for this.

~~~

9.
(mentioned in a previous review)
You could have another test case to explicitly call the function pg_relation_is_publishable(clt) to verify it returns false for a CLT table.

======
Kind Regards, Peter Smith. Fujitsu Australia -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-04T10:50:22Z
On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 7:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > The patch utilizes SPI for creating and dropping the conflict history > > table, but I'm really not sure if it's okay because it's actually > > affected by some GUC parameters such as default_tablespace and > > default_toast_compression etc. Also, probably some hooks and event > > triggers could be fired during the creation and removal. Is it > > intentional behavior? I'm concerned that it would make investigation > > harder if an issue happened in the user environment. > > Hmm, interesting point, well we can control the value of default > parameters while creating the table using SPI, but I don't see any > reason to not use heap_create_with_catalog() directly, so maybe that's > a better choice than using SPI because then we don't need to bother > about any event triggers/utility hooks etc. Although I don't see any > specific issue with that, unless the user intentionally wants to > create trouble while creating this table. What do others think about > it? > > > --- > > + /* build and execute the CREATE TABLE query. */ > > + appendStringInfo(&querybuf, > > + "CREATE TABLE %s.%s (" > > + "relid Oid," > > + "schemaname TEXT," > > + "relname TEXT," > > + "conflict_type TEXT," > > + "remote_xid xid," > > + "remote_commit_lsn pg_lsn," > > + "remote_commit_ts TIMESTAMPTZ," > > + "remote_origin TEXT," > > + "key_tuple JSON," > > + "remote_tuple JSON," > > + "local_conflicts JSON[])", > > + quote_identifier(get_namespace_name(namespaceId)), > > + quote_identifier(conflictrel)); > > > > If we want to use SPI for history table creation, we should use > > qualified names in all the places including data types. > > That's true, so that we can avoid interference of any user created types. > > > --- > > The patch doesn't create the dependency between the subscription and > > the conflict history table. 
So users can entirely drop the schema > > (with CASCADE option) where the history table is created. > > I think as part of the initial discussion we thought since it is > created under the subscription owner privileges so only that user can > drop that table and if the user intentionally drops the table the > conflict will not be recorded in the table and that's acceptable. But > now I think it would be a good idea to maintain the dependency with > subscription so that users can not drop it without dropping the > subscription. > Yeah, it seems reasonable to maintain its dependency with the subscription in this model. BTW, for this it would be easier to record dependency, if we use heap_create_with_catalog() as we do for create_toast_table(). The other places where we use SPI interface to execute statements are either the places where we need to execute multiple SQL statements or non-CREATE Table statements. So, for this patch's purpose, I feel heap_create_with_catalog() suits more. I was also thinking whether it is a good idea to create one global conflict table and let all subscriptions use it. However, it has disadvantages like whenever, user drops any subscription, we need to DELETE all conflict rows for that subscription causing the need for vacuum. Then we somehow need to ensure that conflicts from one subscription_owner are not visible to other subscription_owner via some RLS policy. So, catalog table per-subscription (aka) the current way appears better. Also, shall we give the option to the user where she wants to see conflict/resolution information? One idea to achieve the same is to provide subscription options like (a) conflict_resolution_format, the values could be log and table for now, in future, one could extend it to other options like xml, json, etc. 
(b) conflict_log_table: in this user can specify the conflict table name, this can be optional such that if user omits this and conflict_resolution_format is table, then we will use internally generated table name like pg_conflicts_<subscription_id>. > And once > > dropping the schema along with the history table, ALTER SUBSCRIPTION > > ... SET (conflict_history_table = '') seems not to work (I got a > > SEGV). > > I will check this, thanks > > > --- > > We can create the history table in pg_temp namespace but it should not > > be allowed. > > Right, will check this and also add the test for the same. > > > --- > > I think the conflict history table should not be transferred to the > > new cluster when pg_upgrade since the table definition could be > > different across major versions. > > Let me think more on this with respect to behaviour of other factors > like subscriptions etc. > Can we deal with different schema of tables across versions via pg_dump/restore during upgrade? -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-04T14:35:31Z
On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > relid | 16391 > > > schemaname | public > > > relname | conf_tab > > > conflict_type | multiple_unique_conflicts > > > remote_xid | 761 > > > remote_commit_lsn | 0/01761400 > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > remote_origin | pg_16406 > > > key_tuple | > > > remote_tuple | {"a":2,"b":3,"c":4} > > > local_conflicts | > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > brief note, perhaps in the commit message for now, describing how to > > access or read this array column? We can remove it later. > > Thanks, okay, temporarily I have added in a commit message how we can > fetch the data from the JSON array field. In next version I will add > a test to get the conflict stored in conflict log history table and > fetch from it. I noticed that the table structure can get changed by the time the conflict record is prepared. In ReportApplyConflict(), the code currently prepares the conflict log tuple before deciding whether the insertion will be immediate or deferred: + /* Insert conflict details to conflict log table. */ + if (conflictlogrel) + { + /* + * Prepare the conflict log tuple. If the error level is below ERROR, + * insert it immediately. Otherwise, defer the insertion to a new + * transaction after the current one aborts, ensuring the insertion of + * the log tuple is not rolled back. 
+ */ + prepare_conflict_log_tuple(estate, + relinfo->ri_RelationDesc, + conflictlogrel, + type, + searchslot, + conflicttuples, + remoteslot); + if (elevel < ERROR) + InsertConflictLogTuple(conflictlogrel); + + table_close(conflictlogrel, RowExclusiveLock); + } If the conflict history table definition is changed just before prepare_conflict_log_tuple, the tuple creation will crash: Program received signal SIGSEGV, Segmentation fault. 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at ../../../../src/include/varatt.h:419 419 return VARATT_IS_4B_U(PTR) && (gdb) bt #0 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at ../../../../src/include/varatt.h:419 #1 0x00005a342e01e5ed in heap_compute_data_size (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20, isnull=0x7ffd7af3ad15) at heaptuple.c:239 #2 0x00005a342e0200dd in heap_form_tuple (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20, isnull=0x7ffd7af3ad15) at heaptuple.c:1158 #3 0x00005a342e55e8c2 in prepare_conflict_log_tuple (estate=0x5a3467944530, rel=0x7ab405e594e8, conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS, searchslot=0x0, conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936 #4 0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530, relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS, searchslot=0x0, remoteslot=0x5a346792e498, conflicttuples=0x5a3467942da0) at conflict.c:168 #5 0x00005a342e348c35 in CheckAndReportConflict (resultRelInfo=0x5a346792e778, estate=0x5a3467944530, type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0, remoteslot=0x5a346792e498) at execReplication.c:793 This can be reproduced by the following steps: CREATE PUBLICATION pub; CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict'); ALTER TABLE conflict RENAME TO conflict1; CREATE TABLE conflict(c1 varchar, c2 varchar); -- Cause a conflict, this will crash while trying to prepare the conflicting tuple Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-05T03:54:18Z
On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > relid | 16391 > > > schemaname | public > > > relname | conf_tab > > > conflict_type | multiple_unique_conflicts > > > remote_xid | 761 > > > remote_commit_lsn | 0/01761400 > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > remote_origin | pg_16406 > > > key_tuple | > > > remote_tuple | {"a":2,"b":3,"c":4} > > > local_conflicts | > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > brief note, perhaps in the commit message for now, describing how to > > access or read this array column? We can remove it later. > > Thanks, okay, temporarily I have added in a commit message how we can > fetch the data from the JSON array field. In next version I will add > a test to get the conflict stored in conflict log history table and > fetch from it. Few comments: 1) Currently pg_dump is not dumping conflict_log_table option, I felt it should be included while dumping. 
2) Is there a way to unset the conflict log table after we create the subscription with the conflict_log_table option? 3) Is there any reason why this table should not be allowed to be added to a publication: + /* Can't be conflict log table */ + if (IsConflictLogTable(RelationGetRelid(targetrel))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot add relation \"%s.%s\" to publication", + get_namespace_name(RelationGetNamespace(targetrel)), + RelationGetRelationName(targetrel)), + errdetail("This operation is not supported for conflict log tables."))); Is the reason that the same table can be a conflict log table on the subscriber, and this is meant to prevent corruption on the subscriber? 4) I did not find any documentation for this feature. Can we include documentation in create_subscription.sgml, alter_subscription.sgml and logical_replication.sgml? Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-05T05:09:47Z
On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > relid | 16391 > > > > schemaname | public > > > > relname | conf_tab > > > > conflict_type | multiple_unique_conflicts > > > > remote_xid | 761 > > > > remote_commit_lsn | 0/01761400 > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > remote_origin | pg_16406 > > > > key_tuple | > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > local_conflicts | > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > brief note, perhaps in the commit message for now, describing how to > > > access or read this array column? We can remove it later. > > > > Thanks, okay, temporarily I have added in a commit message how we can > > fetch the data from the JSON array field. In next version I will add > > a test to get the conflict stored in conflict log history table and > > fetch from it. > > I noticed that the table structure can get changed by the time the > conflict record is prepared. In ReportApplyConflict(), the code > currently prepares the conflict log tuple before deciding whether the > insertion will be immediate or deferred: > + /* Insert conflict details to conflict log table. */ > + if (conflictlogrel) > + { > + /* > + * Prepare the conflict log tuple. If the error level > is below ERROR, > + * insert it immediately. 
Otherwise, defer the > insertion to a new > + * transaction after the current one aborts, ensuring > the insertion of > + * the log tuple is not rolled back. > + */ > + prepare_conflict_log_tuple(estate, > + > relinfo->ri_RelationDesc, > + > conflictlogrel, > + type, > + searchslot, > + > conflicttuples, > + remoteslot); > + if (elevel < ERROR) > + InsertConflictLogTuple(conflictlogrel); > + > + table_close(conflictlogrel, RowExclusiveLock); > + } > > If the conflict history table defintion is changed just before > prepare_conflict_log_tuple, the tuple creation will crash: > Program received signal SIGSEGV, Segmentation fault. > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > ../../../../src/include/varatt.h:419 > 419 return VARATT_IS_4B_U(PTR) && > (gdb) bt > #0 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > ../../../../src/include/varatt.h:419 > #1 0x00005a342e01e5ed in heap_compute_data_size > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20, > isnull=0x7ffd7af3ad15) at heaptuple.c:239 > #2 0x00005a342e0200dd in heap_form_tuple > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20, > isnull=0x7ffd7af3ad15) at heaptuple.c:1158 > #3 0x00005a342e55e8c2 in prepare_conflict_log_tuple > (estate=0x5a3467944530, rel=0x7ab405e594e8, > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS, > searchslot=0x0, > conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936 > #4 0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530, > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS, > searchslot=0x0, remoteslot=0x5a346792e498, > conflicttuples=0x5a3467942da0) at conflict.c:168 > #5 0x00005a342e348c35 in CheckAndReportConflict > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530, > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0, > remoteslot=0x5a346792e498) at execReplication.c:793 > > This can be reproduced by the following steps: > CREATE PUBLICATION pub; > CREATE SUBSCRIPTION 
sub ... WITH (conflict_log_table = 'conflict'); > ALTER TABLE conflict RENAME TO conflict1: > CREATE TABLE conflict(c1 varchar, c2 varchar); > -- Cause a conflict, this will crash while trying to prepare the > conflicting tuple Yeah, while it is allowed to drop or alter the conflict log table, it should not segfault. IMHO an error is acceptable, as per the initial discussion, so I will look into this and tighten up the logic so that it throws an error whenever it cannot insert into the conflict log table. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-05T05:16:44Z
On Fri, Dec 5, 2025 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > relid | 16391 > > > > schemaname | public > > > > relname | conf_tab > > > > conflict_type | multiple_unique_conflicts > > > > remote_xid | 761 > > > > remote_commit_lsn | 0/01761400 > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > remote_origin | pg_16406 > > > > key_tuple | > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > local_conflicts | > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > brief note, perhaps in the commit message for now, describing how to > > > access or read this array column? We can remove it later. > > > > Thanks, okay, temporarily I have added in a commit message how we can > > fetch the data from the JSON array field. In next version I will add > > a test to get the conflict stored in conflict log history table and > > fetch from it. > > Few comments: > 1) Currently pg_dump is not dumping conflict_log_table option, I felt > it should be included while dumping. Yeah, we should. > 2) Is there a way to unset the conflict log table after we create the > subscription with conflict_log_table option IMHO we can use ALTER SUBSCRIPTION...WITH(conflict_log_table='') to unset it? What do others think about it?
> 3) Any reason why this table should not be allowed to add to a publication: > + /* Can't be conflict log table */ > + if (IsConflictLogTable(RelationGetRelid(targetrel))) > + ereport(ERROR, > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("cannot add relation \"%s.%s\" > to publication", > + > get_namespace_name(RelationGetNamespace(targetrel)), > + > RelationGetRelationName(targetrel)), > + errdetail("This operation is not > supported for conflict log tables."))); > > Is the reason like the same table can be a conflict table in the > subscriber and prevent corruption in the subscriber The main reason was that, since these tables are internally created for maintaining the conflict information which is very much internal node specific details, so there is no reason someone want to replicate those tables, so we blocked it with ALL TABLES option and then based on suggestion from Shveta we blocked it from getting added to publication as well. So there is no strong reason to disallow from forcefully getting added to publication OTOH there is no reason why someone wants to do that considering those are internally managed tables. > 4) I did not find any documentation for this feature, can we include > documentation in create_subscription.sgml, alter_subscription.sgml and > logical_replication.sgml Yeah, in the initial version I posted a doc patch, but since we are doing changes in the first patch and also some behavior might change so I will postpone it for a later stage after we have consensus on most of the behaviour. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-05T09:29:54Z
On Fri, Dec 5, 2025 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 5, 2025 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > 2) Is there a way to unset the conflict log table after we create the > > subscription with conflict_log_table option > > IMHO we can use ALTER SUBSCRIPTION...WITH(conflict_log_table='') so > unset? What do others think about it? > We already have a syntax: ALTER SUBSCRIPTION name SET ( subscription_parameter [= value] [, ... ] ) which can be used to set/unset this new subscription option. > > 3) Any reason why this table should not be allowed to add to a publication: > > + /* Can't be conflict log table */ > > + if (IsConflictLogTable(RelationGetRelid(targetrel))) > > + ereport(ERROR, > > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > + errmsg("cannot add relation \"%s.%s\" > > to publication", > > + > > get_namespace_name(RelationGetNamespace(targetrel)), > > + > > RelationGetRelationName(targetrel)), > > + errdetail("This operation is not > > supported for conflict log tables."))); > > > > Is the reason like the same table can be a conflict table in the > > subscriber and prevent corruption in the subscriber > > The main reason was that, since these tables are internally created > for maintaining the conflict information which is very much internal > node specific details, so there is no reason someone want to replicate > those tables, so we blocked it with ALL TABLES option and then based > on suggestion from Shveta we blocked it from getting added to > publication as well. So there is no strong reason to disallow from > forcefully getting added to publication OTOH there is no reason why > someone wants to do that considering those are internally managed > tables. > I also don't see any reason to allow such internal tables to be replicated. So, it is okay to prohibit them for now. If we see any use case, we can allow it. -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-05T09:55:07Z
On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Also, shall we give the option to the user where she wants to see > conflict/resolution information? One idea to achieve the same is to > provide subscription options like (a) conflict_resolution_format, the > values could be log and table for now, in future, one could extend it > to other options like xml, json, etc. (b) conflict_log_table: in this > user can specify the conflict table name, this can be optional such > that if user omits this and conflict_resolution_format is table, then > we will use internally generated table name like > pg_conflicts_<subscription_id>. > In this idea, we can keep the name of the second option as conflict_log_name instead of conflict_log_table. This can help us LOG the conflicts in a totally separate conflict file instead of in server log. Say, the user provides conflict_resolution_format as 'log' and conflict_log_name as 'conflict_report' then we can report conflicts in this separate file by appending subid to distinguish it. And, if the user gives only the first option conflict_resolution_format as 'log' then we can keep reporting the information in server log files. -- With Regards, Amit Kapila.
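To make the option semantics proposed above concrete, here is a minimal sketch in C of how the two options might map to a reporting destination. It is not taken from the patch: the ConflictDest enum, the function name resolve_conflict_destination, and the exact shape of the per-subscription file name (given name, underscore, subscription OID) are assumptions for illustration only. The pg_conflicts_<subscription_id> default is the one part that comes directly from the proposal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical destinations; none of these names come from the patch. */
typedef enum ConflictDest
{
    CONFLICT_DEST_SERVER_LOG,   /* format = log, no name given */
    CONFLICT_DEST_FILE,         /* format = log, name given */
    CONFLICT_DEST_TABLE,        /* format = table */
    CONFLICT_DEST_INVALID       /* unrecognized format */
} ConflictDest;

/*
 * Resolve where conflicts would be reported.
 *
 * "format" is the value of conflict_resolution_format, "name" is the value
 * of conflict_log_name (NULL or "" when omitted), and "subid" is the
 * subscription OID.  On return, "target" holds the table or file name, or
 * an empty string when the server log is used.  "targetlen" must be > 0.
 */
static ConflictDest
resolve_conflict_destination(const char *format, const char *name,
                             unsigned int subid,
                             char *target, size_t targetlen)
{
    int         have_name = (name != NULL && name[0] != '\0');

    target[0] = '\0';

    if (strcmp(format, "table") == 0)
    {
        if (have_name)
            snprintf(target, targetlen, "%s", name);
        else
            /* Default table name suggested in the proposal. */
            snprintf(target, targetlen, "pg_conflicts_%u", subid);
        return CONFLICT_DEST_TABLE;
    }

    if (strcmp(format, "log") == 0)
    {
        if (!have_name)
            return CONFLICT_DEST_SERVER_LOG;

        /* Assumed naming: append the subscription OID to the given name. */
        snprintf(target, targetlen, "%s_%u", name, subid);
        return CONFLICT_DEST_FILE;
    }

    return CONFLICT_DEST_INVALID;
}
```

With format 'log', name 'conflict_report' and subscription OID 16406, this sketch yields the file name conflict_report_16406; with format 'table' and no name, it yields pg_conflicts_16406. Keeping the format and the name as separate inputs is what lets the same name option serve both the table and the file case.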
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-05T10:13:38Z
On Fri, Dec 5, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Also, shall we give the option to the user where she wants to see > > conflict/resolution information? One idea to achieve the same is to > > provide subscription options like (a) conflict_resolution_format, the > > values could be log and table for now, in future, one could extend it > > to other options like xml, json, etc. (b) conflict_log_table: in this > > user can specify the conflict table name, this can be optional such > > that if user omits this and conflict_resolution_format is table, then > > we will use internally generated table name like > > pg_conflicts_<subscription_id>. > > > > In this idea, we can keep the name of the second option as > conflict_log_name instead of conflict_log_table. This can help us LOG > the conflicts in a totally separate conflict file instead of in server > log. Say, the user provides conflict_resolution_format as 'log' and > conflict_log_name as 'conflict_report' then we can report conflicts in > this separate file by appending subid to distinguish it. And, if the > user gives only the first option conflict_resolution_format as 'log' > then we can keep reporting the information in server log files. > +1 on the idea. Instead of using conflict_resolution_format, I feel it should be conflict_log_format as we are referring to LOGs and not resolutions. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-06T04:50:14Z
On Fri, Dec 5, 2025 at 3:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Also, shall we give the option to the user where she wants to see > > conflict/resolution information? One idea to achieve the same is to > > provide subscription options like (a) conflict_resolution_format, the > > values could be log and table for now, in future, one could extend it > > to other options like xml, json, etc. (b) conflict_log_table: in this > > user can specify the conflict table name, this can be optional such > > that if user omits this and conflict_resolution_format is table, then > > we will use internally generated table name like > > pg_conflicts_<subscription_id>. > > > > In this idea, we can keep the name of the second option as > conflict_log_name instead of conflict_log_table. This can help us LOG > the conflicts in a totally separate conflict file instead of in server > log. Say, the user provides conflict_resolution_format as 'log' and > conflict_log_name as 'conflict_report' then we can report conflicts in > this separate file by appending subid to distinguish it. And, if the > user gives only the first option conflict_resolution_format as 'log' > then we can keep reporting the information in server log files. Yeah that looks good, so considering the extensibility I think we can keep the option name as 'conflict_log_name' from the first version itself even if we don't provide all the options in the first version. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-06T15:06:30Z
On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > relid | 16391 > > > > > schemaname | public > > > > > relname | conf_tab > > > > > conflict_type | multiple_unique_conflicts > > > > > remote_xid | 761 > > > > > remote_commit_lsn | 0/01761400 > > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > > remote_origin | pg_16406 > > > > > key_tuple | > > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > > local_conflicts | > > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > > brief note, perhaps in the commit message for now, describing how to > > > > access or read this array column? We can remove it later. > > > > > > Thanks, okay, temporarily I have added in a commit message how we can > > > fetch the data from the JSON array field. In next version I will add > > > a test to get the conflict stored in conflict log history table and > > > fetch from it. > > > > I noticed that the table structure can get changed by the time the > > conflict record is prepared. In ReportApplyConflict(), the code > > currently prepares the conflict log tuple before deciding whether the > > insertion will be immediate or deferred: > > + /* Insert conflict details to conflict log table. 
*/ > > + if (conflictlogrel) > > + { > > + /* > > + * Prepare the conflict log tuple. If the error level > > is below ERROR, > > + * insert it immediately. Otherwise, defer the > > insertion to a new > > + * transaction after the current one aborts, ensuring > > the insertion of > > + * the log tuple is not rolled back. > > + */ > > + prepare_conflict_log_tuple(estate, > > + > > relinfo->ri_RelationDesc, > > + > > conflictlogrel, > > + type, > > + searchslot, > > + > > conflicttuples, > > + remoteslot); > > + if (elevel < ERROR) > > + InsertConflictLogTuple(conflictlogrel); > > + > > + table_close(conflictlogrel, RowExclusiveLock); > > + } > > > > If the conflict history table defintion is changed just before > > prepare_conflict_log_tuple, the tuple creation will crash: > > Program received signal SIGSEGV, Segmentation fault. > > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > ../../../../src/include/varatt.h:419 > > 419 return VARATT_IS_4B_U(PTR) && > > (gdb) bt > > #0 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > ../../../../src/include/varatt.h:419 > > #1 0x00005a342e01e5ed in heap_compute_data_size > > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > isnull=0x7ffd7af3ad15) at heaptuple.c:239 > > #2 0x00005a342e0200dd in heap_form_tuple > > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > isnull=0x7ffd7af3ad15) at heaptuple.c:1158 > > #3 0x00005a342e55e8c2 in prepare_conflict_log_tuple > > (estate=0x5a3467944530, rel=0x7ab405e594e8, > > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS, > > searchslot=0x0, > > conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936 > > #4 0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530, > > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS, > > searchslot=0x0, remoteslot=0x5a346792e498, > > conflicttuples=0x5a3467942da0) at conflict.c:168 > > #5 0x00005a342e348c35 in CheckAndReportConflict > > (resultRelInfo=0x5a346792e778, 
estate=0x5a3467944530, > > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0, > > remoteslot=0x5a346792e498) at execReplication.c:793 > > > > This can be reproduced by the following steps: > > CREATE PUBLICATION pub; > > CREATE SUBSCRIPTION sub ... WITH (conflict_log_table = 'conflict'); > > ALTER TABLE conflict RENAME TO conflict1: > > CREATE TABLE conflict(c1 varchar, c2 varchar); > > -- Cause a conflict, this will crash while trying to prepare the > > conflicting tuple > > Yeah while it is allowed to drop or alter the conflict log table, it > should not seg fault, IMHO error is acceptable as per the initial > discussion, so I will look into this and tighten up the logic so that > it will throw an error whenever it can not insert into the conflict > log table. I was thinking about what we need to do if the table definition is changed. One option: whenever we prepare the tuple, after acquiring the lock, we validate the table definition, and if it doesn't match the standard conflict log table schema, we ERROR out. IMHO that should not be an issue, as we are only doing this during conflict logging. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-08T03:42:40Z
On Sat, 6 Dec 2025 at 20:36, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > relid | 16391 > > > > > > schemaname | public > > > > > > relname | conf_tab > > > > > > conflict_type | multiple_unique_conflicts > > > > > > remote_xid | 761 > > > > > > remote_commit_lsn | 0/01761400 > > > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > > > remote_origin | pg_16406 > > > > > > key_tuple | > > > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > > > local_conflicts | > > > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > > > brief note, perhaps in the commit message for now, describing how to > > > > > access or read this array column? We can remove it later. > > > > > > > > Thanks, okay, temporarily I have added in a commit message how we can > > > > fetch the data from the JSON array field. In next version I will add > > > > a test to get the conflict stored in conflict log history table and > > > > fetch from it. > > > > > > I noticed that the table structure can get changed by the time the > > > conflict record is prepared. 
In ReportApplyConflict(), the code > > > currently prepares the conflict log tuple before deciding whether the > > > insertion will be immediate or deferred: > > > + /* Insert conflict details to conflict log table. */ > > > + if (conflictlogrel) > > > + { > > > + /* > > > + * Prepare the conflict log tuple. If the error level > > > is below ERROR, > > > + * insert it immediately. Otherwise, defer the > > > insertion to a new > > > + * transaction after the current one aborts, ensuring > > > the insertion of > > > + * the log tuple is not rolled back. > > > + */ > > > + prepare_conflict_log_tuple(estate, > > > + > > > relinfo->ri_RelationDesc, > > > + > > > conflictlogrel, > > > + type, > > > + searchslot, > > > + > > > conflicttuples, > > > + remoteslot); > > > + if (elevel < ERROR) > > > + InsertConflictLogTuple(conflictlogrel); > > > + > > > + table_close(conflictlogrel, RowExclusiveLock); > > > + } > > > > > > If the conflict history table defintion is changed just before > > > prepare_conflict_log_tuple, the tuple creation will crash: > > > Program received signal SIGSEGV, Segmentation fault. 
> > > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > > ../../../../src/include/varatt.h:419 > > > 419 return VARATT_IS_4B_U(PTR) && > > > (gdb) bt > > > #0 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > > ../../../../src/include/varatt.h:419 > > > #1 0x00005a342e01e5ed in heap_compute_data_size > > > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > > isnull=0x7ffd7af3ad15) at heaptuple.c:239 > > > #2 0x00005a342e0200dd in heap_form_tuple > > > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > > isnull=0x7ffd7af3ad15) at heaptuple.c:1158 > > > #3 0x00005a342e55e8c2 in prepare_conflict_log_tuple > > > (estate=0x5a3467944530, rel=0x7ab405e594e8, > > > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS, > > > searchslot=0x0, > > > conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936 > > > #4 0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530, > > > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS, > > > searchslot=0x0, remoteslot=0x5a346792e498, > > > conflicttuples=0x5a3467942da0) at conflict.c:168 > > > #5 0x00005a342e348c35 in CheckAndReportConflict > > > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530, > > > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0, > > > remoteslot=0x5a346792e498) at execReplication.c:793 > > > > > > This can be reproduced by the following steps: > > > CREATE PUBLICATION pub; > > > CREATE SUBSCRIPTION sub ... 
WITH (conflict_log_table = 'conflict'); > > > ALTER TABLE conflict RENAME TO conflict1; > > > CREATE TABLE conflict(c1 varchar, c2 varchar); > > > -- Cause a conflict, this will crash while trying to prepare the > > > conflicting tuple > > > > Yeah while it is allowed to drop or alter the conflict log table, it > > should not seg fault, IMHO error is acceptable as per the initial > > discussion, so I will look into this and tighten up the logic so that > > it will throw an error whenever it can not insert into the conflict > > log table. > > I was thinking about the solution that we need to do if table > definition is changed, one option is whenever we try to prepare the > tuple after acquiring the lock we can validate the table definition if > this doesn't qualify the standard conflict log table schema we can > ERROR out. IMHO that should not be an issue as we are only doing this > in conflict logging. Should we emit a warning instead of an error, to stay consistent with the other exception case, where a warning is raised when the conflict log table does not exist? + /* Conflict log table is dropped or not accessible. */ + if (conflictlogrel == NULL) + ereport(WARNING, + (errcode(ERRCODE_UNDEFINED_TABLE), + errmsg("conflict log table \"%s.%s\" does not exist", + get_namespace_name(nspid), conflictlogtable))); Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T04:06:43Z
On Mon, Dec 8, 2025 at 9:12 AM vignesh C <vignesh21@gmail.com> wrote: > > On Sat, 6 Dec 2025 at 20:36, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Dec 5, 2025 at 10:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Dec 4, 2025 at 8:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > On Wed, 3 Dec 2025 at 16:57, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Wed, Dec 3, 2025 at 9:49 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > > > relid | 16391 > > > > > > > schemaname | public > > > > > > > relname | conf_tab > > > > > > > conflict_type | multiple_unique_conflicts > > > > > > > remote_xid | 761 > > > > > > > remote_commit_lsn | 0/01761400 > > > > > > > remote_commit_ts | 2025-12-02 15:02:07.045935+00 > > > > > > > remote_origin | pg_16406 > > > > > > > key_tuple | > > > > > > > remote_tuple | {"a":2,"b":3,"c":4} > > > > > > > local_conflicts | > > > > > > > {"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\" > > > > > > > 773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T > > > > > > > 15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"} > > > > > > > > > > > > > > > > > > > Thanks, it looks good. For the benefit of others, could you include a > > > > > > brief note, perhaps in the commit message for now, describing how to > > > > > > access or read this array column? We can remove it later. > > > > > > > > > > Thanks, okay, temporarily I have added in a commit message how we can > > > > > fetch the data from the JSON array field. In next version I will add > > > > > a test to get the conflict stored in conflict log history table and > > > > > fetch from it. 
> > > > > > > > I noticed that the table structure can get changed by the time the > > > > conflict record is prepared. In ReportApplyConflict(), the code > > > > currently prepares the conflict log tuple before deciding whether the > > > > insertion will be immediate or deferred: > > > > + /* Insert conflict details to conflict log table. */ > > > > + if (conflictlogrel) > > > > + { > > > > + /* > > > > + * Prepare the conflict log tuple. If the error level > > > > is below ERROR, > > > > + * insert it immediately. Otherwise, defer the > > > > insertion to a new > > > > + * transaction after the current one aborts, ensuring > > > > the insertion of > > > > + * the log tuple is not rolled back. > > > > + */ > > > > + prepare_conflict_log_tuple(estate, > > > > + > > > > relinfo->ri_RelationDesc, > > > > + > > > > conflictlogrel, > > > > + type, > > > > + searchslot, > > > > + > > > > conflicttuples, > > > > + remoteslot); > > > > + if (elevel < ERROR) > > > > + InsertConflictLogTuple(conflictlogrel); > > > > + > > > > + table_close(conflictlogrel, RowExclusiveLock); > > > > + } > > > > > > > > If the conflict history table defintion is changed just before > > > > prepare_conflict_log_tuple, the tuple creation will crash: > > > > Program received signal SIGSEGV, Segmentation fault. 
> > > > 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > > > ../../../../src/include/varatt.h:419 > > > > 419 return VARATT_IS_4B_U(PTR) && > > > > (gdb) bt > > > > #0 0x00005a342e01df4f in VARATT_CAN_MAKE_SHORT (PTR=0x4000) at > > > > ../../../../src/include/varatt.h:419 > > > > #1 0x00005a342e01e5ed in heap_compute_data_size > > > > (tupleDesc=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > > > isnull=0x7ffd7af3ad15) at heaptuple.c:239 > > > > #2 0x00005a342e0200dd in heap_form_tuple > > > > (tupleDescriptor=0x7ab405e5dda8, values=0x7ffd7af3ad20, > > > > isnull=0x7ffd7af3ad15) at heaptuple.c:1158 > > > > #3 0x00005a342e55e8c2 in prepare_conflict_log_tuple > > > > (estate=0x5a3467944530, rel=0x7ab405e594e8, > > > > conflictlogrel=0x7ab405e5da88, conflict_type=CT_INSERT_EXISTS, > > > > searchslot=0x0, > > > > conflicttuples=0x5a3467942da0, remoteslot=0x5a346792e498) at conflict.c:936 > > > > #4 0x00005a342e55cea6 in ReportApplyConflict (estate=0x5a3467944530, > > > > relinfo=0x5a346792e778, elevel=21, type=CT_INSERT_EXISTS, > > > > searchslot=0x0, remoteslot=0x5a346792e498, > > > > conflicttuples=0x5a3467942da0) at conflict.c:168 > > > > #5 0x00005a342e348c35 in CheckAndReportConflict > > > > (resultRelInfo=0x5a346792e778, estate=0x5a3467944530, > > > > type=CT_INSERT_EXISTS, recheckIndexes=0x5a3467942648, searchslot=0x0, > > > > remoteslot=0x5a346792e498) at execReplication.c:793 > > > > > > > > This can be reproduced by the following steps: > > > > CREATE PUBLICATION pub; > > > > CREATE SUBSCRIPTION sub ... 
WITH (conflict_log_table = 'conflict'); > > > > ALTER TABLE conflict RENAME TO conflict1; > > > > CREATE TABLE conflict(c1 varchar, c2 varchar); > > > > -- Cause a conflict, this will crash while trying to prepare the > > > > conflicting tuple > > > > > > Yeah while it is allowed to drop or alter the conflict log table, it > > > should not seg fault, IMHO error is acceptable as per the initial > > > discussion, so I will look into this and tighten up the logic so that > > > it will throw an error whenever it can not insert into the conflict > > > log table. > > > > I was thinking about the solution that we need to do if table > > definition is changed, one option is whenever we try to prepare the > > tuple after acquiring the lock we can validate the table definition if > > this doesn't qualify the standard conflict log table schema we can > > ERROR out. IMHO that should not be an issue as we are only doing this > > in conflict logging. > > Should we emit a warning instead of error, to stay consistent with the > other exception case where a warning is raised when the conflict log > table does not exist? > + /* Conflict log table is dropped or not accessible. */ > + if (conflictlogrel == NULL) > + ereport(WARNING, > + (errcode(ERRCODE_UNDEFINED_TABLE), > + errmsg("conflict log table \"%s.%s\" does not exist", > + get_namespace_name(nspid), conflictlogtable))); Yes, this should be a WARNING. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T04:55:19Z
On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > --- > > > I think the conflict history table should not be transferred to the > > > new cluster when pg_upgrade since the table definition could be > > > different across major versions. > > > > Let me think more on this with respect to behaviour of other factors > > like subscriptions etc. > > > > Can we deal with different schema of tables across versions via > pg_dump/restore during upgrade? > While handling the conflict_log_table option in pg_dump, I realized that the restore tries to create the conflict log table in two different places: 1) as part of the regular table dump, and 2) as part of CREATE SUBSCRIPTION when the conflict_log_table option is set. So one option is to avoid dumping the conflict log tables as part of the regular table dump, if we think that we do not need the conflict log table data, and let the table get created as part of the CREATE SUBSCRIPTION command. OTOH, if we think we want to keep the conflict log table data, we can let it get dumped as part of the regular tables, and in CREATE SUBSCRIPTION we will just set the option but not create the table. We might need special handling for this case, though: if we allow existing tables to be set as conflict log tables, then it may allow other user tables to be set, so we need to think about how to handle this if we go with this option. -- Regards, Dilip Kumar Google
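The schema check proposed earlier in the thread (validate the conflict log table definition after taking the lock, and raise a WARNING rather than crash when it does not match) could look roughly like the sketch below. It is a standalone illustration, not code from the patch: `ColumnDef`, `expected_cols`, `conflict_log_schema_matches()` and `try_log_conflict()` are hypothetical names, the column names are taken from the sample row shown in this thread, the type names are assumptions, and the warning text is invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One column of a table definition: column name plus type name. */
typedef struct ColumnDef
{
    const char *name;
    const char *type;
} ColumnDef;

/*
 * Expected layout of the conflict log table.  Column names follow the
 * sample row in this thread; the type names are assumptions made only
 * for this illustration.
 */
static const ColumnDef expected_cols[] = {
    {"relid", "oid"},
    {"schemaname", "text"},
    {"relname", "text"},
    {"conflict_type", "text"},
    {"remote_xid", "xid"},
    {"remote_commit_lsn", "pg_lsn"},
    {"remote_commit_ts", "timestamptz"},
    {"remote_origin", "text"},
    {"key_tuple", "json"},
    {"remote_tuple", "json"},
    {"local_conflicts", "json[]"},
};

#define N_EXPECTED (sizeof(expected_cols) / sizeof(expected_cols[0]))

/* Return true only if the actual columns match the expected layout exactly. */
static bool
conflict_log_schema_matches(const ColumnDef *actual, size_t nactual)
{
    if (nactual != N_EXPECTED)
        return false;

    for (size_t i = 0; i < N_EXPECTED; i++)
    {
        if (strcmp(actual[i].name, expected_cols[i].name) != 0 ||
            strcmp(actual[i].type, expected_cols[i].type) != 0)
            return false;
    }
    return true;
}

/*
 * Stand-in for the logging path.  Returns true if the tuple would be
 * formed and inserted; on a schema mismatch it prints a warning and
 * skips logging instead of forming a tuple against the wrong descriptor.
 */
static bool
try_log_conflict(const char *tablename, const ColumnDef *actual, size_t nactual)
{
    if (!conflict_log_schema_matches(actual, nactual))
    {
        fprintf(stderr,
                "WARNING: conflict log table \"%s\" has an unexpected definition; conflict not logged\n",
                tablename);
        return false;
    }

    /* The real code would form and insert the conflict log tuple here. */
    return true;
}
```

With the table from the reproduction steps, `conflict(c1 varchar, c2 varchar)`, the check fails on the column count alone, so the tuple is never formed and the apply worker would only see a warning.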
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-08T09:08:32Z
On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > --- > > > > I think the conflict history table should not be transferred to the > > > > new cluster when pg_upgrade since the table definition could be > > > > different across major versions. > > > > > > Let me think more on this with respect to behaviour of other factors > > > like subscriptions etc. > > > > > > > Can we deal with different schema of tables across versions via > > pg_dump/restore during upgrade? > > > > While handling the case of conflict_log_table option during pg_dump, I > realized that the restore is trying to create conflict log table 2 > different places 1) As part of the regular table dump 2) As part of > the CREATE SUBSCRIPTION when conflict_log_table option is set. > > So one option is we can avoid dumping the conflict log tables as part > of the regular table dump if we think that we do not need to conflict > log table data and let it get created as part of the create > subscription command, OTOH if we think we want to keep the conflict > log table data, > We want to retain conflict_history after upgrade. This is required for various reasons: (a) after the upgrade, the DBA will still need to resolve the pending unresolved conflicts; (b) regulations often require keeping audit trails for a longer period of time. If a conflict regarding a financial transaction occurred at time X (which is within the retention period the regulations require), that record must survive the upgrade; (c) if something breaks after the upgrade (e.g., missing rows, constraint violations), conflict history helps trace root causes. It shows whether issues existed before the upgrade or were introduced during migration; (d) as users can query the conflict_history tables, they should be treated similarly to user tables.
BTW, we are also planning to migrate commit_ts data in thread [1], which would be helpful for conflict resolution after upgrade. > let it get dumped as part of the regular tables and in > CREATE SUBSCRIPTION we will just set the option but do not create the > table, > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it doesn't try to create the table again. > although we might need to do special handling of this case > because if we allow the existing tables to be set as conflict log > tables then it may allow other user tables to be set, so need to think > how to handle this if we need to go with this option. > Yeah, probably, but it should be allowed only internally, not to users. I think we can split this upgrade handling as a top-up patch at least for the purpose of review. [1] - https://www.postgresql.org/message-id/182311743703924%40mail.yandex.ru -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T09:30:47Z
On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > --- > > > > > I think the conflict history table should not be transferred to the > > > > > new cluster when pg_upgrade since the table definition could be > > > > > different across major versions. > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > like subscriptions etc. > > > > > > > > > > Can we deal with different schema of tables across versions via > > > pg_dump/restore during upgrade? > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > realized that the restore is trying to create conflict log table 2 > > different places 1) As part of the regular table dump 2) As part of > > the CREATE SUBSCRIPTION when conflict_log_table option is set. > > > > So one option is we can avoid dumping the conflict log tables as part > > of the regular table dump if we think that we do not need to conflict > > log table data and let it get created as part of the create > > subscription command, OTOH if we think we want to keep the conflict > > log table data, > > > > We want to retain conflict_history after upgrade. This is required for > various reasons (a) after upgrade DBA user will still require to > resolved the pending unresolved conflicts, (b) Regulations often > require keeping audit trails for a longer period of time. If a > conflict occurred at time X (which is less than the regulations > requirement) regarding a financial transaction, that record must > survive the upgrade, (c) > If something breaks after the upgrade (e.g., missing rows, constraint > violations), conflict history helps trace root causes. 
It shows > whether issues existed before the upgrade or were introduced during > migration, (d) as users can query the conflict_history tables, it > should be treated similar to user tables. > > BTW, we are also planning to migrate commit_ts data in thread [1] > which would be helpful for conflict_resolutions after upgrade. > > let it get dumped as part of the regular tables and in > > CREATE SUBSCRIPTION we will just set the option but do not create the > > table, > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > doesn't try to create the table again. > > > although we might need to do special handling of this case > > because if we allow the existing tables to be set as conflict log > > tables then it may allow other user tables to be set, so need to think > > how to handle this if we need to go with this option. > > > > Yeah, probably but it should be allowed internally only not to users. Yeah, I wanted to do that, but the problem is with dump and restore. I mean, if you just dump into an SQL file and execute that file, CREATE SUBSCRIPTION with the conflict_log_table option will fail, as the table already exists because it was restored as part of the dump. I know that under binary upgrade we have the binary_upgrade flag, so we can do special handling there, but I am not sure how to distinguish SQL executed as part of a restore from normal SQL executed by a user. > I think we can split this upgrade handling as a top-up patch at least > for the purpose of review. Makes sense. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-08T09:51:40Z
On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > --- > > > > > > I think the conflict history table should not be transferred to the > > > > > > new cluster when pg_upgrade since the table definition could be > > > > > > different across major versions. > > > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > > like subscriptions etc. > > > > > > > > > > > > > Can we deal with different schema of tables across versions via > > > > pg_dump/restore during upgrade? > > > > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > > realized that the restore is trying to create conflict log table 2 > > > different places 1) As part of the regular table dump 2) As part of > > > the CREATE SUBSCRIPTION when conflict_log_table option is set. > > > > > > So one option is we can avoid dumping the conflict log tables as part > > > of the regular table dump if we think that we do not need to conflict > > > log table data and let it get created as part of the create > > > subscription command, OTOH if we think we want to keep the conflict > > > log table data, > > > > > > > We want to retain conflict_history after upgrade. This is required for > > various reasons (a) after upgrade DBA user will still require to > > resolved the pending unresolved conflicts, (b) Regulations often > > require keeping audit trails for a longer period of time. 
If a > > conflict occurred at time X (which is less than the regulations > > requirement) regarding a financial transaction, that record must > > survive the upgrade, (c) > > If something breaks after the upgrade (e.g., missing rows, constraint > > violations), conflict history helps trace root causes. It shows > > whether issues existed before the upgrade or were introduced during > > migration, (d) as users can query the conflict_history tables, it > > should be treated similar to user tables. > > > > BTW, we are also planning to migrate commit_ts data in thread [1] > > which would be helpful for conflict_resolutions after upgrade. > > > > let it get dumped as part of the regular tables and in > > > CREATE SUBSCRIPTION we will just set the option but do not create the > > > table, > > > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > > doesn't try to create the table again. > > > > > although we might need to do special handling of this case > > > because if we allow the existing tables to be set as conflict log > > > tables then it may allow other user tables to be set, so need to think > > > how to handle this if we need to go with this option. > > > > > > > Yeah, probably but it should be allowed internally only not to users. > > Yeah I wanted to do that, but problem is with dump and restore, I mean > if you just dump into a sql file and execute the sql file at that time > the CREATE SUBSCRIPTION with conflict_log_table option will fail as > the table already exists because it was restored as part of the dump. > I know under binary upgrade we have binary_upgrade flag so can do > special handling not sure how to distinguish the sql executing as part > of the restore or normal sql execution by user? > See dumpSubscription(). We always use (connect = false) while dumping a subscription, so, similarly, we should always dump the new option with its default value, which is not to create the history table. Won't that be sufficient?
-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-08T11:45:37Z
On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > --- > > > > > > > I think the conflict history table should not be transferred to the > > > > > > > new cluster when pg_upgrade since the table definition could be > > > > > > > different across major versions. > > > > > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > > > like subscriptions etc. > > > > > > > > > > > > > > > > Can we deal with different schema of tables across versions via > > > > > pg_dump/restore during upgrade? > > > > > > > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > > > realized that the restore is trying to create conflict log table 2 > > > > different places 1) As part of the regular table dump 2) As part of > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set. > > > > > > > > So one option is we can avoid dumping the conflict log tables as part > > > > of the regular table dump if we think that we do not need to conflict > > > > log table data and let it get created as part of the create > > > > subscription command, OTOH if we think we want to keep the conflict > > > > log table data, > > > > > > > > > > We want to retain conflict_history after upgrade. This is required for > > > various reasons (a) after upgrade DBA user will still require to > > > resolved the pending unresolved conflicts, (b) Regulations often > > > require keeping audit trails for a longer period of time. 
If a > > > conflict occurred at time X (which is less than the regulations > > > requirement) regarding a financial transaction, that record must > > > survive the upgrade, (c) > > > If something breaks after the upgrade (e.g., missing rows, constraint > > > violations), conflict history helps trace root causes. It shows > > > whether issues existed before the upgrade or were introduced during > > > migration, (d) as users can query the conflict_history tables, it > > > should be treated similar to user tables. > > > > > > BTW, we are also planning to migrate commit_ts data in thread [1] > > > which would be helpful for conflict_resolutions after upgrade. > > > > > > let it get dumped as part of the regular tables and in > > > > CREATE SUBSCRIPTION we will just set the option but do not create the > > > > table, > > > > > > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > > > doesn't try to create the table again. > > > > > > > although we might need to do special handling of this case > > > > because if we allow the existing tables to be set as conflict log > > > > tables then it may allow other user tables to be set, so need to think > > > > how to handle this if we need to go with this option. > > > > > > > > > > Yeah, probably but it should be allowed internally only not to users. > > > > Yeah I wanted to do that, but problem is with dump and restore, I mean > > if you just dump into a sql file and execute the sql file at that time > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as > > the table already exists because it was restored as part of the dump. > > I know under binary upgrade we have binary_upgrade flag so can do > > special handling not sure how to distinguish the sql executing as part > > of the restore or normal sql execution by user? > > > > See dumpSubscription(). 
We always use (connect = false) while dumping > subscription, so, similarly, we should always dump the new option with > default value which not to create the history table. Won't that be > sufficient? Thinking out loud: basically, we need to create the subscription and set the conflict log table in the subscription's catalog entry in pg_subscription, but we do not want to create the conflict log table. So it seems we need to invent something new that sets the conflict log table in the catalog but does not create the table. Currently we have a single option: if conflict_log_table='table_name' is set, then we create the table as well as set the table name in the catalog. So we need to think of something along the lines of separating these, or something more innovative. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-09T04:42:11Z
On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > > --- > > > > > > > > I think the conflict history table should not be transferred to the > > > > > > > > new cluster when pg_upgrade since the table definition could be > > > > > > > > different across major versions. > > > > > > > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > > > > like subscriptions etc. > > > > > > > > > > > > > > > > > > > Can we deal with different schema of tables across versions via > > > > > > pg_dump/restore during upgrade? > > > > > > > > > > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > > > > realized that the restore is trying to create conflict log table 2 > > > > > different places 1) As part of the regular table dump 2) As part of > > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set. > > > > > > > > > > So one option is we can avoid dumping the conflict log tables as part > > > > > of the regular table dump if we think that we do not need to conflict > > > > > log table data and let it get created as part of the create > > > > > subscription command, OTOH if we think we want to keep the conflict > > > > > log table data, > > > > > > > > > > > > > We want to retain conflict_history after upgrade. 
This is required for > > > > various reasons (a) after upgrade DBA user will still require to > > > > resolved the pending unresolved conflicts, (b) Regulations often > > > > require keeping audit trails for a longer period of time. If a > > > > conflict occurred at time X (which is less than the regulations > > > > requirement) regarding a financial transaction, that record must > > > > survive the upgrade, (c) > > > > If something breaks after the upgrade (e.g., missing rows, constraint > > > > violations), conflict history helps trace root causes. It shows > > > > whether issues existed before the upgrade or were introduced during > > > > migration, (d) as users can query the conflict_history tables, it > > > > should be treated similar to user tables. > > > > > > > > BTW, we are also planning to migrate commit_ts data in thread [1] > > > > which would be helpful for conflict_resolutions after upgrade. > > > > > > > > let it get dumped as part of the regular tables and in > > > > > CREATE SUBSCRIPTION we will just set the option but do not create the > > > > > table, > > > > > > > > > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > > > > doesn't try to create the table again. > > > > > > > > > although we might need to do special handling of this case > > > > > because if we allow the existing tables to be set as conflict log > > > > > tables then it may allow other user tables to be set, so need to think > > > > > how to handle this if we need to go with this option. > > > > > > > > > > > > > Yeah, probably but it should be allowed internally only not to users. > > > > > > Yeah I wanted to do that, but problem is with dump and restore, I mean > > > if you just dump into a sql file and execute the sql file at that time > > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as > > > the table already exists because it was restored as part of the dump. 
> > > I know under binary upgrade we have binary_upgrade flag so can do > > > special handling not sure how to distinguish the sql executing as part > > > of the restore or normal sql execution by user? > > > > > > > See dumpSubscription(). We always use (connect = false) while dumping > > subscription, so, similarly, we should always dump the new option with > > default value which not to create the history table. Won't that be > > sufficient? > > Thinking out loud, so basically what we need is we need to create > subscription and set the conflict log table in catalog entry of the > subscription in pg_subscription but do not want to create the conflict > log table, so seems like we need to invent something new which set the > conflict log table in catalog but do not create the table. Currently > we have a single option that if conflict_log_table='table_name' is set > then we will create the table as well as set the table name in the > catalog, so need to think of something on the line of separating this, > or something more innovative. > This needs more thought and discussion, so it is better to separate out this part at this stage, and let's try to review the core patch first. BTW, a few days back I suggested having two options (instead of a single option conflict_log_table) to allow extending to more ways of logging the conflict data. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-09T06:36:34Z
On Tue, Dec 9, 2025 at 10:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > > > > --- > > > > > > > > > I think the conflict history table should not be transferred to the > > > > > > > > > new cluster when pg_upgrade since the table definition could be > > > > > > > > > different across major versions. > > > > > > > > > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > > > > > like subscriptions etc. > > > > > > > > > > > > > > > > > > > > > > Can we deal with different schema of tables across versions via > > > > > > > pg_dump/restore during upgrade? > > > > > > > > > > > > > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > > > > > realized that the restore is trying to create conflict log table 2 > > > > > > different places 1) As part of the regular table dump 2) As part of > > > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set. 
> > > > > > > > > > > > So one option is we can avoid dumping the conflict log tables as part > > > > > > of the regular table dump if we think that we do not need to conflict > > > > > > log table data and let it get created as part of the create > > > > > > subscription command, OTOH if we think we want to keep the conflict > > > > > > log table data, > > > > > > > > > > > > > > > > We want to retain conflict_history after upgrade. This is required for > > > > > various reasons (a) after upgrade DBA user will still require to > > > > > resolved the pending unresolved conflicts, (b) Regulations often > > > > > require keeping audit trails for a longer period of time. If a > > > > > conflict occurred at time X (which is less than the regulations > > > > > requirement) regarding a financial transaction, that record must > > > > > survive the upgrade, (c) > > > > > If something breaks after the upgrade (e.g., missing rows, constraint > > > > > violations), conflict history helps trace root causes. It shows > > > > > whether issues existed before the upgrade or were introduced during > > > > > migration, (d) as users can query the conflict_history tables, it > > > > > should be treated similar to user tables. > > > > > > > > > > BTW, we are also planning to migrate commit_ts data in thread [1] > > > > > which would be helpful for conflict_resolutions after upgrade. > > > > > > > > > > let it get dumped as part of the regular tables and in > > > > > > CREATE SUBSCRIPTION we will just set the option but do not create the > > > > > > table, > > > > > > > > > > > > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > > > > > doesn't try to create the table again. 
> > > > > > > > > > > although we might need to do special handling of this case > > > > > > because if we allow the existing tables to be set as conflict log > > > > > > tables then it may allow other user tables to be set, so need to think > > > > > > how to handle this if we need to go with this option. > > > > > > > > > > > > > > > > Yeah, probably but it should be allowed internally only not to users. > > > > > > > > Yeah I wanted to do that, but problem is with dump and restore, I mean > > > > if you just dump into a sql file and execute the sql file at that time > > > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as > > > > the table already exists because it was restored as part of the dump. > > > > I know under binary upgrade we have binary_upgrade flag so can do > > > > special handling not sure how to distinguish the sql executing as part > > > > of the restore or normal sql execution by user? > > > > > > > > > > See dumpSubscription(). We always use (connect = false) while dumping > > > subscription, so, similarly, we should always dump the new option with > > > default value which not to create the history table. Won't that be > > > sufficient? > > > > Thinking out loud, so basically what we need is we need to create > > subscription and set the conflict log table in catalog entry of the > > subscription in pg_subscription but do not want to create the conflict > > log table, so seems like we need to invent something new which set the > > conflict log table in catalog but do not create the table. Currently > > we have a single option that if conflict_log_table='table_name' is set > > then we will create the table as well as set the table name in the > > catalog, so need to think of something on the line of separating this, > > or something more innovative. > > > > This needs more thought and discussion, so it is better to separate > out this part at this stage and let's try to review the core patch > first. 
+1 BTW, I told a few days back to have two options (instead of a > single option conflict_log_table) to allow extension of more ways to > LOG the conflict data. Yeah, I will put that as well in an add on patch, once I fix all the option issues of the core patch. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-09T15:11:21Z
On Tue, Dec 9, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 9, 2025 at 10:12 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 8, 2025 at 5:15 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Dec 8, 2025 at 3:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Dec 8, 2025 at 3:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Mon, Dec 8, 2025 at 2:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > On Mon, Dec 8, 2025 at 10:25 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > On Thu, Dec 4, 2025 at 4:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > On Thu, Dec 4, 2025 at 10:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > I think the conflict history table should not be transferred to the > > > > > > > > > > new cluster when pg_upgrade since the table definition could be > > > > > > > > > > different across major versions. > > > > > > > > > > > > > > > > > > Let me think more on this with respect to behaviour of other factors > > > > > > > > > like subscriptions etc. > > > > > > > > > > > > > > > > > > > > > > > > > Can we deal with different schema of tables across versions via > > > > > > > > pg_dump/restore during upgrade? > > > > > > > > > > > > > > > > > > > > > > While handling the case of conflict_log_table option during pg_dump, I > > > > > > > realized that the restore is trying to create conflict log table 2 > > > > > > > different places 1) As part of the regular table dump 2) As part of > > > > > > > the CREATE SUBSCRIPTION when conflict_log_table option is set. 
> > > > > > > > > > > > > > So one option is we can avoid dumping the conflict log tables as part > > > > > > > of the regular table dump if we think that we do not need to conflict > > > > > > > log table data and let it get created as part of the create > > > > > > > subscription command, OTOH if we think we want to keep the conflict > > > > > > > log table data, > > > > > > > > > > > > > > > > > > > We want to retain conflict_history after upgrade. This is required for > > > > > > various reasons (a) after upgrade DBA user will still require to > > > > > > resolved the pending unresolved conflicts, (b) Regulations often > > > > > > require keeping audit trails for a longer period of time. If a > > > > > > conflict occurred at time X (which is less than the regulations > > > > > > requirement) regarding a financial transaction, that record must > > > > > > survive the upgrade, (c) > > > > > > If something breaks after the upgrade (e.g., missing rows, constraint > > > > > > violations), conflict history helps trace root causes. It shows > > > > > > whether issues existed before the upgrade or were introduced during > > > > > > migration, (d) as users can query the conflict_history tables, it > > > > > > should be treated similar to user tables. > > > > > > > > > > > > BTW, we are also planning to migrate commit_ts data in thread [1] > > > > > > which would be helpful for conflict_resolutions after upgrade. > > > > > > > > > > > > let it get dumped as part of the regular tables and in > > > > > > > CREATE SUBSCRIPTION we will just set the option but do not create the > > > > > > > table, > > > > > > > > > > > > > > > > > > > Yeah, we can turn this option during CREATE SUBSCRIPTION so that it > > > > > > doesn't try to create the table again. 
> > > > > > > > > > > > > although we might need to do special handling of this case > > > > > > > because if we allow the existing tables to be set as conflict log > > > > > > > tables then it may allow other user tables to be set, so need to think > > > > > > > how to handle this if we need to go with this option. > > > > > > > > > > > > > > > > > > > Yeah, probably but it should be allowed internally only not to users. > > > > > > > > > > Yeah I wanted to do that, but problem is with dump and restore, I mean > > > > > if you just dump into a sql file and execute the sql file at that time > > > > > the CREATE SUBSCRIPTION with conflict_log_table option will fail as > > > > > the table already exists because it was restored as part of the dump. > > > > > I know under binary upgrade we have binary_upgrade flag so can do > > > > > special handling not sure how to distinguish the sql executing as part > > > > > of the restore or normal sql execution by user? > > > > > > > > > > > > > See dumpSubscription(). We always use (connect = false) while dumping > > > > subscription, so, similarly, we should always dump the new option with > > > > default value which not to create the history table. Won't that be > > > > sufficient? > > > > > > Thinking out loud, so basically what we need is we need to create > > > subscription and set the conflict log table in catalog entry of the > > > subscription in pg_subscription but do not want to create the conflict > > > log table, so seems like we need to invent something new which set the > > > conflict log table in catalog but do not create the table. Currently > > > we have a single option that if conflict_log_table='table_name' is set > > > then we will create the table as well as set the table name in the > > > catalog, so need to think of something on the line of separating this, > > > or something more innovative. 
> > > > > > > This needs more thought and discussion, so it is better to separate > > out this part at this stage and let's try to review the core patch > > first. > > +1 > > BTW, I told a few days back to have two options (instead of a > > single option conflict_log_table) to allow extension of more ways to > > LOG the conflict data. > > Yeah, I will put that as well in an add on patch, once I fix all the > option issues of the core patch. >

Here is the updated version of the patch.

What has changed:
1. The table is created using create_heap_with_catalog() instead of SPI, as suggested by Sawada-San and Amit Kapila.
2. The table schema is validated after acquiring the lock and before preparing/inserting conflict tuples, for the defects raised by Vignesh.
3. Bug fixes for the issues raised by Shveta (segfault).
4. Comments from Peter (except exposing the namespace in \dRs+, which is still pending).

What's not done/pending:
1. Adding key_tuple/RI as pointed out by Shveta - will do in the next version.
2. Adding a dependency of the subscription on the table so that the table cannot be dropped - I think when we put the dependency on shared objects those cannot be dropped even with the CASCADE option, but I am still exploring this.
3. dump/restore and upgrade: I have a partially working patch, but I still need to figure out how to skip table creation while creating the subscription. While discussing off-list with Hannu, he suggested we can do something with dump dependency ordering, e.g. dump CREATE SUBSCRIPTION first and then dump the CLT data without actually dumping the CLT definition; with that, the table will be created while creating the subscription and the data will then be restored with the COPY command. I will explore this further.
4. Test case for conflict insertion
5. Documentation patch

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-11T11:34:19Z
On Tue, Dec 9, 2025 at 8:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > Here is the updated version of patch > What has changed > 1. Table is created using create_heap_with_catalog() instead of SPI as > suggested by Sawada-San and Amit Kapila. > 2. Validated the table schema after acquiring the lock before > preparing/inserting conflict tuples for defects raised by Vignesh. > 3. Bug fixes raised by Shweta (segfault) > 3. Comments from Peter (except exposing namespace in \dRs+, it's still pending. > Thanks for the patch. I tested all conflict-types on this version, they (basic scenarios) seem to work well. Except only that key-RI pending issue, other issues seem to be addressed. I will start with code-review now. Few observations: 1) \dRs+ shows 'Conflict log table' without namespace, this could be confusing if the same table exists in multiple schemas. 2) When we do below: alter subscription sub1 SET (conflict_log_table=clt2); the previous conflict log table is dropped. Is this behavior intentional and discussed/concluded earlier? It’s possible that a user may want to create a new conflict log table for future events while still retaining the old one for analysis. If the subscription itself is dropped, then dropping the CLT makes sense, but I’m not sure this behavior is intended for ALTER SUBSCRIPTION. I do understand that once we unlink CLT from subscription, later even DROP subscription cannot drop it, but user can always drop it when not needed. If we plan to keep existing behavior, it should be clearly documented in a CAUTION section, and the command should explicitly log the table drop. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-11T11:40:19Z
On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Dec 9, 2025 at 8:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > Here is the updated version of patch > > What has changed > > 1. Table is created using create_heap_with_catalog() instead of SPI as > > suggested by Sawada-San and Amit Kapila. > > 2. Validated the table schema after acquiring the lock before > > preparing/inserting conflict tuples for defects raised by Vignesh. > > 3. Bug fixes raised by Shweta (segfault) > > 3. Comments from Peter (except exposing namespace in \dRs+, it's still pending. > > > > Thanks for the patch. > I tested all conflict-types on this version, they (basic scenarios) > seem to work well. Except only that key-RI pending issue, other issues > seem to be addressed. I will start with code-review now. > > Few observations: > > 1) > \dRs+ shows 'Conflict log table' without namespace, this could be > confusing if the same table exists in multiple schemas. Yeah this is not yet fixed comments, will fix in next version. > 2) > When we do below: > alter subscription sub1 SET (conflict_log_table=clt2); > > the previous conflict log table is dropped. Is this behavior > intentional and discussed/concluded earlier? It’s possible that a user > may want to create a new conflict log table for future events while > still retaining the old one for analysis. If the subscription itself > is dropped, then dropping the CLT makes sense, but I’m not sure this > behavior is intended for ALTER SUBSCRIPTION. I do understand that > once we unlink CLT from subscription, later even DROP subscription > cannot drop it, but user can always drop it when not needed. > > If we plan to keep existing behavior, it should be clearly documented > in a CAUTION section, and the command should explicitly log the table > drop. 
Yeah, we discussed this behavior, and the conclusion was that we would document it: it is the user's responsibility to take the necessary backup of the conflict log table data if they are setting a new log table or NONE for the subscription. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-11T12:26:59Z
On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > 2) > > When we do below: > > alter subscription sub1 SET (conflict_log_table=clt2); > > > > the previous conflict log table is dropped. Is this behavior > > intentional and discussed/concluded earlier? It’s possible that a user > > may want to create a new conflict log table for future events while > > still retaining the old one for analysis. If the subscription itself > > is dropped, then dropping the CLT makes sense, but I’m not sure this > > behavior is intended for ALTER SUBSCRIPTION. I do understand that > > once we unlink CLT from subscription, later even DROP subscription > > cannot drop it, but user can always drop it when not needed. > > > > If we plan to keep existing behavior, it should be clearly documented > > in a CAUTION section, and the command should explicitly log the table > > drop. > > Yeah we discussed this behavior and the conclusion was we would > document this behavior and its user's responsibility to take necessary > backup of the conflict log table data if they are setting a new log > table or NONE for the subscription. > +1. If we don't do this then it will be difficult to track for postgres or users the previous conflict history tables. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-11T14:19:29Z
On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > 2) > > > When we do below: > > > alter subscription sub1 SET (conflict_log_table=clt2); > > > > > > the previous conflict log table is dropped. Is this behavior > > > intentional and discussed/concluded earlier? It’s possible that a user > > > may want to create a new conflict log table for future events while > > > still retaining the old one for analysis. If the subscription itself > > > is dropped, then dropping the CLT makes sense, but I’m not sure this > > > behavior is intended for ALTER SUBSCRIPTION. I do understand that > > > once we unlink CLT from subscription, later even DROP subscription > > > cannot drop it, but user can always drop it when not needed. > > > > > > If we plan to keep existing behavior, it should be clearly documented > > > in a CAUTION section, and the command should explicitly log the table > > > drop. > > > > Yeah we discussed this behavior and the conclusion was we would > > document this behavior and its user's responsibility to take necessary > > backup of the conflict log table data if they are setting a new log > > table or NONE for the subscription. > > > > +1. If we don't do this then it will be difficult to track for > postgres or users the previous conflict history tables.

Right, it makes sense.

The attached patch fixes most of the open comments:
1) \dRs+ now shows the schema-qualified name.
2) Both key_tuple and the replica_identity tuple are now added to the conflict log tuple wherever applicable.
3) Refactored the code so that the conflict log table schema is defined only once, in the header file, and both create_conflict_log_table and ValidateConflictLogTable use it.

I was considering the interdependence between the subscription and the conflict log table (CLT). IMHO, it would be logical to establish the subscription as dependent on the CLT. This way, if someone attempts to drop the CLT, the system would recognize the dependency of the subscription and prevent the drop unless the subscription is removed first or the CASCADE option is used.

However, while investigating this, I encountered an error [1] stating that global objects are not supported in this context. This indicates that global objects cannot be made dependent on local objects. Although making an object dependent on global/shared objects is possible for certain types of shared objects [2], this is not our main objective.

We do not need to make the CLT dependent on the subscription, because the table can be dropped when the subscription is dropped anyway, and we are already doing that as part of DROP SUBSCRIPTION as well as ALTER SUBSCRIPTION when the CLT is set to NONE or to a different table. Therefore, extending the functionality of shared dependencies is unnecessary for this purpose.

Thoughts?

[1]
doDeletion()
{
    ....
        /*
         * These global object types are not supported here.
         */
        case AuthIdRelationId:
        case DatabaseRelationId:
        case TableSpaceRelationId:
        case SubscriptionRelationId:
        case ParameterAclRelationId:
            elog(ERROR, "global objects cannot be deleted by doDeletion");
            break;
}

[2]
typedef enum SharedDependencyType
{
    SHARED_DEPENDENCY_OWNER = 'o',
    SHARED_DEPENDENCY_ACL = 'a',
    SHARED_DEPENDENCY_INITACL = 'i',
    SHARED_DEPENDENCY_POLICY = 'r',
    SHARED_DEPENDENCY_TABLESPACE = 't',
    SHARED_DEPENDENCY_INVALID = 0,
} SharedDependencyType;

Pending items are:
1. Handling dump/upgrade
2. Test case for conflict insertion
3. Documentation patch

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-12T03:49:01Z
On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > 2) > > > > When we do below: > > > > alter subscription sub1 SET (conflict_log_table=clt2); > > > > > > > > the previous conflict log table is dropped. Is this behavior > > > > intentional and discussed/concluded earlier? It’s possible that a user > > > > may want to create a new conflict log table for future events while > > > > still retaining the old one for analysis. If the subscription itself > > > > is dropped, then dropping the CLT makes sense, but I’m not sure this > > > > behavior is intended for ALTER SUBSCRIPTION. I do understand that > > > > once we unlink CLT from subscription, later even DROP subscription > > > > cannot drop it, but user can always drop it when not needed. > > > > > > > > If we plan to keep existing behavior, it should be clearly documented > > > > in a CAUTION section, and the command should explicitly log the table > > > > drop. > > > > > > Yeah we discussed this behavior and the conclusion was we would > > > document this behavior and its user's responsibility to take necessary > > > backup of the conflict log table data if they are setting a new log > > > table or NONE for the subscription. > > > > > > > +1. If we don't do this then it will be difficult to track for > > postgres or users the previous conflict history tables. > > Right, it makes sense. Okay, right. 
> > Attached patch fixed most of the open comments > 1) \dRs+ now show the schema qualified name > 2) Now key_tuple and replica_identify tuple both are add in conflict > log tuple wherever applicable > 3) Refactored the code so that we can define the conflict log table > schema only once in the header file and both create_conflict_log_table > and ValidateConflictLogTable use it. > > I was considering the interdependence between the subscription and the > conflict log table (CLT). IMHO, it would be logical to establish the > subscription as dependent on the CLT. This way, if someone attempts to > drop the CLT, the system would recognize the dependency of the > subscription and prevent the drop unless the subscription is removed > first or the CASCADE option is used. > > However, while investigating this, I encountered an error [1] stating > that global objects are not supported in this context. This indicates > that global objects cannot be made dependent on local objects. > Although making an object dependent on global/shared objects is > possible for certain types of shared objects [2], this is not our main > objective. > > We do not need to make the CLT dependent on the subscription because > the table can be dropped when the subscription is dropped anyway and > we are already doing it as part of drop subscription as well as alter > subscription when CLT is set to NONE or a different table. Therefore, > extending the functionality of shared dependency is unnecessary for > this purpose. > > Thoughts? I believe the recommendation to create a dependency was meant to prevent the table from being accidentally dropped during a DROP SCHEMA or DROP TABLE operation. That risk still remains, regardless of the fact that dropping or altering a subscription will result in the table removal. I will give this more thought and let you know if anything comes to mind. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-12T04:11:58Z
On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > We do not need to make the CLT dependent on the subscription because > > the table can be dropped when the subscription is dropped anyway and > > we are already doing it as part of drop subscription as well as alter > > subscription when CLT is set to NONE or a different table. Therefore, > > extending the functionality of shared dependency is unnecessary for > > this purpose. > > > > Thoughts? > > I believe the recommendation to create a dependency was meant to > prevent the table from being accidentally dropped during a DROP SCHEMA > or DROP TABLE operation. That risk still remains, regardless of the > fact that dropping or altering a subscription will result in the table > removal. I will give this more thought and let you know if anything > comes to mind.

I mean, we can register the dependency of the subscription on the table, and that will prevent dropping the table via DROP TABLE/DROP SCHEMA, but what I do not like is the internal error [1] in doDeletion() when someone tries to run DROP TABLE CLT CASCADE;

I suggest an alternative approach for handling this: implement a check within the ALTER/DROP TABLE commands. If the table is a CLT (using IsConflictLogTable() to verify), these operations should be disallowed. This would enhance the robustness of CLT handling by entirely preventing external drop/alter actions. What are your thoughts on this solution? And let's also see what Amit and Sawada-san think about this solution.

[1]
doDeletion()
{
    ....
        /*
         * These global object types are not supported here.
         */
        case AuthIdRelationId:
        case DatabaseRelationId:
        case TableSpaceRelationId:
        case SubscriptionRelationId:
        case ParameterAclRelationId:
            elog(ERROR, "global objects cannot be deleted by doDeletion");
            break;
}

--
Regards,
Dilip Kumar
Google
-
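A minimal sketch of the guard proposed above, written as a standalone toy and not as PostgreSQL code. Every name in it (ToySubscription, toy_clt_owner, toy_check_table_ddl_allowed) is hypothetical, the error wording is only illustrative, and the way the real IsConflictLogTable() finds the owning subscription is not described in this thread. The idea is simply that the DROP/ALTER TABLE path asks "is this relation some subscription's CLT?" up front and refuses with a user-facing message, so the dependency machinery and the internal error in doDeletion() are never reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned int Oid;           /* stand-in for PostgreSQL's Oid */
#define TOY_INVALID_OID ((Oid) 0)

/* Hypothetical, simplified view of a subscription and its CLT. */
typedef struct ToySubscription
{
    const char *subname;
    Oid         clt_relid;          /* TOY_INVALID_OID when no CLT is set */
} ToySubscription;

/*
 * Return the name of the subscription that uses relid as its conflict
 * log table, or NULL if no subscription does.
 */
static const char *
toy_clt_owner(const ToySubscription *subs, size_t nsubs, Oid relid)
{
    if (relid == TOY_INVALID_OID)
        return NULL;

    for (size_t i = 0; i < nsubs; i++)
    {
        if (subs[i].clt_relid == relid)
            return subs[i].subname;
    }
    return NULL;
}

/*
 * Guard to run at the start of a DROP/ALTER TABLE path. Returns true if
 * the command may proceed; otherwise fills errbuf with a user-facing
 * message and returns false.
 */
static bool
toy_check_table_ddl_allowed(const ToySubscription *subs, size_t nsubs,
                            Oid relid, char *errbuf, size_t errlen)
{
    const char *owner;

    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    owner = toy_clt_owner(subs, nsubs, relid);
    if (owner == NULL)
        return true;                /* ordinary table, nothing to block */

    snprintf(errbuf, errlen,
             "cannot drop or alter conflict log table: "
             "subscription \"%s\" depends on it", owner);
    return false;
}
```

In this toy, a caller would pass the relation's OID before doing any work and report errbuf when the function returns false. How such a check should interact with DROP SCHEMA … CASCADE is a separate question that the sketch does not settle.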
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-12T04:32:18Z
On Fri, Dec 12, 2025 at 9:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > We do not need to make the CLT dependent on the subscription because > > > the table can be dropped when the subscription is dropped anyway and > > > we are already doing it as part of drop subscription as well as alter > > > subscription when CLT is set to NONE or a different table. Therefore, > > > extending the functionality of shared dependency is unnecessary for > > > this purpose. > > > > > > Thoughts? > > > > I believe the recommendation to create a dependency was meant to > > prevent the table from being accidentally dropped during a DROP SCHEMA > > or DROP TABLE operation. That risk still remains, regardless of the > > fact that dropping or altering a subscription will result in the table > > removal. I will give this more thought and let you know if anything > > comes to mind. > > I mean we can register the dependency of subscriber on table and that > will prevent dropping the tables via DROP TABLE/DROP SCHEMA, but what > I do not like is the internal error[1] in doDeletion() when someone > will try to DROP TABLE CLT CASCADE; > Yes, I understand that part. > I suggest an alternative approach for handling this: implement a check > within the ALTER/DROP table commands. If the table is a CLT (using > IsConflictLogTable() to verify), these operations should be > disallowed. This would enhance the robustness of CLT handling by > entirely preventing external drop/alter actions. What are your > thoughts on this solution? And let's also see what Amit and Sawada-san > think about this solution. I had similar thoughts, but was unsure how this should behave when a user runs DROP SCHEMA … CASCADE. 
We can’t simply block the entire operation with an error just because the schema contains a CLT, but we also shouldn’t allow it to proceed without notifying the user that the schema includes a CLT. > > [1] > doDeletion() > { > .... > /* > * These global object types are not supported here. > */ > case AuthIdRelationId: > case DatabaseRelationId: > case TableSpaceRelationId: > case SubscriptionRelationId: > case ParameterAclRelationId: > elog(ERROR, "global objects cannot be deleted by doDeletion"); > break; > } > > -- > Regards, > Dilip Kumar > Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-12T04:58:52Z
On Fri, Dec 12, 2025 at 10:02 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Dec 12, 2025 at 9:42 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Dec 12, 2025 at 9:19 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > We do not need to make the CLT dependent on the subscription because > > > > the table can be dropped when the subscription is dropped anyway and > > > > we are already doing it as part of drop subscription as well as alter > > > > subscription when CLT is set to NONE or a different table. Therefore, > > > > extending the functionality of shared dependency is unnecessary for > > > > this purpose. > > > > > > > > Thoughts? > > > > > > I believe the recommendation to create a dependency was meant to > > > prevent the table from being accidentally dropped during a DROP SCHEMA > > > or DROP TABLE operation. That risk still remains, regardless of the > > > fact that dropping or altering a subscription will result in the table > > > removal. I will give this more thought and let you know if anything > > > comes to mind. > > > > I mean we can register the dependency of subscriber on table and that > > will prevent dropping the tables via DROP TABLE/DROP SCHEMA, but what > > I do not like is the internal error[1] in doDeletion() when someone > > will try to DROP TABLE CLT CASCADE; > > > > Yes, I understand that part. > > > I suggest an alternative approach for handling this: implement a check > > within the ALTER/DROP table commands. If the table is a CLT (using > > IsConflictLogTable() to verify), these operations should be > > disallowed. This would enhance the robustness of CLT handling by > > entirely preventing external drop/alter actions. What are your > > thoughts on this solution? And let's also see what Amit and Sawada-san > > think about this solution. 
> > I had similar thoughts, but was unsure how this should behave when a > user runs DROP SCHEMA … CASCADE. We can’t simply block the entire > operation with an error just because the schema contains a CLT, but we > also shouldn’t allow it to proceed without notifying the user that the > schema includes a CLT.

I understand your concern about whether this restriction is appropriate, particularly for DROP SCHEMA … CASCADE. However, considering the logical dependency where the subscription relies on the table (CLT), expecting DROP SCHEMA … CASCADE to drop the CLT implies it should also drop the dependent subscription, which is not permitted. Therefore, a more appropriate behavior would be to issue an error message stating that the table is a conflict log table and that subscription "<subname>" depends on it. This message should instruct the user to either drop the subscription or reset the conflict log table before proceeding with the drop operation.

OTOH, we can simply let the CLT be dropped or altered and document this behavior, so that it is the user's responsibility not to drop/alter the CLT; otherwise conflict logging will be skipped, as happens now. Thinking about it more, I feel it might be better to keep it simple, as we have it now, instead of overcomplicating it?

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-12T05:57:18Z
On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > 2) > > > > When we do below: > > > > alter subscription sub1 SET (conflict_log_table=clt2); > > > > > > > > the previous conflict log table is dropped. Is this behavior > > > > intentional and discussed/concluded earlier? It’s possible that a user > > > > may want to create a new conflict log table for future events while > > > > still retaining the old one for analysis. If the subscription itself > > > > is dropped, then dropping the CLT makes sense, but I’m not sure this > > > > behavior is intended for ALTER SUBSCRIPTION. I do understand that > > > > once we unlink CLT from subscription, later even DROP subscription > > > > cannot drop it, but user can always drop it when not needed. > > > > > > > > If we plan to keep existing behavior, it should be clearly documented > > > > in a CAUTION section, and the command should explicitly log the table > > > > drop. > > > > > > Yeah we discussed this behavior and the conclusion was we would > > > document this behavior and its user's responsibility to take necessary > > > backup of the conflict log table data if they are setting a new log > > > table or NONE for the subscription. > > > > > > > +1. If we don't do this then it will be difficult to track for > > postgres or users the previous conflict history tables. > > Right, it makes sense. 
> > Attached patch fixed most of the open comments > 1) \dRs+ now show the schema qualified name > 2) Now key_tuple and replica_identify tuple both are add in conflict > log tuple wherever applicable > 3) Refactored the code so that we can define the conflict log table > schema only once in the header file and both create_conflict_log_table > and ValidateConflictLogTable use it. > > I was considering the interdependence between the subscription and the > conflict log table (CLT). IMHO, it would be logical to establish the > subscription as dependent on the CLT. This way, if someone attempts to > drop the CLT, the system would recognize the dependency of the > subscription and prevent the drop unless the subscription is removed > first or the CASCADE option is used. > > However, while investigating this, I encountered an error [1] stating > that global objects are not supported in this context. This indicates > that global objects cannot be made dependent on local objects. > Although making an object dependent on global/shared objects is > possible for certain types of shared objects [2], this is not our main > objective. > > We do not need to make the CLT dependent on the subscription because > the table can be dropped when the subscription is dropped anyway and > we are already doing it as part of drop subscription as well as alter > subscription when CLT is set to NONE or a different table. Therefore, > extending the functionality of shared dependency is unnecessary for > this purpose. I noticed an inconsistency in the checks that prevent adding a conflict log table to a publication. 
At creation time, we explicitly reject attempts to publish a conflict log table:

    /* Can't be conflict log table */
    if (IsConflictLogTable(RelationGetRelid(targetrel)))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("cannot add relation \"%s.%s\" to publication",
                        get_namespace_name(RelationGetNamespace(targetrel)),
                        RelationGetRelationName(targetrel)),
                 errdetail("This operation is not supported for conflict log tables.")));

However, the restriction can be bypassed through a sequence of table renames like below:

    -- Set up logical replication
    CREATE PUBLICATION pub_all;
    CREATE SUBSCRIPTION sub1 CONNECTION '...' PUBLICATION pub_all WITH (conflict_log_table = 'conflict');

    -- Rename the conflict log table
    ALTER TABLE conflict RENAME TO conflict1;

    -- Now this succeeds:
    CREATE PUBLICATION pub1 FOR TABLE conflict1;

    -- Rename it back
    ALTER TABLE conflict1 RENAME TO conflict;

    \dRp+ pub1
    Publication pub1
    ...
    Tables:
        public.conflict

Thus, although we prohibit publishing the conflict log table directly, a publication can still end up referencing it through renaming. This is inconsistent with the invariant the code attempts to enforce.

Should we extend the checks to handle renames so that a conflict log table can never end up in a publication? Alternatively, should the creation-time restriction be relaxed if this case is acceptable? If the invariant should be enforced, should we also prevent renaming a conflict-log table into a published table's name?

Thoughts?

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-12T09:33:47Z
On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I was considering the interdependence between the subscription and the
> conflict log table (CLT). IMHO, it would be logical to establish the
> subscription as dependent on the CLT. This way, if someone attempts to
> drop the CLT, the system would recognize the dependency of the
> subscription and prevent the drop unless the subscription is removed
> first or the CASCADE option is used.
>
> However, while investigating this, I encountered an error [1] stating
> that global objects are not supported in this context. This indicates
> that global objects cannot be made dependent on local objects.
>

What we need here is an equivalent of DEPENDENCY_INTERNAL for database objects. For example, consider the following case:

postgres=# create table t1(c1 int primary key);
CREATE TABLE
postgres=# \d+ t1
                                           Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 c1     | integer |           | not null |         | plain   |             |              |
Indexes:
    "t1_pkey" PRIMARY KEY, btree (c1)
Publications:
    "pub1"
Not-null constraints:
    "t1_c1_not_null" NOT NULL "c1"
Access method: heap
postgres=# drop index t1_pkey;
ERROR: cannot drop index t1_pkey because constraint t1_pkey on table t1 requires it
HINT: You can drop constraint t1_pkey on table t1 instead.

Here, the PK index is created as part of the CREATE TABLE operation, and the PK index is not allowed to be dropped independently.

> Although making an object dependent on global/shared objects is
> possible for certain types of shared objects [2], this is not our main
> objective.
>

As per my understanding from the above example, we need something like that, only for the shared object subscription and the (internally created) table.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-12T10:03:29Z
On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I was considering the interdependence between the subscription and the > > conflict log table (CLT). IMHO, it would be logical to establish the > > subscription as dependent on the CLT. This way, if someone attempts to > > drop the CLT, the system would recognize the dependency of the > > subscription and prevent the drop unless the subscription is removed > > first or the CASCADE option is used. > > > > However, while investigating this, I encountered an error [1] stating > > that global objects are not supported in this context. This indicates > > that global objects cannot be made dependent on local objects. > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > objects. For example, consider following case: > postgres=# create table t1(c1 int primary key); > CREATE TABLE > postgres=# \d+ t1 > Table "public.t1" > Column | Type | Collation | Nullable | Default | Storage | > Compression | Stats target | Description > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > c1 | integer | | not null | | plain | > | | > Indexes: > "t1_pkey" PRIMARY KEY, btree (c1) > Publications: > "pub1" > Not-null constraints: > "t1_c1_not_null" NOT NULL "c1" > Access method: heap > postgres=# drop index t1_pkey; > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > t1 requires it > HINT: You can drop constraint t1_pkey on table t1 instead. > > Here, the PK index is created as part for CREATE TABLE operation and > pk_index is not allowed to be dropped independently. > > > Although making an object dependent on global/shared objects is > > possible for certain types of shared objects [2], this is not our main > > objective. 
> >
>
> As per my understanding from the above example, we need something like
> that only for shared object subscription and (internally created)
> table.
>

+1

~~

Few comments for v11:

1)
+#include "executor/spi.h"
+#include "replication/conflict.h"
+#include "utils/fmgroids.h"
+#include "utils/regproc.h"

subscriptioncmds.c compiles without the above inclusions.

2)
postgres=# create subscription sub3 connection '...' publication pub1 WITH(conflict_log_table='pg_temp.clt');
NOTICE: created replication slot "sub3" on publisher
CREATE SUBSCRIPTION

Should we restrict clt creation in pg_temp?

3)
+ /* Fetch the eixsting conflict table table information. */

typos: eixsting->existing, table table -> table

4)
AlterSubscription():
+ values[Anum_pg_subscription_subconflictlognspid - 1] =
+     ObjectIdGetDatum(nspid);
+
+ if (relname != NULL)
+     values[Anum_pg_subscription_subconflictlogtable - 1] =
+         CStringGetTextDatum(relname);
+ else
+     nulls[Anum_pg_subscription_subconflictlogtable - 1] =
+         true;

Should we move the nspid setting inside 'if(relname != NULL)' block?

5)
Is there a way to reset/remove conflict_log_table? I did not see any such handling in AlterSubscription? It gives error:

postgres=# alter subscription sub3 set (conflict_log_table='');
ERROR: invalid name syntax

6)
+char *
+get_subscription_conflict_log_table(Oid subid, Oid *nspid)
+{
+    HeapTuple tup;
+    Datum datum;
+    bool isnull;
+    char *relname = NULL;
+    Form_pg_subscription subform;
+
+    *nspid = InvalidOid;
+
+    tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid));
+
+    if (!HeapTupleIsValid(tup))
+        return NULL;

Should we have elog(ERROR) here for cache lookup failure? Callers like AlterSubscription, DropSubscription lock the sub entry, so it being missing at this stage is not normal. I have not seen all the callers though.

7)
+#include "access/htup.h"
+#include "access/skey.h"
+#include "access/table.h"
+#include "catalog/pg_attribute.h"
+#include "catalog/indexing.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_namespace.h"
+#include "catalog/pg_type.h"
+#include "executor/spi.h"
+#include "utils/array.h"

conflict.c compiles without above inclusions.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T10:21:40Z
On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I was considering the interdependence between the subscription and the > > conflict log table (CLT). IMHO, it would be logical to establish the > > subscription as dependent on the CLT. This way, if someone attempts to > > drop the CLT, the system would recognize the dependency of the > > subscription and prevent the drop unless the subscription is removed > > first or the CASCADE option is used. > > > > However, while investigating this, I encountered an error [1] stating > > that global objects are not supported in this context. This indicates > > that global objects cannot be made dependent on local objects. > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > objects. For example, consider following case: > postgres=# create table t1(c1 int primary key); > CREATE TABLE > postgres=# \d+ t1 > Table "public.t1" > Column | Type | Collation | Nullable | Default | Storage | > Compression | Stats target | Description > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > c1 | integer | | not null | | plain | > | | > Indexes: > "t1_pkey" PRIMARY KEY, btree (c1) > Publications: > "pub1" > Not-null constraints: > "t1_c1_not_null" NOT NULL "c1" > Access method: heap > postgres=# drop index t1_pkey; > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > t1 requires it > HINT: You can drop constraint t1_pkey on table t1 instead. > > Here, the PK index is created as part for CREATE TABLE operation and > pk_index is not allowed to be dropped independently. > > > Although making an object dependent on global/shared objects is > > possible for certain types of shared objects [2], this is not our main > > objective. 
> >
>
> As per my understanding from the above example, we need something like
> that only for shared object subscription and (internally created)
> table.

Yeah, that seems to be exactly what we want, so I tried doing that by recording a DEPENDENCY_INTERNAL dependency of the CLT on the subscription [1], and it is behaving as we want [2]. And while dropping the subscription or altering the CLT we can delete the internal dependency so that the CLT gets dropped automatically [3].

I will send an updated patch after testing a few more scenarios and fixing other pending issues.

[1]
+ ObjectAddressSet(myself, RelationRelationId, relid);
+ ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
+ recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);

[2]
postgres[670778]=# DROP TABLE myschema.conflict_log_history2;
ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 because subscription sub requires it
HINT: You can drop subscription sub instead.
LOCATION: findDependentObjects, dependency.c:788
postgres[670778]=#

[3]
ObjectAddressSet(object, SubscriptionRelationId, subid);
performDeletion(&object, DROP_CASCADE,
                PERFORM_DELETION_INTERNAL |
                PERFORM_DELETION_SKIP_ORIGINAL);

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T15:46:30Z
On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I was considering the interdependence between the subscription and the > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > subscription as dependent on the CLT. This way, if someone attempts to > > > drop the CLT, the system would recognize the dependency of the > > > subscription and prevent the drop unless the subscription is removed > > > first or the CASCADE option is used. > > > > > > However, while investigating this, I encountered an error [1] stating > > > that global objects are not supported in this context. This indicates > > > that global objects cannot be made dependent on local objects. > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > objects. For example, consider following case: > > postgres=# create table t1(c1 int primary key); > > CREATE TABLE > > postgres=# \d+ t1 > > Table "public.t1" > > Column | Type | Collation | Nullable | Default | Storage | > > Compression | Stats target | Description > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > c1 | integer | | not null | | plain | > > | | > > Indexes: > > "t1_pkey" PRIMARY KEY, btree (c1) > > Publications: > > "pub1" > > Not-null constraints: > > "t1_c1_not_null" NOT NULL "c1" > > Access method: heap > > postgres=# drop index t1_pkey; > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > t1 requires it > > HINT: You can drop constraint t1_pkey on table t1 instead. > > > > Here, the PK index is created as part for CREATE TABLE operation and > > pk_index is not allowed to be dropped independently. 
> > > > > Although making an object dependent on global/shared objects is > > > possible for certain types of shared objects [2], this is not our main > > > objective. > > > > > > > As per my understanding from the above example, we need something like > > that only for shared object subscription and (internally created) > > table. > > Yeah that seems to be exactly what we want, so I tried doing that by > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > it is behaving as we want[2]. And while dropping the subscription or > altering CLT we can delete internal dependency so that CLT get dropped > automatically[3] > > I will send an updated patch after testing a few more scenarios and > fixing other pending issues. > > [1] > + ObjectAddressSet(myself, RelationRelationId, relid); > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > [2] > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > because subscription sub requires it > HINT: You can drop subscription sub instead. > LOCATION: findDependentObjects, dependency.c:788 > postgres[670778]=# > > [3] > ObjectAddressSet(object, SubscriptionRelationId, subid); > performDeletion(&object, DROP_CASCADE > PERFORM_DELETION_INTERNAL | > PERFORM_DELETION_SKIP_ORIGINAL); > > Here is the patch which implements the dependency and fixes other comments from Shveta. -- Regards, Dilip Kumar Google
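
Editorial sketch, not taken from the patch: the behavior shown in [2] and [3] above can be modeled in a few lines of standalone C. Everything here is a toy stand-in. The `DepEntry` array plays the role of pg_depend rows, and the function names `internal_owner`, `can_drop_directly` and `count_internal_dependents` are hypothetical, not PostgreSQL APIs. It only illustrates the rule that an object on the dependent side of an INTERNAL entry cannot be dropped on its own, while dropping the owner takes the dependent with it.

```c
#include <stddef.h>
#include <string.h>

/* Toy dependency kinds; only the naming mirrors PostgreSQL's DependencyType. */
typedef enum { DEP_NORMAL, DEP_INTERNAL } DepKind;

typedef struct
{
    const char *dependent;   /* object that depends, e.g. the CLT */
    const char *referenced;  /* object it depends on, e.g. the subscription */
    DepKind     kind;
} DepEntry;

/* Hypothetical in-memory stand-in for the pg_depend rows recorded in [1]. */
static const DepEntry deps[] = {
    {"myschema.conflict_log_history2", "sub", DEP_INTERNAL},
};
#define NDEPS ((int) (sizeof(deps) / sizeof(deps[0])))

/*
 * Return the owner of "name" if it is the dependent side of an INTERNAL
 * entry, or NULL if no such entry exists.
 */
static const char *
internal_owner(const char *name)
{
    for (int i = 0; i < NDEPS; i++)
        if (deps[i].kind == DEP_INTERNAL &&
            strcmp(deps[i].dependent, name) == 0)
            return deps[i].referenced;
    return NULL;
}

/* Return 1 if a direct DROP of "name" is allowed, 0 if it must be refused. */
static int
can_drop_directly(const char *name)
{
    return internal_owner(name) == NULL;
}

/* Count the objects that would be removed together with "owner". */
static int
count_internal_dependents(const char *owner)
{
    int count = 0;

    for (int i = 0; i < NDEPS; i++)
        if (deps[i].kind == DEP_INTERNAL &&
            strcmp(deps[i].referenced, owner) == 0)
            count++;
    return count;
}
```

In this model, `can_drop_directly("myschema.conflict_log_history2")` is false and `internal_owner()` names `sub`, which corresponds to the ERROR/HINT pair in [2]; `count_internal_dependents("sub")` finds the one table that the deletion in [3] would remove.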
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-14T15:50:24Z
On Fri, Dec 12, 2025 at 3:33 PM shveta malik <shveta.malik@gmail.com> wrote: > > > Few comments for v11: > > 1) > +#include "executor/spi.h" > +#include "replication/conflict.h" > +#include "utils/fmgroids.h" > +#include "utils/regproc.h" > > subscriptioncmds.c compiles without the above inclusions. I think we need utils/regproc.h for "stringToQualifiedNameList()" > 2) > postgres=# create subscription sub3 connection '...' publication pub1 > WITH(conflict_log_table='pg_temp.clt'); > NOTICE: created replication slot "sub3" on publisher > CREATE SUBSCRIPTION > > Should we restrict clt creation in pg_temp? Done and added a test. > 3) > + /* Fetch the eixsting conflict table table information. */ > > typos: eixsting->existing, > table table -> table Fixed > 4) > AlterSubscription(): > + values[Anum_pg_subscription_subconflictlognspid - 1] = > + ObjectIdGetDatum(nspid); > + > + if (relname != NULL) > + values[Anum_pg_subscription_subconflictlogtable - 1] = > + CStringGetTextDatum(relname); > + else > + nulls[Anum_pg_subscription_subconflictlogtable - 1] = > + true; > > Should we move the nspid setting inside 'if(relname != NULL)' block? Since subconflictlognspid is part of the fixed size structure so we will always have to set it so I prefer it to keep it out. > 5) > Is there a way to reset/remove conflict_log_table? I did not see any > such handling in AlterSubscription? It gives error: > > postgres=# alter subscription sub3 set (conflict_log_table=''); > ERROR: invalid name syntax Fixed and added a test case > 6) > +char * > +get_subscription_conflict_log_table(Oid subid, Oid *nspid) > +{ > + HeapTuple tup; > + Datum datum; > + bool isnull; > + char *relname = NULL; > + Form_pg_subscription subform; > + > + *nspid = InvalidOid; > + > + tup = SearchSysCache1(SUBSCRIPTIONOID, ObjectIdGetDatum(subid)); > + > + if (!HeapTupleIsValid(tup)) > + return NULL; > > Should we have elog(ERROR) here for cache lookup failure? 
Callers like > AlterSubscription, DropSubscription lock the sub entry, so it being > missing at this stage is not normal. I have not seen all the callers > though. Yeah we can do that. > 7) > +#include "access/htup.h" > +#include "access/skey.h" > > +#include "access/table.h" > +#include "catalog/pg_attribute.h" > +#include "catalog/indexing.h" > +#include "catalog/namespace.h" > +#include "catalog/pg_namespace.h" > +#include "catalog/pg_type.h" > > +#include "executor/spi.h" > +#include "utils/array.h" > > conflict.c compiles without above inclusions. Done -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-15T08:45:58Z
On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

Thanks for the patch. Few comments:

1)
+ if (isTempNamespace(namespaceId))
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("cannot create conflict log table \"%s\" in a temporary namespace",
+                     conflictrel),
+              errhint("Use a permanent schema.")));

a) Shall we use 'temporary schema' instead of 'temporary namespace'? See other similar errors:
errmsg("cannot move objects into or out of temporary schemas")
errmsg("cannot create relations in temporary schemas of other sessions"))
errmsg("cannot create temporary relation in non-temporary schema")

b) Do we really need errhint here? It seems self-explanatory. If we really want to specify HINT, shall we say: "Specify a non-temporary schema for conflict log table."

2)
postgres=# alter subscription sub1 set (conflict_log_table='');
ERROR: conflict log table name cannot be empty
HINT: Provide a valid table name or omit the parameter.

My idea was to allow the above operation to enable users to reset the conflict_log_table when the conflict log history is no longer needed. Is there any other way to reset it, or is this intentionally not supported?

3)
postgres=# alter subscription sub1 set (conflict_log_table=NULL);
ALTER SUBSCRIPTION
postgres=# alter subscription sub2 set (conflict_log_table=create);
ALTER SUBSCRIPTION
postgres=# \d
         List of relations
 Schema |  Name   | Type  | Owner
--------+---------+-------+--------
 public | create  | table | shveta
 public | null    | table | shveta

It takes reserved keywords and creates tables with those names. It should be restricted.

4)
postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid = d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname = 'sub1';
 relname
---------
 clt

postgres=# select count(*) from pg_shdepend where refobjid = (select oid from pg_subscription where subname='sub1');
 count
-------
     0

Since dependency between sub and clt is a dependency involving shared-object, shouldn't the entry be in pg_shdepend? Or do we allow such entries in pg_depend as well?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T09:25:18Z
On Mon, Dec 15, 2025 at 2:16 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> Thanks for the patch. Few comments:
>
> 2)
> postgres=# alter subscription sub1 set (conflict_log_table='');
> ERROR: conflict log table name cannot be empty
> HINT: Provide a valid table name or omit the parameter.
>
> My idea was to allow the above operation to enable users to reset the
> conflict_log_table when the conflict log history is no longer needed.
> Is there any other way to reset it, or is this intentionally not
> supported?

ALTER SUBSCRIPTION ... SET (conflict_log_table=NONE); this is the same as how other subscription parameters are reset.

> 3)
> postgres=# alter subscription sub1 set (conflict_log_table=NULL);
> ALTER SUBSCRIPTION
> postgres=# alter subscription sub2 set (conflict_log_table=create);
> ALTER SUBSCRIPTION
> postgres=# \d
> List of relations
> Schema | Name | Type | Owner
> --------+---------+-------+--------
> public | create | table | shveta
> public | null | table | shveta
>
>
> It takes reserved keywords and creates tables with those names. It
> should be restricted.

I had assumed table creation would be restricted for these names, but since we switched from SPI to the internal interface that is no longer true; I need to see how we can handle this.

> 4)
> postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid
> = d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname
> = 'sub1';
> relname
> ---------
> clt
>
> postgres=# select count(*) from pg_shdepend where refobjid = (select
> oid from pg_subscription where subname='sub1');
> count
> -------
> 0
>
> Since dependency between sub and clt is a dependency involving
> shared-object, shouldn't the entry be in pg_shdepend? Or do we allow
> such entries in pg_depend as well?

The primary reason for recording it in pg_depend is that the RemoveRelations() function already includes logic to check for and report internal dependencies within pg_depend. Consequently, if we were to record the dependency in pg_shdepend, we would likely need to modify RemoveRelations() to incorporate handling for pg_shdepend dependencies.

However, some might argue that when an object ID (objid) is local and the referenced object ID (refobjid) is shared, such as when a table is created under a ROLE, establishing a dependency on the owner, the dependency is currently recorded in pg_shdepend. In this scenario, the dependent object (the local table) can be dropped independently, while the referenced object (the shared owner) cannot. However, when aiming to record an internal dependency, the dependent object should not be droppable without first dropping the referenced object. Therefore, I believe the dependency record should be placed in pg_depend, as the depender is a local object and will check for dependencies there.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-15T09:48:23Z
On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Here is the patch which implements the dependency and fixes other
> comments from Shveta.
>

+/*
+ * Check if the specified relation is used as a conflict log table by any
+ * subscription.
+ */
+bool
+IsConflictLogTable(Oid relid)
+{
+    Relation rel;
+    TableScanDesc scan;
+    HeapTuple tup;
+    bool is_clt = false;
+
+    rel = table_open(SubscriptionRelationId, AccessShareLock);
+    scan = table_beginscan_catalog(rel, 0, NULL);
+
+    while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection)))

This function is used in multiple places in the patch. None of them is a performance-critical path, but the impact could still be noticeable with a large number of subscriptions. Also, I am not sure it is a good design to scan the entire system table to find whether some other relation is publishable or not. I see the following kind of usage for it:

+ /* Subscription conflict log tables are not published */
+ result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) &&
+     !IsConflictLogTable(relid);

In this regard, I see a comment atop is_publishable_class which suggests as follows:

The best
 * long-term solution may be to add a "relispublishable" bool to pg_class,
 * and depend on that instead of OID checks.
 */
static bool
is_publishable_class(Oid relid, Form_pg_class reltuple)

I feel that is a good idea for reasons mentioned atop is_publishable_class and for the conflict table. What do you think?

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T10:31:47Z
On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > Here is the patch which implements the dependency and fixes other > > comments from Shveta. > > > > +/* > + * Check if the specified relation is used as a conflict log table by any > + * subscription. > + */ > +bool > +IsConflictLogTable(Oid relid) > +{ > + Relation rel; > + TableScanDesc scan; > + HeapTuple tup; > + bool is_clt = false; > + > + rel = table_open(SubscriptionRelationId, AccessShareLock); > + scan = table_beginscan_catalog(rel, 0, NULL); > + > + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection))) > > This function has been used at multiple places in the patch, though > not in any performance-critical paths, but still, it seems like the > impact can be noticeable for a large number of subscriptions. Also, I > am not sure it is a good design to scan the entire system table to > find whether some other relation is publishable or not. I see below > kinds of usages for it: > > + /* Subscription conflict log tables are not published */ > + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) && > + !IsConflictLogTable(relid); > > In this regard, I see a comment atop is_publishable_class which > suggests as follows: > > The best > * long-term solution may be to add a "relispublishable" bool to pg_class, > * and depend on that instead of OID checks. > */ > static bool > is_publishable_class(Oid relid, Form_pg_class reltuple) > > I feel that is a good idea for reasons mentioned atop > is_publishable_class and for the conflict table. What do you think? On quick thought, this seems like a good idea and may simplify a couple of places. And might be good for future extension as we can mark publishable at individual relation instead of targeting broad categories like IsCatalogRelationOid() or checking individual items by its Oid. 
IMHO this can be done as an individual patch in a separate thread, or as a base patch. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-15T11:15:53Z
On Mon, Dec 15, 2025 at 4:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > Here is the patch which implements the dependency and fixes other > > > comments from Shveta. > > > > > > > +/* > > + * Check if the specified relation is used as a conflict log table by any > > + * subscription. > > + */ > > +bool > > +IsConflictLogTable(Oid relid) > > +{ > > + Relation rel; > > + TableScanDesc scan; > > + HeapTuple tup; > > + bool is_clt = false; > > + > > + rel = table_open(SubscriptionRelationId, AccessShareLock); > > + scan = table_beginscan_catalog(rel, 0, NULL); > > + > > + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection))) > > > > This function has been used at multiple places in the patch, though > > not in any performance-critical paths, but still, it seems like the > > impact can be noticeable for a large number of subscriptions. Also, I > > am not sure it is a good design to scan the entire system table to > > find whether some other relation is publishable or not. I see below > > kinds of usages for it: > > > > + /* Subscription conflict log tables are not published */ > > + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) && > > + !IsConflictLogTable(relid); > > > > In this regard, I see a comment atop is_publishable_class which > > suggests as follows: > > > > The best > > * long-term solution may be to add a "relispublishable" bool to pg_class, > > * and depend on that instead of OID checks. > > */ > > static bool > > is_publishable_class(Oid relid, Form_pg_class reltuple) > > > > I feel that is a good idea for reasons mentioned atop > > is_publishable_class and for the conflict table. What do you think? > > On quick thought, this seems like a good idea and may simplify a > couple of places. 
And might be good for future extension as we can > mark publishable at individual relation instead of targeting broad > categories like IsCatalogRelationOid() or checking individual items by > its Oid. IMHO this can be done as an individual patch in a separate > thread, or as a base patch. > I prefer to do it in a separate thread, so that it can get some more attention. But it should be done before the main conflict patch. I think we can subdivide the main patch into (a) DDL handling, everything except inserting data into conflict table, (b) inserting data into conflict table, (c) upgrade handling. That way it will be easier to review. -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-15T11:41:06Z
On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> > 3)
> > postgres=# alter subscription sub1 set (conflict_log_table=NULL);
> > ALTER SUBSCRIPTION
> > postgres=# alter subscription sub2 set (conflict_log_table=create);
> > ALTER SUBSCRIPTION
> > postgres=# \d
> > List of relations
> > Schema | Name | Type | Owner
> > --------+---------+-------+--------
> > public | create | table | shveta
> > public | null | table | shveta
> >
> >
> > It takes reserved keywords and creates tables with those names. It
> > should be restricted.
>
> I somehow assume table creation will be restricted with these names,
> but since we switch from SPI to internal interface its not true
> anymore, need to see how we can handle this.

While thinking more about this, I looked at other places where we use 'heap_create_with_catalog()' and noticed that we always use an internally generated name there. So wouldn't it be nicer to make conflict_log_table a boolean option and use an internally generated name, something like conflict_log_table_$subid$, always creating the table in the current active search path? Thoughts?

--
Regards,
Dilip Kumar
Google
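
Editorial sketch, not from any posted patch: the naming idea above, a table name derived from the subscription OID rather than from user input, could look roughly like this in standalone C. The helper name `make_conflict_log_table_name`, the local `Oid` typedef and the 64-byte buffer in the usage note (PostgreSQL's default NAMEDATALEN) are assumptions made for illustration. Because the name is generated, it can never collide with a reserved keyword such as `create` or `null`, which is the problem raised in comment 3.

```c
#include <stdio.h>

/* Local stand-in for PostgreSQL's Oid type (an unsigned 32-bit integer). */
typedef unsigned int Oid;

/*
 * Write "conflict_log_table_<subid>" into buf.
 * Returns 0 on success, -1 if the name does not fit in buflen bytes
 * (including the terminating NUL).
 */
static int
make_conflict_log_table_name(Oid subid, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "conflict_log_table_%u", subid);

    /* snprintf reports the length it needed; reject truncated names. */
    return (n > 0 && (size_t) n < buflen) ? 0 : -1;
}
```

For example, calling it with a hypothetical subscription OID of 16394 and a 64-byte buffer yields `conflict_log_table_16394`. Which schema the table ends up in is a separate question that this sketch does not address.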
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-16T01:10:03Z
Some review comments for v12-0001.

======
General

1.
There is no documentation. Even if it seems a bit premature, IMO writing/reviewing the documentation could help identify unanticipated usability issues.

======
src/backend/commands/subscriptioncmds.c

2.
+
+ /* Setting conflict_log_table = NONE is treated as no table. */
+ if (strcmp(opts->conflictlogtable, "none") == 0)
+     opts->conflictlogtable = NULL;
+ }

2a.
This was unexpected when I came across this code. This feature needs to be described in the commit message.

~

2b.
Case sensitive?

~~~

CreateSubscription:

3.
+ List *names;
+
+ /* Explicitly check for empty string before any processing. */
+ if (opts.conflictlogtable[0] == '\0')
+     ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("conflict log table name cannot be empty"),
+              errhint("Provide a valid table name or omit the parameter.")));
+
+ names = stringToQualifiedNameList(opts.conflictlogtable, NULL);

Should '' just be equivalent of NONE instead of another error condition?

~~~

AlterSubscription:

4.
+ Oid old_nspid = InvalidOid;
+ char *old_relname = NULL;
+ char *relname = NULL;
+ List *names = NIL;

Var 'names' can be declared at a lower scope -- e.g. in the 'if' block.

~~~

DropSubscription:

5.
+ /*
+  * Conflict log tables are recorded as internal dependencies of the
+  * subscription. We must drop the dependent objects before the
+  * subscription itself is removed. By using
+  * PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the conflict log
+  * table is reaped while the subscription remains for the final deletion
+  * step.
+  */

Double spaces?

/the  subscription/the subscription/

~~~

create_conflict_log_table_tupdesc:

6.
+static TupleDesc
+create_conflict_log_table_tupdesc(void)
+{
+ TupleDesc tupdesc;
+ int i;
+
+ tupdesc = CreateTemplateTupleDesc(MAX_CONFLICT_ATTR_NUM);
+
+ for (i = 0; i < MAX_CONFLICT_ATTR_NUM; i++)

Declare 'i' as a for-loop var.

~~~

create_conflict_log_table:

7.
+/*
+ * Create conflict log table.
+ *
+ * The subscription owner becomes the owner of this table and has all
+ * privileges on it.
+ */
+static void
+create_conflict_log_table(Oid namespaceId, char *conflictrel, Oid subid)
+{

I felt that the 'subid' should be the first parameter, not the last.

~~~

8.
namespace > relation, so I felt it is more natural to check for the temp namespace *before* checking for clashing table names.

======
src/backend/replication/logical/conflict.c

9.
+ if (ValidateConflictLogTable(conflictlogrel))
+ {
+     /*
+      * Prepare the conflict log tuple. If the error level is below
+      * ERROR, insert it immediately. Otherwise, defer the insertion to
+      * a new transaction after the current one aborts, ensuring the
+      * insertion of the log tuple is not rolled back.
+      */
+     prepare_conflict_log_tuple(estate,
+                                relinfo->ri_RelationDesc,
+                                conflictlogrel,
+                                type,
+                                searchslot,
+                                conflicttuples,
+                                remoteslot);
+     if (elevel < ERROR)
+         InsertConflictLogTuple(conflictlogrel);
+ }
+ else
+     ereport(WARNING,
+             errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+             errmsg("Conflict log table \"%s.%s\" structure changed, skipping insertion",
+                    get_namespace_name(RelationGetNamespace(conflictlogrel)),
+                    RelationGetRelationName(conflictlogrel)));

9a.
AFAICT, in the few places where this function is called, the caller emits exactly the same warning, so it seems like unnecessary duplication. Would it be better to have that WARNING code inside ValidateConflictLogTable (e.g. always give the warning when returning false)? But see also 9b.

~

9b.
I have some doubts about this validation function. It seems inefficient to be validating the same CLT structure over and over every time there is a new conflict. Not only is that going to be slower, but the logfile is going to fill up with warnings. Maybe this "validation" phase should be a one-time check only during the CREATE/ALTER SUBSCRIPTION. Maybe if validation fails it could give some NOTICE that the CLT logging is broken and then reset the CLT to NONE?

~~~

ValidateConflictLogTable:

10.
+/*
+ * ValidateConflictLogTable - Validate conflict log table
+ *
+ * Validate whether the conflict log table is still suitable for considering as
+ * conflict log table.
+ */
+bool
+ValidateConflictLogTable(Relation rel)

This function comment seems unhelpful. Three times it mentions the equivalent of "validate conflict log table", but nowhere does it say what that even means. Maybe the later comment (below):

+ /*
+  * Check whether the table definition including its column names, data
+  * types, and column ordering meets the requirements for conflict log
+  * table.
+  */

should be moved into the function comment.

~~~

11.
+ Relation pg_attribute;
+ HeapTuple atup;
+ ScanKeyData scankey;
+ SysScanDesc scan;
+ Form_pg_attribute attForm;
+ int attcnt = 0;
+ bool tbl_ok = true;

'attForm' can be declared within the while loop.

~~~

12.
+ if (attcnt != MAX_CONFLICT_ATTR_NUM || !tbl_ok)
+     return false;

As per the previous review comment, this could emit the WARNING log right here. But see also #9b.

~~~

build_local_conflicts_json_array:

13.
+ Datum values[MAX_LOCAL_CONFLICT_INFO_ATTRS];
+ bool nulls[MAX_LOCAL_CONFLICT_INFO_ATTRS];
+ char *origin_name = NULL;
+ HeapTuple tuple;
+ Datum json_datum;
+ int attno;
+
+ memset(values, 0, sizeof(Datum) * MAX_LOCAL_CONFLICT_INFO_ATTRS);
+ memset(nulls, 0, sizeof(bool) * MAX_LOCAL_CONFLICT_INFO_ATTRS);

You could also just use designated initializer syntax here and avoid the memsets. e.g. = {0}

~~~

14.
+ memset(values, 0, sizeof(Datum) * MAX_LOCAL_CONFLICT_INFO_ATTRS);
+ memset(nulls, 0, sizeof(bool) * MAX_LOCAL_CONFLICT_INFO_ATTRS);

Another place where you could've avoided memset and just done = {0};

~~~

15.
+ json_datum_array = (Datum *) palloc(num_conflicts * sizeof(Datum));
+ json_null_array = (bool *) palloc0(num_conflicts * sizeof(bool));

- index_value = BuildIndexValueDescription(indexDesc, values, isnull);
+ i = 0;
+ foreach(lc, json_datums)
+ {
+     json_datum_array[i] = (Datum) lfirst(lc);
+     i++;
+ }

Should these be using the new palloc_array instead of palloc?

======
src/include/replication/conflict.h

16.
+typedef struct ConflictLogColumnDef
+{
+ const char *attname; /* Column name */
+ Oid atttypid; /* Data type OID */
+} ConflictLogColumnDef;

Add this to typedefs.list

~~~

17.
+/* The single source of truth for the conflict log table schema */
+static const ConflictLogColumnDef ConflictLogSchema[] =
+{
+ { .attname = "relid", .atttypid = OIDOID },
+ { .attname = "schemaname", .atttypid = TEXTOID },
+ { .attname = "relname", .atttypid = TEXTOID },
+ { .attname = "conflict_type", .atttypid = TEXTOID },
+ { .attname = "remote_xid", .atttypid = XIDOID },
+ { .attname = "remote_commit_lsn",.atttypid = LSNOID },
+ { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "remote_origin", .atttypid = TEXTOID },
+ { .attname = "replica_identity", .atttypid = JSONOID },
+ { .attname = "remote_tuple", .atttypid = JSONOID },
+ { .attname = "local_conflicts", .atttypid = JSONARRAYOID }
+};

I like this, but I felt it would be better if all the definitions for "local_conflicts" were defined here too. Then everything is in one place. e.g. MAX_LOCAL_CONFLICT_INFO_ATTRS and most of the content of build_conflict_tupledesc().

~~~

18.
+/* Define the count using the array size */
+#define MAX_CONFLICT_ATTR_NUM (sizeof(ConflictLogSchema) / sizeof(ConflictLogSchema[0]))

This comment just says the same as the code, so it doesn't seem useful.

======
src/test/regress/expected/subscription.out

19.
+\dt+ clt.regress_conflict_log3
+                                            List of tables
+ Schema |         Name          | Type  |           Owner           | Persistence |  Size   | Description
+--------+-----------------------+-------+---------------------------+-------------+---------+-------------
+ clt    | regress_conflict_log3 | table | regress_subscription_user | permanent   | 0 bytes |
+(1 row)

Since the CLT is auto-created internally, and since there is a "Description" attribute, I wonder whether you should also auto-generate that description, so that here it might say something useful like: "Conflict Log File for subscription XYZ"

~~~

20.
+-- ok - create subscription with conflict_log_table = NONE
+CREATE SUBSCRIPTION regress_conflict_test1 CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, conflict_log_table = NONE);
+SELECT subname, subconflictlogtable FROM pg_subscription WHERE subname = 'regress_conflict_test2';
+        subname         |  subconflictlogtable
+------------------------+-----------------------
+ regress_conflict_test2 | regress_conflict_log3
+(1 row)
+

I didn't understand this test case. You are setting a NONE CLT for subscription 'regress_conflict_test1', but then you are checking subname 'regress_conflict_test2'. Is that a typo?

~~~

21.
+ALTER SUBSCRIPTION regress_conflict_test1 DISABLE;
+ALTER SUBSCRIPTION regress_conflict_test1 SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_conflict_test1;
+-- Clean up remaining test subscription
+ALTER SUBSCRIPTION regress_conflict_test2 DISABLE;
+ALTER SUBSCRIPTION regress_conflict_test2 SET (slot_name = NONE);
+DROP SUBSCRIPTION regress_conflict_test2;

Something seems misplaced. Why aren't all of the cleanups under the 'cleanup' comment?

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
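Review items 13, 14 and 18 above can be illustrated with a small standalone sketch. The following is not code from the patch: Datum is a local stand-in for the PostgreSQL type, the value of MAX_LOCAL_CONFLICT_INFO_ATTRS and the shortened sample_columns list are made up for the example, and count_nonzero_slots is a hypothetical function that exists only to show that a "= {0}" initializer leaves every array element zeroed without any memset. The lengthof macro has the same shape as the macro of that name in PostgreSQL's c.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;                  /* stand-in for PostgreSQL's Datum */

#define MAX_LOCAL_CONFLICT_INFO_ATTRS 4   /* illustrative value only */

/* Element count of a true array (not a pointer). */
#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* Shortened, illustrative column list; the patch defines its own schema. */
static const char *const sample_columns[] = {"relid", "schemaname", "relname"};

/* Compile-time check that the count derived from the array is what we expect. */
static_assert(lengthof(sample_columns) == 3, "unexpected column count");

/* Hypothetical demo: count array slots that are not zero after "= {0}". */
static size_t
count_nonzero_slots(void)
{
    /* "= {0}" zero-initializes every element, so no memset is needed. */
    Datum  values[MAX_LOCAL_CONFLICT_INFO_ATTRS] = {0};
    bool   nulls[MAX_LOCAL_CONFLICT_INFO_ATTRS] = {0};
    size_t nonzero = 0;

    for (size_t i = 0; i < lengthof(values); i++)   /* loop variable declared in the for statement */
    {
        if (values[i] != 0 || nulls[i])
            nonzero++;
    }
    return nonzero;
}
```

Deriving the count from the array keeps the schema definition and its length from drifting apart, which is the point of the "single source of truth" array in item 17.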
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-16T04:21:17Z
On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > 3) > > > postgres=# alter subscription sub1 set (conflict_log_table=NULL); > > > ALTER SUBSCRIPTION > > > postgres=# alter subscription sub2 set (conflict_log_table=create); > > > ALTER SUBSCRIPTION > > > postgres=# \d > > > List of relations > > > Schema | Name | Type | Owner > > > --------+---------+-------+-------- > > > public | create | table | shveta > > > public | null | table | shveta > > > > > > > > > It takes reserved keywords and creates tables with those names. It > > > should be restricted. > > > > I somehow assume table creation will be restricted with these names, > > but since we switch from SPI to internal interface its not true > > anymore, need to see how we can handle this. > > While thinking more on this, I was seeing other places where we use > 'heap_create_with_catalog()' so I noticed that we always use the > internally generated name, so wouldn't it be nice to make the conflict > log table as bool and use internally generated name something like > conflict_log_table_$subid$ and we will always create that in current > active searchpath? Thought? > We could do this as a first step. See the proposal in email [1] where we have discussed having two options instead of one. The first option will be conflict_log_format and the values would be log and table. In this case, the table would be an internally generated one. [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-16T04:24:03Z
On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > I was considering the interdependence between the subscription and the > > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > > subscription as dependent on the CLT. This way, if someone attempts to > > > > drop the CLT, the system would recognize the dependency of the > > > > subscription and prevent the drop unless the subscription is removed > > > > first or the CASCADE option is used. > > > > > > > > However, while investigating this, I encountered an error [1] stating > > > > that global objects are not supported in this context. This indicates > > > > that global objects cannot be made dependent on local objects. > > > > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > > objects. For example, consider following case: > > > postgres=# create table t1(c1 int primary key); > > > CREATE TABLE > > > postgres=# \d+ t1 > > > Table "public.t1" > > > Column | Type | Collation | Nullable | Default | Storage | > > > Compression | Stats target | Description > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > > c1 | integer | | not null | | plain | > > > | | > > > Indexes: > > > "t1_pkey" PRIMARY KEY, btree (c1) > > > Publications: > > > "pub1" > > > Not-null constraints: > > > "t1_c1_not_null" NOT NULL "c1" > > > Access method: heap > > > postgres=# drop index t1_pkey; > > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > > t1 requires it > > > HINT: You can drop constraint t1_pkey on table t1 instead. 
> > > > > > Here, the PK index is created as part for CREATE TABLE operation and > > > pk_index is not allowed to be dropped independently. > > > > > > > Although making an object dependent on global/shared objects is > > > > possible for certain types of shared objects [2], this is not our main > > > > objective. > > > > > > > > > > As per my understanding from the above example, we need something like > > > that only for shared object subscription and (internally created) > > > table. > > > > Yeah that seems to be exactly what we want, so I tried doing that by > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > > it is behaving as we want[2]. And while dropping the subscription or > > altering CLT we can delete internal dependency so that CLT get dropped > > automatically[3] > > > > I will send an updated patch after testing a few more scenarios and > > fixing other pending issues. > > > > [1] > > + ObjectAddressSet(myself, RelationRelationId, relid); > > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > > > > [2] > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > > because subscription sub requires it > > HINT: You can drop subscription sub instead. > > LOCATION: findDependentObjects, dependency.c:788 > > postgres[670778]=# > > > > [3] > > ObjectAddressSet(object, SubscriptionRelationId, subid); > > performDeletion(&object, DROP_CASCADE > > PERFORM_DELETION_INTERNAL | > > PERFORM_DELETION_SKIP_ORIGINAL); > > > > > > Here is the patch which implements the dependency and fixes other > comments from Shveta. 
Thanks for the changes. The new dependency-based implementation creates a dependency loop while dumping:

./pg_dump -d postgres -f dump1.txt -p 5433
pg_dump: warning: could not resolve dependency loop among these items:
pg_dump: detail: TABLE conflict (ID 225 OID 16397)
pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396)
pg_dump: detail: POST-DATA BOUNDARY (ID 3491)
pg_dump: detail: TABLE DATA t1 (ID 3485 OID 16384)
pg_dump: detail: PRE-DATA BOUNDARY (ID 3490)

This can be reproduced with a simple subscription that has conflict_log_table set. It worked fine with the v11 version of the patch.

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-16T05:03:01Z
On Mon, Dec 15, 2025 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Dec 14, 2025 at 9:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > Here is the patch which implements the dependency and fixes other > > comments from Shveta. > > > > +/* > + * Check if the specified relation is used as a conflict log table by any > + * subscription. > + */ > +bool > +IsConflictLogTable(Oid relid) > +{ > + Relation rel; > + TableScanDesc scan; > + HeapTuple tup; > + bool is_clt = false; > + > + rel = table_open(SubscriptionRelationId, AccessShareLock); > + scan = table_beginscan_catalog(rel, 0, NULL); > + > + while (HeapTupleIsValid(tup = heap_getnext(scan, ForwardScanDirection))) > > This function has been used at multiple places in the patch, though > not in any performance-critical paths, but still, it seems like the > impact can be noticeable for a large number of subscriptions. Also, I > am not sure it is a good design to scan the entire system table to > find whether some other relation is publishable or not. I see below > kinds of usages for it: > > + /* Subscription conflict log tables are not published */ > + result = is_publishable_class(relid, (Form_pg_class) GETSTRUCT(tuple)) && > + !IsConflictLogTable(relid); > > In this regard, I see a comment atop is_publishable_class which > suggests as follows: > > The best > * long-term solution may be to add a "relispublishable" bool to pg_class, > * and depend on that instead of OID checks. > */ > static bool > is_publishable_class(Oid relid, Form_pg_class reltuple) > > I feel that is a good idea for reasons mentioned atop > is_publishable_class and for the conflict table. What do you think? > +1. The OID check may be unreliable, as mentioned in the comment. I tested this by dropping and recreating information_schema, and observed that after recreation it became eligible for publication because its relid no longer falls under FirstNormalObjectId. 
Steps: ****Pub****: create publication pub1; ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing; select * from information_schema.sql_sizing where sizing_id=97; ****Sub****: create subscription sub1 connection '...' publication pub1 with (copy_data=false); select * from information_schema.sql_sizing where sizing_id=97; ****Pub****: alter table information_schema.sql_sizing replica identity full; --this is not replicated. UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97; ****Sub****: postgres=# select supported_value from information_schema.sql_sizing where sizing_id=97; supported_value ----------------- 0 ~~ Then drop and recreate and try to perform the above update again, it gets replicated: drop schema information_schema cascade; ./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433 ****Pub****: ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing; select * from information_schema.sql_sizing where sizing_id=97; alter table information_schema.sql_sizing replica identity full; --This is replicated UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97; ****Sub****: --This shows supported_value as 14 postgres=# select supported_value from information_schema.sql_sizing where sizing_id=97; supported_value ----------------- 14 thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-16T05:41:48Z
On Thu, 11 Dec 2025 at 19:50, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 11, 2025 at 5:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Dec 11, 2025 at 5:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Dec 11, 2025 at 5:04 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > 2) > > > > When we do below: > > > > alter subscription sub1 SET (conflict_log_table=clt2); > > > > > > > > the previous conflict log table is dropped. Is this behavior > > > > intentional and discussed/concluded earlier? It’s possible that a user > > > > may want to create a new conflict log table for future events while > > > > still retaining the old one for analysis. If the subscription itself > > > > is dropped, then dropping the CLT makes sense, but I’m not sure this > > > > behavior is intended for ALTER SUBSCRIPTION. I do understand that > > > > once we unlink CLT from subscription, later even DROP subscription > > > > cannot drop it, but user can always drop it when not needed. > > > > > > > > If we plan to keep existing behavior, it should be clearly documented > > > > in a CAUTION section, and the command should explicitly log the table > > > > drop. > > > > > > Yeah we discussed this behavior and the conclusion was we would > > > document this behavior and its user's responsibility to take necessary > > > backup of the conflict log table data if they are setting a new log > > > table or NONE for the subscription. > > > > > > > +1. If we don't do this then it will be difficult to track for > > postgres or users the previous conflict history tables. > > Right, it makes sense. 
> > Attached patch fixed most of the open comments > 1) \dRs+ now show the schema qualified name > 2) Now key_tuple and replica_identify tuple both are add in conflict > log tuple wherever applicable > 3) Refactored the code so that we can define the conflict log table > schema only once in the header file and both create_conflict_log_table > and ValidateConflictLogTable use it. > > I was considering the interdependence between the subscription and the > conflict log table (CLT). IMHO, it would be logical to establish the > subscription as dependent on the CLT. This way, if someone attempts to > drop the CLT, the system would recognize the dependency of the > subscription and prevent the drop unless the subscription is removed > first or the CASCADE option is used. > > However, while investigating this, I encountered an error [1] stating > that global objects are not supported in this context. This indicates > that global objects cannot be made dependent on local objects. > Although making an object dependent on global/shared objects is > possible for certain types of shared objects [2], this is not our main > objective. > > We do not need to make the CLT dependent on the subscription because > the table can be dropped when the subscription is dropped anyway and > we are already doing it as part of drop subscription as well as alter > subscription when CLT is set to NONE or a different table. Therefore, > extending the functionality of shared dependency is unnecessary for > this purpose. > > Thoughts? > > [1] > doDeletion() > { > .... > /* > * These global object types are not supported here. 
>      */
>     case AuthIdRelationId:
>     case DatabaseRelationId:
>     case TableSpaceRelationId:
>     case SubscriptionRelationId:
>     case ParameterAclRelationId:
>         elog(ERROR, "global objects cannot be deleted by doDeletion");
>         break;
> }
>
> [2]
> typedef enum SharedDependencyType
> {
>     SHARED_DEPENDENCY_OWNER = 'o',
>     SHARED_DEPENDENCY_ACL = 'a',
>     SHARED_DEPENDENCY_INITACL = 'i',
>     SHARED_DEPENDENCY_POLICY = 'r',
>     SHARED_DEPENDENCY_TABLESPACE = 't',
>     SHARED_DEPENDENCY_INVALID = 0,
> } SharedDependencyType;
>
> Pending Items are:
> 1. Handling dump/upgrade

The attached patch has the changes for handling dump. It works on top of the v11 version; it does not work on v12 because of the issue reported at [1]. Upgrade currently does not work because of an existing issue that is being tracked at [2]; upgrade does work with the patch attached at [2].

[1] - https://www.postgresql.org/message-id/CALDaNm1zEYoSdf2Ns-%3DUJRw95E5sbfpB0oaNUWtRJN27Q1Knhw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CALDaNm2x3rd7C0_HjUpJFbxpAqXgm%3DQtoKfkEWDVA8h%2BJFpa_w%40mail.gmail.com

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-16T07:17:52Z
On Mon, Dec 15, 2025 at 2:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 15, 2025 at 2:16 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Sun, Dec 14, 2025 at 9:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > 4) > > postgres=# SELECT c.relname FROM pg_depend d JOIN pg_class c ON c.oid > > = d.objid JOIN pg_subscription s ON s.oid = d.refobjid WHERE s.subname > > = 'sub1'; > > relname > > --------- > > clt > > > > postgres=# select count(*) from pg_shdepend where refobjid = (select > > oid from pg_subscription where subname='sub1'); > > count > > ------- > > 0 > > > > Since dependency between sub and clt is a dependency involving > > shared-object, shouldn't the entry be in pg_shdepend? Or do we allow > > such entries in pg_depend as well? > > The primary reason for recording in pg_depend is that the > RemoveRelations() function already includes logic to check for and > report internal dependencies within pg_depends. Consequently, if we > were to record the dependency in pg_shdepends, we would likely need to > modify RemoveRelations() to incorporate handling for pg_shdepends > dependencies. > > However, some might argue that when an object ID (objid) is local and > the referenced object ID (refobjid) is shared, such as when a table is > created under a ROLE, establishing a dependency with the owner, the > dependency is currently recorded in pg_shdepend. In this scenario, the > dependent object (the local table) can be dropped independently, while > the referenced object (the shared owner) cannot. > Yes and same is true for tablespaces. Consider below case: create tablespace tbs location <tbs_location>; create table t2(c1 int, c2 int) PARTITION BY RANGE(c1) tablespace tbs; > > However, when aiming > to record an internal dependency, the dependent object should not be > droppable without first dropping the referencing object. 
Therefore, I > believe the dependency record should be placed in pg_depend, as the > depender is a local object and will check for dependencies there. >

I think it makes sense to add the dependency entry in pg_depend for this case (the dependent object, the table, is database-local, while the referenced object, the subscription, is shared across the cluster), because there is a fundamental architectural difference between tablespaces/roles and subscriptions that determines why one needs pg_shdepend and the other is better off with pg_depend. It comes down to cross-database visibility during the DROP command.

1. The "Tablespace" Scenario (Why it needs pg_shdepend)

A tablespace is a truly global resource. You can connect to postgres (database A) and try to drop a tablespace that is being used by app_db (database B).

The Problem: When you run DROP TABLESPACE tbs from database A, the system cannot look inside database B's pg_depend to see if the tablespace is in use. It would have to connect to every database in the cluster to check.

The Solution: We explicitly push this dependency up to the global pg_shdepend. This allows the DROP command in database A to instantly see: "Wait, object 123 in database B needs this. Block the drop."

2. The "Subscription" Scenario (Why it does NOT need pg_shdepend)

Although pg_subscription is a shared catalog, a subscription is pinned to a specific database (subdbid). One can only DROP SUBSCRIPTION while connected to the database that owns it. Consider a scenario where one creates a subscription sub_1 in app_db. One cannot then connect to the postgres database and run DROP SUBSCRIPTION sub_1; one must connect to app_db. Since we need to connect to app_db to drop the subscription, the system has direct, fast access to the local pg_depend of app_db. It doesn't need to consult a global "cross-DB" catalog because there is no mystery about where the dependencies live.

Does this theory sound more bullet-proof as to why it is desirable to store dependency entries for this case in pg_depend? If so, I suggest we add some comments explaining how subscriptions differ from other shared objects, as future readers may have the same question.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-17T04:28:52Z
On Tue, Dec 16, 2025 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote:
> The OID check may be unreliable, as mentioned in the comment. I tested
> this by dropping and recreating information_schema, and observed that
> after recreation it became eligible for publication because its relid
> no longer falls under FirstNormalObjectId. Steps:
>
> ****Pub****:
> create publication pub1;
> ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> select * from information_schema.sql_sizing where sizing_id=97;
>
> ****Sub****:
> create subscription sub1 connection '...' publication pub1 with
> (copy_data=false);
> select * from information_schema.sql_sizing where sizing_id=97;
>
> ****Pub****:
> alter table information_schema.sql_sizing replica identity full;
> --this is not replicated.
> UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97;
>
> ****Sub****:
> postgres=# select supported_value from information_schema.sql_sizing
> where sizing_id=97;
>  supported_value
> -----------------
>                0
>
> ~~
>
> Then drop and recreate and try to perform the above update again, it
> gets replicated:
>
> drop schema information_schema cascade;
> ./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433
>
> ****Pub****:
> ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing;
> select * from information_schema.sql_sizing where sizing_id=97;
> alter table information_schema.sql_sizing replica identity full;
> --This is replicated
> UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97;
>
> ****Sub****:
> --This shows supported_value as 14
> postgres=# select supported_value from information_schema.sql_sizing
> where sizing_id=97;
>  supported_value
> -----------------
>               14

Hmm, I might be missing something, but why do we not want to publish tables that are in information_schema? In particular, once the internally created schema has been dropped, a user can create their own schema named information_schema and create a bunch of tables in it, so why would we want to block those? I mean, the example you showed here is pretty much a user-created schema and table, no? Or am I missing something important?

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-17T09:44:04Z
On Wed, Dec 17, 2025 at 9:59 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 16, 2025 at 10:33 AM shveta malik <shveta.malik@gmail.com> wrote: > > > The OID check may be unreliable, as mentioned in the comment. I tested > > this by dropping and recreating information_schema, and observed that > > after recreation it became eligible for publication because its relid > > no longer falls under FirstNormalObjectId. Steps: > > > > ****Pub****: > > create publication pub1; > > ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing; > > select * from information_schema.sql_sizing where sizing_id=97; > > > > ****Sub****: > > create subscription sub1 connection '...' publication pub1 with > > (copy_data=false); > > select * from information_schema.sql_sizing where sizing_id=97; > > > > ****Pub****: > > alter table information_schema.sql_sizing replica identity full; > > --this is not replicated. > > UPDATE information_schema.sql_sizing set supported_value=12 where sizing_id=97; > > > > ****Sub****: > > postgres=# select supported_value from information_schema.sql_sizing > > where sizing_id=97; > > supported_value > > ----------------- > > 0 > > > > ~~ > > > > Then drop and recreate and try to perform the above update again, it > > gets replicated: > > > > drop schema information_schema cascade; > > ./psql -d postgres -f ./../../src/backend/catalog/information_schema.sql -p 5433 > > > > ****Pub****: > > ALTER PUBLICATION pub1 ADD TABLE information_schema.sql_sizing; > > select * from information_schema.sql_sizing where sizing_id=97; > > alter table information_schema.sql_sizing replica identity full; > > --This is replicated > > UPDATE information_schema.sql_sizing set supported_value=14 where sizing_id=97; > > > > ****Sub****: > > --This shows supported_value as 14 > > postgres=# select supported_value from information_schema.sql_sizing > > where sizing_id=97; > > supported_value > > ----------------- > > 14 > > Hmm, I might be missing something what 
why we do not want to publish > which is in information_shcema, especially when the internally created > schema is dropped then user can create his own schema with name > information-schema and create a bunch of tables in that so why do we > want to block those? I mean the example you showed here is pretty > much like a user created schema and table no? Or am I missing > something important? > I don’t think a user intentionally dropping information_schema and creating their own schema (with different definitions and tables) is a practical scenario. While it isn’t explicitly restricted, I don’t see a strong need for it. OTOH, there are scenarios where, after fixing issues that affect the definition of information_schema on stable branches, users may be asked to reload information_schema to apply the updated definitions. One such case can be seen in [1]. Additionally, while reviewing the code, I noticed places where the logic does not rely solely on relid being less than FirstNormalObjectId. Instead, it performs name-based comparisons, explicitly accounting for the possibility that information_schema may have been dropped and reloaded. This further indicates that such scenarios are considered practical. See [2]. And if such scenarios are possible, it might be worth considering keeping the publish behavior consistent, both before and after a reload of information_schema. [1]: https://www.postgresql.org/docs/9.1/release-9-1-2.html [2]: pg_upgrade has this: static DataTypesUsageChecks data_types_usage_checks[] = { /* * Look for composite types that were made during initdb *or* belong to * information_schema; that's important in case information_schema was * dropped and reloaded. * * The cutoff OID here should match the source cluster's value of * FirstNormalObjectId. We hardcode it rather than using that C #define * because, if that #define is ever changed, our own version's value is * NOT what to use. 
Eventually we may need a test on the source cluster's * version to select the correct value. */ { .status = gettext_noop("Checking for system-defined composite types in user tables"), .report_filename = "tables_using_composite.txt", .base_query = "SELECT t.oid FROM pg_catalog.pg_type t " "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid " " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname = 'information_schema')", thanks Shveta -
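The pg_upgrade query quoted above combines an OID cutoff with a schema-name test. A minimal C sketch of the same predicate follows, assuming a hypothetical helper name `is_system_defined` and a local constant in place of FirstNormalObjectId; it is an illustration of the idea, not code from any patch in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same cutoff the quoted pg_upgrade query hardcodes (16384). */
#define FIRST_NORMAL_OBJECT_ID 16384u

/*
 * Hypothetical helper, for illustration only: an object counts as
 * system-defined if its OID was assigned during initdb, or if it lives in
 * information_schema. The name test covers the case where
 * information_schema was dropped and reloaded, so that its objects now
 * have OIDs at or above the cutoff.
 */
static bool
is_system_defined(unsigned int oid, const char *nspname)
{
    assert(nspname != NULL);

    return oid < FIRST_NORMAL_OBJECT_ID ||
        strcmp(nspname, "information_schema") == 0;
}
```

With this predicate, a reloaded information_schema.sql_sizing (OID at or above 16384) is classified the same way before and after the reload, which is the consistency being asked for in the message above.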
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-17T09:58:49Z
On Wed, Dec 17, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote: > > I don’t think a user intentionally dropping information_schema and > creating their own schema (with different definitions and tables) is a > practical scenario. While it isn’t explicitly restricted, I don’t see > a strong need for it. OTOH, there are scenarios where, after fixing > issues that affect the definition of information_schema on stable > branches, users may be asked to reload information_schema to apply the > updated definitions. One such case can be seen in [1]. > > Additionally, while reviewing the code, I noticed places where the > logic does not rely solely on relid being less than > FirstNormalObjectId. Instead, it performs name-based comparisons, > explicitly accounting for the possibility that information_schema may > have been dropped and reloaded. This further indicates that such > scenarios are considered practical. See [2]. > And if such scenarios are possible, it might be worth considering > keeping the publish behavior consistent, both before and after a > reload of information_schema. > > [1]: > https://www.postgresql.org/docs/9.1/release-9-1-2.html > > [2]: > pg_upgrade has this: > static DataTypesUsageChecks data_types_usage_checks[] = > { > /* > * Look for composite types that were made during initdb *or* belong to > * information_schema; that's important in case information_schema was > * dropped and reloaded. > * > * The cutoff OID here should match the source cluster's value of > * FirstNormalObjectId. We hardcode it rather than using that C #define > * because, if that #define is ever changed, our own version's value is > * NOT what to use. Eventually we may need a test on the > source cluster's > * version to select the correct value. 
> */ > { > .status = gettext_noop("Checking for system-defined > composite types in user tables"), > .report_filename = "tables_using_composite.txt", > .base_query = > "SELECT t.oid FROM pg_catalog.pg_type t " > "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid " > " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname = > 'information_schema')", Yeah I agree with your theory. While the system allows users to manually create an information_schema or place objects within it, we are establishing that anything inside this schema will be treated as an internal object. If a user chooses to bypass these conventions and then finds the objects are not handled like standard user tables, it constitutes a usage error rather than a system bug. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-17T10:19:58Z
On Wed, Dec 17, 2025 at 3:29 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Dec 17, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I don’t think a user intentionally dropping information_schema and > > creating their own schema (with different definitions and tables) is a > > practical scenario. While it isn’t explicitly restricted, I don’t see > > a strong need for it. OTOH, there are scenarios where, after fixing > > issues that affect the definition of information_schema on stable > > branches, users may be asked to reload information_schema to apply the > > updated definitions. One such case can be seen in [1]. > > > > Additionally, while reviewing the code, I noticed places where the > > logic does not rely solely on relid being less than > > FirstNormalObjectId. Instead, it performs name-based comparisons, > > explicitly accounting for the possibility that information_schema may > > have been dropped and reloaded. This further indicates that such > > scenarios are considered practical. See [2]. > > And if such scenarios are possible, it might be worth considering > > keeping the publish behavior consistent, both before and after a > > reload of information_schema. > > > > [1]: > > https://www.postgresql.org/docs/9.1/release-9-1-2.html > > > > [2]: > > pg_upgrade has this: > > static DataTypesUsageChecks data_types_usage_checks[] = > > { > > /* > > * Look for composite types that were made during initdb *or* belong to > > * information_schema; that's important in case information_schema was > > * dropped and reloaded. > > * > > * The cutoff OID here should match the source cluster's value of > > * FirstNormalObjectId. We hardcode it rather than using that C #define > > * because, if that #define is ever changed, our own version's value is > > * NOT what to use. Eventually we may need a test on the > > source cluster's > > * version to select the correct value. 
> > */ > > { > > .status = gettext_noop("Checking for system-defined > > composite types in user tables"), > > .report_filename = "tables_using_composite.txt", > > .base_query = > > "SELECT t.oid FROM pg_catalog.pg_type t " > > "LEFT JOIN pg_catalog.pg_namespace n ON t.typnamespace = n.oid " > > " WHERE typtype = 'c' AND (t.oid < 16384 OR nspname = > > 'information_schema')", > > Yeah I agree with your theory. While the system allows users to > manually create an information_schema or place objects within it, we > are establishing that anything inside this schema will be treated as > an internal object. If a user chooses to bypass these conventions and > then finds the objects are not handled like standard user tables, it > constitutes a usage error rather than a system bug. Yes, I think so as well. IIUC, we wouldn’t be establishing anything new here; this behavior is already established. If we look at the code paths that reference information_schema, it is consistently treated as similar to system schema rather than a user schema. A few examples include XML_VISIBLE_SCHEMAS_EXCLUDE, selectDumpableNamespace, data_types_usage_checks, describeFunctions, describeAggregates, and others. thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-18T09:09:18Z
On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > We could do this as a first step. See the proposal in email [1] where > we have discussed having two options instead of one. The first option > will be conflict_log_format and the values would be log and table. In > this case, the table would be an internally generated one. > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com

I have put more thought into this, and here is what I am proposing:

1) Subscription parameter: in the first version the subscription parameter will be named 'conflict_log_format' and will accept 'log', 'table' or 'both'; the default will be 'log'.
2) If conflict_log_format = log is given, we do not need to do anything, as this is the default behavior.
3) If conflict_log_format = table or both is given, we will generate an internal table name, i.e. conflict_log_table_$subid$, and the table will be created in the current schema.
4) In pg_subscription we will still keep two fields: a) the namespace id of the conflict log table, and b) the conflict log format ('log', 'table' or 'both').
5) If the option is table or both, the name can be generated on the fly, whether we are creating the table or inserting a conflict into it.

Questions:
1) Shall we create the conflict log table in the current schema, or should we consider something else? IMHO the current schema should be fine, and in the future, when we add an option for conflict_log_table, we can support schema-qualified names as well.
2) In the catalog I am storing the "conflict_log_format" option as a text field. Is there a better way to store it in a fixed format, perhaps as an integer? E.g., from the enum below we could store the integer value in the system catalog for the "conflict_log_format" field. I am not sure whether we have done such a thing anywhere else.
typedef enum ConflictLogFormat
{
    CONFLICT_LOG_FORMAT_DEFAULT = 0,
    CONFLICT_LOG_FORMAT_LOG,
    CONFLICT_LOG_FORMAT_TABLE,
    CONFLICT_LOG_FORMAT_BOTH
} ConflictLogFormat;

-- Regards, Dilip Kumar Google -
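Points 3 and 5 of the message above rely on deriving the table name from the subscription OID whenever it is needed. A rough sketch of that derivation is shown here; the function name `conflict_log_table_name` and the plain `unsigned int` standing in for an Oid are assumptions made for this example, and the real patch may build the name differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "conflict_log_table_<subid>" into buf.
 * Returns false if the name did not fit, so a caller can report an error
 * instead of silently using a truncated name.
 */
static bool
conflict_log_table_name(char *buf, size_t buflen, unsigned int subid)
{
    int         n;

    assert(buf != NULL && buflen > 0);

    n = snprintf(buf, buflen, "conflict_log_table_%u", subid);

    return n > 0 && (size_t) n < buflen;
}
```

Because the name is a pure function of the subscription OID, nothing except the namespace needs to be stored in the catalog to find the table again, which matches point 4 of the proposal.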
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-18T09:55:26Z
On Thu, Dec 18, 2025 at 2:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > We could do this as a first step. See the proposal in email [1] where > > we have discussed having two options instead of one. The first option > > will be conflict_log_format and the values would be log and table. In > > this case, the table would be an internally generated one. > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > So I have put more thought on this and here is what I am proposing > > 1) Subscription Parameter: Son in first version the subscription > parameter will be named 'conflict_log_format' which will accept > 'log/table/both' default option would be log. > 2) If conflict_log_format = log is provided then we do not need to do > anything as this would work by default > 3) If conflict_log_format = table/both is provided then we will > generate a internal table name i.e. conflict_log_table_$subid$ and the > table will be created in the current schema > 4) in pg_subscription we will still keep 2 field a) namespace id of > the conflict log table b) the conflict log format = 'log/table'both' > 5) If option is table/both the name can be generated on the fly > whether we are creating the table or inserting conflict into the > table. > > Question: > 1) Shall we create a conflict log table in the current schema or we > should consider anything else, IMHO the current schema should be fine > and in the future when we add an option for conflict_log_table we will > support schema qualified names as well? > 2) In catalog I am storing the "conflict_log_format" option as a text > field, is there any better way so that we can store in fixed format > maybe enum value as an integer we can do e.g. 
from below enum we can > store the integer value in system catalog for "conflict_log_format" > field, not sure if we have done such think anywhere else? > > typedef enum ConflictLogFormat > { > CONFLICT_LOG_FORMAT_DEFAULT = 0, > CONFLICT_LOG_FORMAT_LOG, > CONFLICT_LOG_FORMAT_TABLE, > CONFLICT_LOG_FORMAT_BOTH > } ConflictLogFormat;

While exploring other kinds of options, I think we can make it a char, something like relkind, as shown below. Any other opinions on this?

#define CONFLICT_LOG_FORMAT_LOG 'l'
#define CONFLICT_LOG_FORMAT_TABLE 't'
#define CONFLICT_LOG_FORMAT_BOTH 'b'

-- Regards, Dilip Kumar Google -
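To make the relkind-style idea concrete, here is a small sketch of mapping the user-supplied option string to the single-character catalog code. The three macros carry the values from the message above (written without `=`, as the preprocessor requires); the function name `conflict_log_format_from_string` and the use of '\0' for an unrecognized value are assumptions for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Single-character catalog codes, in the style of relkind. */
#define CONFLICT_LOG_FORMAT_LOG   'l'
#define CONFLICT_LOG_FORMAT_TABLE 't'
#define CONFLICT_LOG_FORMAT_BOTH  'b'

/*
 * Hypothetical helper: translate the option string given in
 * CREATE/ALTER SUBSCRIPTION into the code stored in the catalog.
 * Returns '\0' for an unrecognized value so the caller can raise an error.
 */
static char
conflict_log_format_from_string(const char *value)
{
    assert(value != NULL);

    if (strcmp(value, "log") == 0)
        return CONFLICT_LOG_FORMAT_LOG;
    if (strcmp(value, "table") == 0)
        return CONFLICT_LOG_FORMAT_TABLE;
    if (strcmp(value, "both") == 0)
        return CONFLICT_LOG_FORMAT_BOTH;

    return '\0';
}
```

The string is parsed once, at DDL time; code that reads the catalog later only compares a single char, which is the simplification the char representation is meant to buy.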
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-18T11:06:15Z
On Thu, Dec 18, 2025 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Dec 18, 2025 at 2:39 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > We could do this as a first step. See the proposal in email [1] where > > > we have discussed having two options instead of one. The first option > > > will be conflict_log_format and the values would be log and table. In > > > this case, the table would be an internally generated one. > > > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > > > So I have put more thought on this and here is what I am proposing > > > > 1) Subscription Parameter: Son in first version the subscription > > parameter will be named 'conflict_log_format' which will accept > > 'log/table/both' default option would be log. > > 2) If conflict_log_format = log is provided then we do not need to do > > anything as this would work by default > > 3) If conflict_log_format = table/both is provided then we will > > generate a internal table name i.e. conflict_log_table_$subid$ and the > > table will be created in the current schema > > 4) in pg_subscription we will still keep 2 field a) namespace id of > > the conflict log table b) the conflict log format = 'log/table'both' > > 5) If option is table/both the name can be generated on the fly > > whether we are creating the table or inserting conflict into the > > table. > > > > Question: > > 1) Shall we create a conflict log table in the current schema or we > > should consider anything else, IMHO the current schema should be fine > > and in the future when we add an option for conflict_log_table we will > > support schema qualified names as well? 
> > 2) In catalog I am storing the "conflict_log_format" option as a text > > field, is there any better way so that we can store in fixed format > > maybe enum value as an integer we can do e.g. from below enum we can > > store the integer value in system catalog for "conflict_log_format" > > field, not sure if we have done such think anywhere else? > > > > typedef enum ConflictLogFormat > > { > > CONFLICT_LOG_FORMAT_DEFAULT = 0, > > CONFLICT_LOG_FORMAT_LOG, > > CONFLICT_LOG_FORMAT_TABLE, > > CONFLICT_LOG_FORMAT_BOTH > > } ConflictLogFormat; > > While exploring other kinds of options I think we can make it a char > something like relkind as shown below, any other opinion on the same? > > #define CONFLICT_LOG_FORMAT_LOG = 'l' > #define CONFLICT_LOG_FORMAT_TABLE = 't' > #define CONFLICT_LOG_FORMAT_BOTH = 'b' > +1. Also, we should expose this to users with a type as enum similar to auto_explain.log_format or publish_generated_columns. -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-18T23:07:53Z
On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > We could do this as a first step. See the proposal in email [1] where > > we have discussed having two options instead of one. The first option > > will be conflict_log_format and the values would be log and table. In > > this case, the table would be an internally generated one. > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > So I have put more thought on this and here is what I am proposing > > 1) Subscription Parameter: Son in first version the subscription > parameter will be named 'conflict_log_format' which will accept > 'log/table/both' default option would be log. > 2) If conflict_log_format = log is provided then we do not need to do > anything as this would work by default > 3) If conflict_log_format = table/both is provided then we will > generate a internal table name i.e. conflict_log_table_$subid$ and the > table will be created in the current schema > 4) in pg_subscription we will still keep 2 field a) namespace id of > the conflict log table b) the conflict log format = 'log/table'both' > 5) If option is table/both the name can be generated on the fly > whether we are creating the table or inserting conflict into the > table. I have a question: who will be the owner of the conflict log table? I assume that the subscription owner would own the conflict log table and the conflict logs are inserted by the owner but not by the table owner, is that right? > > Question: > 1) Shall we create a conflict log table in the current schema or we > should consider anything else, IMHO the current schema should be fine > and in the future when we add an option for conflict_log_table we will > support schema qualified names as well? 
Some questions: If the same name table already exists, CREATE SUBSCRIPTION will fail, right? Can the conflict log table be used like normal user tables (e.g., creating a trigger/a foreign key, running vacuum, ALTER TABLE etc.)? > 2) In catalog I am storing the "conflict_log_format" option as a text > field, is there any better way so that we can store in fixed format > maybe enum value as an integer we can do e.g. from below enum we can > store the integer value in system catalog for "conflict_log_format" > field, not sure if we have done such think anywhere else? > > typedef enum ConflictLogFormat > { > CONFLICT_LOG_FORMAT_DEFAULT = 0, > CONFLICT_LOG_FORMAT_LOG, > CONFLICT_LOG_FORMAT_TABLE, > CONFLICT_LOG_FORMAT_BOTH > } ConflictLogFormat; How about making conflict_log_format accept a list of destinations instead of having the 'both' option in case where we might add more destination options in the future? It seems to me that conflict_log_destination sounds better. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com -
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-19T00:04:47Z
On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > We could do this as a first step. See the proposal in email [1] where > > we have discussed having two options instead of one. The first option > > will be conflict_log_format and the values would be log and table. In > > this case, the table would be an internally generated one. > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > So I have put more thought on this and here is what I am proposing > > 1) Subscription Parameter: Son in first version the subscription > parameter will be named 'conflict_log_format' which will accept > 'log/table/both' default option would be log. > 2) If conflict_log_format = log is provided then we do not need to do > anything as this would work by default > 3) If conflict_log_format = table/both is provided then we will > generate a internal table name i.e. conflict_log_table_$subid$ and the > table will be created in the current schema > 4) in pg_subscription we will still keep 2 field a) namespace id of > the conflict log table b) the conflict log format = 'log/table'both' > 5) If option is table/both the name can be generated on the fly > whether we are creating the table or inserting conflict into the > table.

IIUC, previously you had a "none" value which was a way to "turn off" any CLT previously defined. How can users do that now with log/table/both? Would they have to reassign (the default) "log"? That seems a bit strange.

The "both" option is too restrictive. What if in the future you added a 3rd kind of destination -- then what does "both" mean?

Maybe Sawada-San's destination list idea is better:
a) it resolves the "none" issue -- e.g., an empty string means revert to the default CLT behaviour
b) it resolves the "both" issue.

====== Kind Regards, Peter Smith. Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-19T00:28:31Z
On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > ... > > Question: > 1) Shall we create a conflict log table in the current schema or we > should consider anything else, IMHO the current schema should be fine > and in the future when we add an option for conflict_log_table we will > support schema qualified names as well?

You might be able to avoid a proliferation of related options (such as conflict_log_table) if you renamed the main option to "conflict_log_destination" like Sawada-San was suggesting.

e.g.
conflict_log_destination="table" --> use default table named by code
conflict_log_destination="table=myschema.mytable" --> table name nominated by user

If wanted, maybe this idea can extend to logs too, e.g.
conflict_log_destination="log" --> use default pg log files
conflict_log_destination="log=my_clt_log.txt" --> write conflicts to a separate log file nominated by user

====== Kind Regards, Peter Smith. Fujitsu Australia
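The "kind" or "kind=name" value syntax suggested above could be split along the following lines. This is only a sketch of the idea: the struct `ConflictLogDest`, the function `parse_conflict_log_destination`, and the buffer sizes are hypothetical, and nothing here validates that the name is a legal table or file name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parsed form of one destination value. */
typedef struct ConflictLogDest
{
    char        kind[16];       /* "table" or "log" */
    char        name[128];      /* empty string means "use the default" */
} ConflictLogDest;

/*
 * Split "kind" or "kind=name" into its parts. Returns false for an unknown
 * kind, for "kind=" with nothing after the '=', or for an over-long part.
 */
static bool
parse_conflict_log_destination(const char *value, ConflictLogDest *dest)
{
    const char *eq;
    size_t      kindlen;

    assert(value != NULL && dest != NULL);

    eq = strchr(value, '=');
    kindlen = eq ? (size_t) (eq - value) : strlen(value);
    if (kindlen == 0 || kindlen >= sizeof(dest->kind))
        return false;

    memcpy(dest->kind, value, kindlen);
    dest->kind[kindlen] = '\0';
    if (strcmp(dest->kind, "table") != 0 && strcmp(dest->kind, "log") != 0)
        return false;

    dest->name[0] = '\0';
    if (eq != NULL)
    {
        size_t      namelen = strlen(eq + 1);

        if (namelen == 0 || namelen >= sizeof(dest->name))
            return false;
        memcpy(dest->name, eq + 1, namelen + 1);    /* copies the '\0' too */
    }

    return true;
}
```

For example, "table=myschema.mytable" yields kind "table" and name "myschema.mytable", while plain "log" yields kind "log" with an empty name, leaving the default log file in effect.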
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T04:09:48Z
On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > 2) In catalog I am storing the "conflict_log_format" option as a text > > field, is there any better way so that we can store in fixed format > > maybe enum value as an integer we can do e.g. from below enum we can > > store the integer value in system catalog for "conflict_log_format" > > field, not sure if we have done such think anywhere else? > > > > typedef enum ConflictLogFormat > > { > > CONFLICT_LOG_FORMAT_DEFAULT = 0, > > CONFLICT_LOG_FORMAT_LOG, > > CONFLICT_LOG_FORMAT_TABLE, > > CONFLICT_LOG_FORMAT_BOTH > > } ConflictLogFormat; > > How about making conflict_log_format accept a list of destinations > instead of having the 'both' option in case where we might add more > destination options in the future? > > It seems to me that conflict_log_destination sounds better. > Yeah, this is worth considering. But say, we need to extend it so that the conflict data goes in xml format file instead of standard log then won't it look a bit odd to specify via conflict_log_destination. I thought we could name it similar to the existing auto_explain.log_format. -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T04:23:23Z
On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > 2) In catalog I am storing the "conflict_log_format" option as a text > > > field, is there any better way so that we can store in fixed format > > > maybe enum value as an integer we can do e.g. from below enum we can > > > store the integer value in system catalog for "conflict_log_format" > > > field, not sure if we have done such think anywhere else? > > > > > > typedef enum ConflictLogFormat > > > { > > > CONFLICT_LOG_FORMAT_DEFAULT = 0, > > > CONFLICT_LOG_FORMAT_LOG, > > > CONFLICT_LOG_FORMAT_TABLE, > > > CONFLICT_LOG_FORMAT_BOTH > > > } ConflictLogFormat; > > > > How about making conflict_log_format accept a list of destinations > > instead of having the 'both' option in case where we might add more > > destination options in the future? > > > > It seems to me that conflict_log_destination sounds better. > > > > Yeah, this is worth considering. But say, we need to extend it so that > the conflict data goes in xml format file instead of standard log then > won't it look a bit odd to specify via conflict_log_destination. I > thought we could name it similar to the existing > auto_explain.log_format. IMHO conflict_log_destination sounds more appropriate considering we are talking about the log destination instead of format no? And the option could be log/table/file etc, and for now we can just stick to log/table. And in future we can extend it by supporting extra options like destination_name, where we can provide table name or file name etc. So let me list down all the points which need consensus. 1. What should be the name of the option 'conflict_log_destination' vs 'conflict_log_format' 2. Do we want to support multi destination then providing string like 'conflict_log_destination = 'log,table,..' 
makes more sense, but then we would have to store it as a string in the catalog and parse it every time we insert conflicts or alter the subscription. OTOH, currently I support just a single option, log/table/both, which makes things much easier because in the catalog we can store it as a single char field and don't need any parsing. And since the input is taken as a string anyway, even if in the future we want to support more options like 'log,table,..', it would be backward compatible with the old options.
3. Do we want to support a 'none' destination, i.e. do not log anywhere?

-- Regards, Dilip Kumar Google -
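On point 2 above, one possible middle ground, which is an assumption of this sketch and not something proposed in the thread, is to parse the comma-separated list once and keep the result as a small bitmask, so that the catalog field stays fixed-width and no string parsing is needed when a conflict is inserted. The flag names and the function `parse_conflict_log_destinations` are hypothetical; whitespace around the commas is not handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits, one per destination. */
#define CONFLICT_LOG_DEST_LOG   (1 << 0)
#define CONFLICT_LOG_DEST_TABLE (1 << 1)

/*
 * Parse a list such as "log", "table" or "log,table" into a bitmask.
 * Returns -1 if any element is empty or unrecognized.
 */
static int
parse_conflict_log_destinations(const char *value)
{
    int         mask = 0;
    const char *p;

    assert(value != NULL);

    p = value;
    for (;;)
    {
        /* Length of the current element, up to the next ',' or the end. */
        size_t      len = strcspn(p, ",");

        if (len == 3 && strncmp(p, "log", 3) == 0)
            mask |= CONFLICT_LOG_DEST_LOG;
        else if (len == 5 && strncmp(p, "table", 5) == 0)
            mask |= CONFLICT_LOG_DEST_TABLE;
        else
            return -1;

        if (p[len] == '\0')
            break;
        p += len + 1;           /* skip past the ',' */
    }

    return mask;
}
```

Under this scheme "log,table" and "table,log" both produce the same value, and a later destination would only need one more bit rather than a new combined keyword such as 'both'.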
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T04:25:03Z
On Fri, Dec 19, 2025 at 5:35 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > We could do this as a first step. See the proposal in email [1] where > > > we have discussed having two options instead of one. The first option > > > will be conflict_log_format and the values would be log and table. In > > > this case, the table would be an internally generated one. > > > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > > > So I have put more thought on this and here is what I am proposing > > > > 1) Subscription Parameter: Son in first version the subscription > > parameter will be named 'conflict_log_format' which will accept > > 'log/table/both' default option would be log. > > 2) If conflict_log_format = log is provided then we do not need to do > > anything as this would work by default > > 3) If conflict_log_format = table/both is provided then we will > > generate a internal table name i.e. conflict_log_table_$subid$ and the > > table will be created in the current schema > > 4) in pg_subscription we will still keep 2 field a) namespace id of > > the conflict log table b) the conflict log format = 'log/table'both' > > 5) If option is table/both the name can be generated on the fly > > whether we are creating the table or inserting conflict into the > > table. > > IIUC, previously you had a "none" value which was a way to "turn off" > any CLT previously defined. How can users do that now with > log/table/both? Would they have to reassign (the default) "log"? That > seems a bit strange. Previously we were supporting only conflict log tables and by default it was always sent to log. 
And "none" was used for clearing the conflict log table option; it was never meant for not logging anywhere, it was meant to say that there is no conflict log table. We could also add 'none' as another option now, but I intentionally avoided it, since that would mean supporting the case where we don't log conflicts at all; maybe that's not a bad idea either. Let's see what others think about it. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-19T04:52:59Z
On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > 2) In catalog I am storing the "conflict_log_format" option as a text > > > field, is there any better way so that we can store in fixed format > > > maybe enum value as an integer we can do e.g. from below enum we can > > > store the integer value in system catalog for "conflict_log_format" > > > field, not sure if we have done such think anywhere else? > > > > > > typedef enum ConflictLogFormat > > > { > > > CONFLICT_LOG_FORMAT_DEFAULT = 0, > > > CONFLICT_LOG_FORMAT_LOG, > > > CONFLICT_LOG_FORMAT_TABLE, > > > CONFLICT_LOG_FORMAT_BOTH > > > } ConflictLogFormat; > > > > How about making conflict_log_format accept a list of destinations > > instead of having the 'both' option in case where we might add more > > destination options in the future? > > > > It seems to me that conflict_log_destination sounds better. > > > > Yeah, this is worth considering. But say, we need to extend it so that > the conflict data goes in xml format file instead of standard log then > won't it look a bit odd to specify via conflict_log_destination. I > thought we could name it similar to the existing > auto_explain.log_format. > One option could be to separate destination and format: conflict_log_history.destination : log/table conflict_log_history.format : xml/json/text etc Another option could be to use a single parameter, 'conflict_log_destination', with values such as: table, xmllog, jsonlog, stderr/textlog (where stderr corresponds to logging to log/postgresql.log, similar to log_destination at [1]). I prefer this approach. [1]: https://www.postgresql.org/docs/18/runtime-config-logging.html thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-19T05:10:28Z
On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 19, 2025 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Dec 19, 2025 at 4:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Thu, Dec 18, 2025 at 1:09 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > 2) In catalog I am storing the "conflict_log_format" option as a text > > > > field, is there any better way so that we can store in fixed format > > > > maybe enum value as an integer we can do e.g. from below enum we can > > > > store the integer value in system catalog for "conflict_log_format" > > > > field, not sure if we have done such think anywhere else? > > > > > > > > typedef enum ConflictLogFormat > > > > { > > > > CONFLICT_LOG_FORMAT_DEFAULT = 0, > > > > CONFLICT_LOG_FORMAT_LOG, > > > > CONFLICT_LOG_FORMAT_TABLE, > > > > CONFLICT_LOG_FORMAT_BOTH > > > > } ConflictLogFormat; > > > > > > How about making conflict_log_format accept a list of destinations > > > instead of having the 'both' option in case where we might add more > > > destination options in the future? > > > > > > It seems to me that conflict_log_destination sounds better. > > > > > > > Yeah, this is worth considering. But say, we need to extend it so that > > the conflict data goes in xml format file instead of standard log then > > won't it look a bit odd to specify via conflict_log_destination. I > > thought we could name it similar to the existing > > auto_explain.log_format. > > IMHO conflict_log_destination sounds more appropriate considering we > are talking about the log destination instead of format no? And the > option could be log/table/file etc, and for now we can just stick to > log/table. And in future we can extend it by supporting extra options > like destination_name, where we can provide table name or file name > etc. So let me list down all the points which need consensus. > > 1. 
What should be the name of the option 'conflict_log_destination' vs
> 'conflict_log_format'

I prefer conflict_log_destination.

> 2. Do we want to support multi destination then providing string like
> 'conflict_log_destination = 'log,table,..' make more sense but then we
> would have to store as a string in catalog and parse it everytime we
> insert conflicts or alter subscription OTOH currently I have just
> support single option log/table/both which make things much easy
> because then in catalog we can store as a single char field and don't
> need any parsing. And since the input are taken as a string itself,
> even if in future we want to support more options like 'log,table,..'
> it would be backward compatible with old options.

I feel a combination of options might be a good idea, similar to what 'log_destination' provides. But it can be done in future versions, and the first draft can be a simple one.

> 3. Do we want to support 'none' destinations? i.e. do not log to anywhere?

IMO, conflict information is an important piece of information for diagnosing data divergence and thus should always be logged.

Let's wait for others' opinions.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-19T05:42:15Z
On Fri, Dec 19, 2025 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Dec 19, 2025 at 5:35 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Thu, Dec 18, 2025 at 8:09 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Dec 16, 2025 at 9:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Dec 15, 2025 at 5:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > We could do this as a first step. See the proposal in email [1] where > > > > we have discussed having two options instead of one. The first option > > > > will be conflict_log_format and the values would be log and table. In > > > > this case, the table would be an internally generated one. > > > > > > > > [1] - https://www.postgresql.org/message-id/CAA4eK1KwqE2y%3D_k5Xc%3Def0S5JXG2x%3DoeWpDJ%2B%3D5k6Anzaw2gdw%40mail.gmail.com > > > > > > So I have put more thought on this and here is what I am proposing > > > > > > 1) Subscription Parameter: Son in first version the subscription > > > parameter will be named 'conflict_log_format' which will accept > > > 'log/table/both' default option would be log. > > > 2) If conflict_log_format = log is provided then we do not need to do > > > anything as this would work by default > > > 3) If conflict_log_format = table/both is provided then we will > > > generate a internal table name i.e. conflict_log_table_$subid$ and the > > > table will be created in the current schema > > > 4) in pg_subscription we will still keep 2 field a) namespace id of > > > the conflict log table b) the conflict log format = 'log/table'both' > > > 5) If option is table/both the name can be generated on the fly > > > whether we are creating the table or inserting conflict into the > > > table. > > > > IIUC, previously you had a "none" value which was a way to "turn off" > > any CLT previously defined. How can users do that now with > > log/table/both? Would they have to reassign (the default) "log"? 
That
> > seems a bit strange.
>
> Previously we were supporting only conflict log tables and by default
> it was always sent to log. And "none" was used for clearing the
> conflict log table option; it was never meant for not logging anywhere
> it was meant to say that there is no conflict log table. Now also we
> can have another option as none but I intentionally avoided it
> considering we want to support the case where we don't want to log it
> at all, maybe that's not a bad idea either. Let's see what others
> think about it.
>

I didn't mean to suggest we should allow "not logging anywhere". I only wanted to ask how the user is expected to revert the conflict logging back to the default after they had set it to something else.

e.g.

CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)

Now, how to ALTER SUBSCRIPTION to revert that back to default?

It seems there is no "reset to default" so is the user required to do this explicitly?

ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);

Maybe that's fine --- I was just looking for some examples/clarification.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T06:14:05Z
On Fri, Dec 19, 2025 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> I didn't mean to suggest we should allow "not logging anywhere". I
> only wanted to ask how the user is expected to revert the conflict
> logging back to the default after they had set it to something else.

Okay understood, thanks for the clarification.

> e.g.
>
> CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)
> Now, how to ALTER SUBSCRIPTION to revert that back to default?
>
> It seems there is no "reset to default" so is the user required to do
> this explicitly?
> ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);
>
> Maybe that's fine --- I was just looking for some examples/clarification.

Yeah this is the way, IMHO it looks fine to me.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T06:19:35Z
On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
>
> > 2. Do we want to support multi destination then providing string like
> > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > would have to store as a string in catalog and parse it everytime we
> > insert conflicts or alter subscription OTOH currently I have just
> > support single option log/table/both which make things much easy
> > because then in catalog we can store as a single char field and don't
> > need any parsing. And since the input are taken as a string itself,
> > even if in future we want to support more options like 'log,table,..'
> > it would be backward compatible with old options.
>
> I feel, combination of options might be a good idea, similar to how
> 'log_destination' provides. But it can be done in future versions and
> the first draft can be a simple one.
>

Considering the future extension of storing conflict information in multiple places, it would be good to follow log_destination. Yes, it is more work now but I feel that will be future-proof.

--
With Regards,
Amit Kapila.
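
To make the log_destination-style alternative concrete, here is a minimal standalone sketch of how a comma-separated value such as 'log,table' could be parsed into a bitmask. It is not taken from any posted patch: the names CLD_FLAG_LOG, CLD_FLAG_TABLE and parse_conflict_destinations() are hypothetical, and the choices to match case-insensitively and to reject empty elements are assumptions made only for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical destination flags; these names are not from the patch. */
#define CLD_FLAG_LOG	(1u << 0)
#define CLD_FLAG_TABLE	(1u << 1)

/* Compare a token of length len against a lower-case name, ignoring case. */
static int
token_equals(const char *tok, size_t len, const char *name)
{
	size_t		i;

	if (strlen(name) != len)
		return 0;
	for (i = 0; i < len; i++)
	{
		if (tolower((unsigned char) tok[i]) != name[i])
			return 0;
	}
	return 1;
}

/*
 * Parse a list such as "log, table" into a bitmask of CLD_FLAG_* values.
 * Returns 0 and sets *mask on success; returns -1 (leaving *mask alone)
 * if any element is empty or unknown.
 */
int
parse_conflict_destinations(const char *value, unsigned int *mask)
{
	unsigned int result = 0;
	const char *p = value;

	for (;;)
	{
		const char *start;
		size_t		len;

		/* Skip whitespace before the element. */
		while (*p && isspace((unsigned char) *p))
			p++;
		start = p;

		/* Advance to the next comma or the end of the string. */
		while (*p && *p != ',')
			p++;
		len = (size_t) (p - start);

		/* Trim whitespace after the element. */
		while (len > 0 && isspace((unsigned char) start[len - 1]))
			len--;

		if (token_equals(start, len, "log"))
			result |= CLD_FLAG_LOG;
		else if (token_equals(start, len, "table"))
			result |= CLD_FLAG_TABLE;
		else
			return -1;			/* empty or unknown element */

		if (*p == '\0')
			break;
		p++;					/* step over the comma */
	}

	*mask = result;
	return 0;
}
```

With a representation like this, adding a further destination later would only need one more flag and one more branch, which is the future-proofing argument made above; the cost is that the catalog value has to be parsed rather than stored as a single char.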
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-19T06:22:12Z
On Fri, Dec 19, 2025 at 11:44 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > I didn't mean to suggest we should allow "not logging anywhere". I
> > only wanted to ask how the user is expected to revert the conflict
> > logging back to the default after they had set it to something else.
>
> Okay understood, thanks for the clarification.
>
> > e.g.
> >
> > CREATE SUBSCRIPTION mysub2 ... WITH(conflict_log_destination=table)
> > Now, how to ALTER SUBSCRIPTION to revert that back to default?
> >
> > It seems there is no "reset to default" so is the user required to do
> > this explicitly?
> > ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination=log);
> >
> > Maybe that's fine --- I was just looking for some examples/clarification.
>
> Yeah this is the way, IMHO it looks fine to me.
>

How about considering log as default, so even if the user resets it via "ALTER SUBSCRIPTION mysub2 SET (conflict_log_destination='');", we send it to LOG as we are doing currently in HEAD? This means conflict_log_destination='' or conflict_log_destination='log' means the same.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-19T06:24:33Z
On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
> > 1. What should be the name of the option 'conflict_log_destination' vs
> > 'conflict_log_format'
>
> I prefer conflict_log_destination.
>
> > 2. Do we want to support multi destination then providing string like
> > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > would have to store as a string in catalog and parse it everytime we
> > insert conflicts or alter subscription OTOH currently I have just
> > support single option log/table/both which make things much easy
> > because then in catalog we can store as a single char field and don't
> > need any parsing. And since the input are taken as a string itself,
> > even if in future we want to support more options like 'log,table,..'
> > it would be backward compatible with old options.
>
> I feel, combination of options might be a good idea, similar to how
> 'log_destination' provides. But it can be done in future versions and
> the first draft can be a simple one.
>
> > 3. Do we want to support 'none' destinations? i.e. do not log to anywhere?
>
> IMO, conflict information is an important piece of information to
> diagnose data divergence and thus should be logged always.
>
> Let's wait for others' opinions.

Thanks, Shveta, for your opinion.

Here is what I propose, balancing simplicity against future extensibility:

1. Retain 'conflict_log_destination' as the option name.
2. The currently supported values are 'log', 'table', and 'all' (which directs output to both locations). We will not support comma-separated values in the first version.
3. By treating this as a string, we can eventually support comma-separated values like 'log, table, new_option'.

This approach keeps the design simple by avoiding the immediate need to parse comma-separated options, while preserving extensibility.

--
Regards,
Dilip Kumar
Google
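
As a rough illustration of the single-value variant proposed here, the sketch below maps the user-facing string to an internal enum and back through a label table. It is standalone C, not patch code: the type ConflictDest and the functions conflict_dest_from_string() and conflict_dest_to_string() are hypothetical names, and treating an empty (or missing) value as 'log' is an assumption that follows Amit's earlier suggestion rather than anything settled in this message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical internal representation of the option value. */
typedef enum ConflictDest
{
	CONFLICT_DEST_INVALID = 0,
	CONFLICT_DEST_LOG,			/* "log" (assumed default) */
	CONFLICT_DEST_TABLE,		/* "table" */
	CONFLICT_DEST_ALL			/* "all": both server log and table */
} ConflictDest;

/* Label table indexed by enum value; slot 0 (invalid) stays NULL. */
static const char *const conflict_dest_labels[] = {
	[CONFLICT_DEST_LOG] = "log",
	[CONFLICT_DEST_TABLE] = "table",
	[CONFLICT_DEST_ALL] = "all"
};

/*
 * Convert the option string to the enum. An empty or NULL value maps to
 * the default (log); anything unrecognized, including a comma-separated
 * list, maps to CONFLICT_DEST_INVALID so the caller can raise an error.
 */
ConflictDest
conflict_dest_from_string(const char *value)
{
	int			i;

	if (value == NULL || value[0] == '\0')
		return CONFLICT_DEST_LOG;

	for (i = CONFLICT_DEST_LOG; i <= CONFLICT_DEST_ALL; i++)
	{
		if (strcmp(value, conflict_dest_labels[i]) == 0)
			return (ConflictDest) i;
	}
	return CONFLICT_DEST_INVALID;
}

/* Convert the enum back to its label, or NULL for an invalid value. */
const char *
conflict_dest_to_string(ConflictDest dest)
{
	if (dest < CONFLICT_DEST_LOG || dest > CONFLICT_DEST_ALL)
		return NULL;
	return conflict_dest_labels[dest];
}
```

Because the stored form stays a string, a later version could accept 'log,table' in the same place without changing what existing subscriptions have recorded, which is the backward-compatibility point made in item 3.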
-
Re: Proposal: Conflict log history table for Logical Replication
Masahiko Sawada <sawada.mshk@gmail.com> — 2025-12-19T08:27:24Z
On Thu, Dec 18, 2025 at 10:24 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> > > 1. What should be the name of the option 'conflict_log_destination' vs
> > > 'conflict_log_format'
> >
> > I prefer conflict_log_destination.
> >
> > > 2. Do we want to support multi destination then providing string like
> > > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > > would have to store as a string in catalog and parse it everytime we
> > > insert conflicts or alter subscription OTOH currently I have just
> > > support single option log/table/both which make things much easy
> > > because then in catalog we can store as a single char field and don't
> > > need any parsing. And since the input are taken as a string itself,
> > > even if in future we want to support more options like 'log,table,..'
> > > it would be backward compatible with old options.
> >
> > I feel, combination of options might be a good idea, similar to how
> > 'log_destination' provides. But it can be done in future versions and
> > the first draft can be a simple one.
> >
> > > 3. Do we want to support 'none' destinations? i.e. do not log to anywhere?
> >
> > IMO, conflict information is an important piece of information to
> > diagnose data divergence and thus should be logged always.
> >
> > Let's wait for others' opinions.
>
> Thanks Shveta for you opinion,
>
> Here is what I propose considering balance between simplicity with
> future scalability:
>
> 1. Retain 'conflict_log_destination' as the option name.
> 2. Current supported values include 'log', 'table', or 'all' (which
> directs output to both locations). But we will not support comma
> separated values in the first version.

If users set conflict_log_destination='table', do we report nothing about the conflict to the server logs, while all other errors generated by apply workers still go to the server logs? Or do we write ERRORs without the conflict details while writing full conflict logs to the table?

If we go with the former idea, monitoring tools would not be able to catch ERROR logs. Users can set conflict_log_destination='all' in this case, but they might want to avoid bloating the server logs with the detailed conflict information. I wonder if there might be cases where monitoring tools want to detect at least the fact that errors occur in the system.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
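
The second alternative raised above (always emit an error line, but leave out the tuple details when the destination is table-only) can be sketched as follows. This is an illustrative standalone C fragment under stated assumptions, not PostgreSQL code: ReportDest and format_server_log_line() are hypothetical names, and the message wording is invented for the example.

```c
#include <stdio.h>

/* Hypothetical destinations, mirroring the 'log' / 'table' / 'all' values. */
typedef enum ReportDest
{
	REPORT_DEST_LOG,
	REPORT_DEST_TABLE,
	REPORT_DEST_ALL
} ReportDest;

/*
 * Build the line that would be written to the server log for a conflict.
 * For table-only logging the line still names the relation and conflict
 * type, so monitoring tools can see that a conflict happened, but the
 * tuple details are omitted. Returns the value returned by snprintf().
 */
int
format_server_log_line(ReportDest dest, const char *relname,
					   const char *conflict_type, const char *tuple_detail,
					   char *buf, size_t buflen)
{
	if (dest == REPORT_DEST_TABLE)
		return snprintf(buf, buflen,
						"conflict detected on relation \"%s\": conflict=%s "
						"(details stored in the conflict log table)",
						relname, conflict_type);

	/* 'log' and 'all': include the full detail in the server log. */
	return snprintf(buf, buflen,
					"conflict detected on relation \"%s\": conflict=%s; %s",
					relname, conflict_type, tuple_detail);
}
```

Under this reading the server log grows by one short line per conflict even with 'table', which keeps log-based alerting working without duplicating the tuple data.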
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-20T09:47:11Z
On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote: > > On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > I was considering the interdependence between the subscription and the > > > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > > > subscription as dependent on the CLT. This way, if someone attempts to > > > > > drop the CLT, the system would recognize the dependency of the > > > > > subscription and prevent the drop unless the subscription is removed > > > > > first or the CASCADE option is used. > > > > > > > > > > However, while investigating this, I encountered an error [1] stating > > > > > that global objects are not supported in this context. This indicates > > > > > that global objects cannot be made dependent on local objects. > > > > > > > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > > > objects. 
For example, consider following case: > > > > postgres=# create table t1(c1 int primary key); > > > > CREATE TABLE > > > > postgres=# \d+ t1 > > > > Table "public.t1" > > > > Column | Type | Collation | Nullable | Default | Storage | > > > > Compression | Stats target | Description > > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > > > c1 | integer | | not null | | plain | > > > > | | > > > > Indexes: > > > > "t1_pkey" PRIMARY KEY, btree (c1) > > > > Publications: > > > > "pub1" > > > > Not-null constraints: > > > > "t1_c1_not_null" NOT NULL "c1" > > > > Access method: heap > > > > postgres=# drop index t1_pkey; > > > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > > > t1 requires it > > > > HINT: You can drop constraint t1_pkey on table t1 instead. > > > > > > > > Here, the PK index is created as part for CREATE TABLE operation and > > > > pk_index is not allowed to be dropped independently. > > > > > > > > > Although making an object dependent on global/shared objects is > > > > > possible for certain types of shared objects [2], this is not our main > > > > > objective. > > > > > > > > > > > > > As per my understanding from the above example, we need something like > > > > that only for shared object subscription and (internally created) > > > > table. > > > > > > Yeah that seems to be exactly what we want, so I tried doing that by > > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > > > it is behaving as we want[2]. And while dropping the subscription or > > > altering CLT we can delete internal dependency so that CLT get dropped > > > automatically[3] > > > > > > I will send an updated patch after testing a few more scenarios and > > > fixing other pending issues. 
> > > > > > [1] > > > + ObjectAddressSet(myself, RelationRelationId, relid); > > > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > > > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > > > > > > > [2] > > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > > > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > > > because subscription sub requires it > > > HINT: You can drop subscription sub instead. > > > LOCATION: findDependentObjects, dependency.c:788 > > > postgres[670778]=# > > > > > > [3] > > > ObjectAddressSet(object, SubscriptionRelationId, subid); > > > performDeletion(&object, DROP_CASCADE > > > PERFORM_DELETION_INTERNAL | > > > PERFORM_DELETION_SKIP_ORIGINAL); > > > > > > > > > > Here is the patch which implements the dependency and fixes other > > comments from Shveta. > > Thanks for the changes, the new implementation based on dependency > creates a cycle while dumping: > ./pg_dump -d postgres -f dump1.txt -p 5433 > pg_dump: warning: could not resolve dependency loop among these items: > pg_dump: detail: TABLE conflict (ID 225 OID 16397) > pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396) > pg_dump: detail: POST-DATA BOUNDARY (ID 3491) > pg_dump: detail: TABLE DATA t1 (ID 3485 OID 16384) > pg_dump: detail: PRE-DATA BOUNDARY (ID 3490) > > This can be seen with a simple subscription with conflict_log_table. > This was working fine with the v11 version patch. The attached v13 patch includes the fix for this issue. In addition, it now raises an error when attempting to configure a conflict log table that belongs to a temporary schema or is not a permanent (persistent) relation. Regards, Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-20T11:20:53Z
On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote: > > > > On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > I was considering the interdependence between the subscription and the > > > > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > > > > subscription as dependent on the CLT. This way, if someone attempts to > > > > > > drop the CLT, the system would recognize the dependency of the > > > > > > subscription and prevent the drop unless the subscription is removed > > > > > > first or the CASCADE option is used. > > > > > > > > > > > > However, while investigating this, I encountered an error [1] stating > > > > > > that global objects are not supported in this context. This indicates > > > > > > that global objects cannot be made dependent on local objects. > > > > > > > > > > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > > > > objects. 
For example, consider following case: > > > > > postgres=# create table t1(c1 int primary key); > > > > > CREATE TABLE > > > > > postgres=# \d+ t1 > > > > > Table "public.t1" > > > > > Column | Type | Collation | Nullable | Default | Storage | > > > > > Compression | Stats target | Description > > > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > > > > c1 | integer | | not null | | plain | > > > > > | | > > > > > Indexes: > > > > > "t1_pkey" PRIMARY KEY, btree (c1) > > > > > Publications: > > > > > "pub1" > > > > > Not-null constraints: > > > > > "t1_c1_not_null" NOT NULL "c1" > > > > > Access method: heap > > > > > postgres=# drop index t1_pkey; > > > > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > > > > t1 requires it > > > > > HINT: You can drop constraint t1_pkey on table t1 instead. > > > > > > > > > > Here, the PK index is created as part for CREATE TABLE operation and > > > > > pk_index is not allowed to be dropped independently. > > > > > > > > > > > Although making an object dependent on global/shared objects is > > > > > > possible for certain types of shared objects [2], this is not our main > > > > > > objective. > > > > > > > > > > > > > > > > As per my understanding from the above example, we need something like > > > > > that only for shared object subscription and (internally created) > > > > > table. > > > > > > > > Yeah that seems to be exactly what we want, so I tried doing that by > > > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > > > > it is behaving as we want[2]. And while dropping the subscription or > > > > altering CLT we can delete internal dependency so that CLT get dropped > > > > automatically[3] > > > > > > > > I will send an updated patch after testing a few more scenarios and > > > > fixing other pending issues. 
> > > > > > > > [1] > > > > + ObjectAddressSet(myself, RelationRelationId, relid); > > > > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > > > > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > > > > > > > > > > [2] > > > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > > > > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > > > > because subscription sub requires it > > > > HINT: You can drop subscription sub instead. > > > > LOCATION: findDependentObjects, dependency.c:788 > > > > postgres[670778]=# > > > > > > > > [3] > > > > ObjectAddressSet(object, SubscriptionRelationId, subid); > > > > performDeletion(&object, DROP_CASCADE > > > > PERFORM_DELETION_INTERNAL | > > > > PERFORM_DELETION_SKIP_ORIGINAL); > > > > > > > > > > > > > > Here is the patch which implements the dependency and fixes other > > > comments from Shveta. > > > > Thanks for the changes, the new implementation based on dependency > > creates a cycle while dumping: > > ./pg_dump -d postgres -f dump1.txt -p 5433 > > pg_dump: warning: could not resolve dependency loop among these items: > > pg_dump: detail: TABLE conflict (ID 225 OID 16397) > > pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396) > > pg_dump: detail: POST-DATA BOUNDARY (ID 3491) > > pg_dump: detail: TABLE DATA t1 (ID 3485 OID 16384) > > pg_dump: detail: PRE-DATA BOUNDARY (ID 3490) > > > > This can be seen with a simple subscription with conflict_log_table. > > This was working fine with the v11 version patch. > > The attached v13 patch includes the fix for this issue. In addition, > it now raises an error when attempting to configure a conflict log > table that belongs to a temporary schema or is not a permanent > (persistent) relation. I have updated the patch and here are changes done 1. 
Split into 2 patches: 0001 for catalog-related changes and 0002 for inserting conflicts into the conflict table. Vignesh needs to rebase the dump- and upgrade-related patch on these latest changes.
2. Subscription option changed to conflict_log_destination=(log/table/all/'')
3. For internal processing we will use the ConflictLogDest enum, whereas for taking input or storing into the catalog we will use a string [1].
4. As suggested by Sawada San, if conflict_log_destination is 'table' we log the information about the conflict but don't log the tuple details [3].

Pending:
1. TAP test for conflict insertion.
2. Still need to work on the caching-related changes discussed at [2]. Currently we don't allow conflict log tables to be added to a publication at all; we might change this behavior as discussed at [2], and for that we will need to implement the caching.
3. Need to add conflict insertion test and doc changes.
4. Still need to check on the latest comments from Peter Smith.

[1]
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_INVALID = 0,
    CONFLICT_LOG_DEST_LOG,      /* "log" (default) */
    CONFLICT_LOG_DEST_TABLE,    /* "table" */
    CONFLICT_LOG_DEST_ALL       /* "all" */
} ConflictLogDest;

/*
 * Array mapping for converting internal enum to string.
 */
static const char *const ConflictLogDestLabels[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

[2] https://www.postgresql.org/message-id/CAA4eK1LNjWigHb5YKz2nBwcGQr18WnNZHv3Gyo8GNCshSkAb-A%40mail.gmail.com

[3]
/* Decide what detail to show in server logs. */
if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
{
    /* Standard reporting with full internal details. */
    ereport(elevel,
            errcode_apply_conflict(type),
            errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
                   get_namespace_name(RelationGetNamespace(localrel)),
                   RelationGetRelationName(localrel),
                   ConflictTypeNames[type]),
            errdetail_internal("%s", err_detail.data));
}
else
{
    /*
     * 'table' only: Report the error msg but omit raw tuple data from
     * server logs since it's already captured in the internal table.
     */
    ereport(elevel,
            errcode_apply_conflict(type),
            errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
                   get_namespace_name(RelationGetNamespace(localrel)),
                   RelationGetRelationName(localrel),
                   ConflictTypeNames[type]),
            errdetail("Conflict details logged to internal table with OID %u.",
                      MySubscription->conflictrelid));
}

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-21T15:46:33Z
On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]
>
> Pending:
> 2. Still need to work on caching related changes discussed at [2], so
> currently we don't allow conflict log tables to be added to
> publication at all and might change this behavior as discussed at [2]
> and for that we will need to implement the caching.

This point is addressed in the attached patch. A new shared index on pg_subscription (subconflictlogrelid) is introduced and used to efficiently determine whether a relation is a conflict log table, avoiding full catalog scans.

Additionally, a conflict log table can be explicitly added to a TABLE publication and will be published when specified directly. At the same time, such relations are excluded from implicit publication paths (FOR ALL TABLES and schema publications).

The patch also exposes pg_relation_is_conflict_log_table() as a SQL-visible helper, which is used by psql \d+ to filter out conflict log tables from implicit publication listings. This avoids querying pg_subscription directly, which is generally inaccessible to non-superusers.

These changes are included in v14-003. There are no changes in v14-001 and v14-002; those versions are identical to the patch previously shared by Dilip at [1].

[1] - https://www.postgresql.org/message-id/CAFiTN-sNg9ghLNkB2Kn0SwBGOub9acc99XZZU_d5NAcyW-yrEg%40mail.gmail.com

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-22T06:48:32Z
On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > I was considering the interdependence between the subscription and the > > > > > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > > > > > subscription as dependent on the CLT. This way, if someone attempts to > > > > > > > drop the CLT, the system would recognize the dependency of the > > > > > > > subscription and prevent the drop unless the subscription is removed > > > > > > > first or the CASCADE option is used. > > > > > > > > > > > > > > However, while investigating this, I encountered an error [1] stating > > > > > > > that global objects are not supported in this context. This indicates > > > > > > > that global objects cannot be made dependent on local objects. > > > > > > > > > > > > > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > > > > > objects. 
For example, consider following case: > > > > > > postgres=# create table t1(c1 int primary key); > > > > > > CREATE TABLE > > > > > > postgres=# \d+ t1 > > > > > > Table "public.t1" > > > > > > Column | Type | Collation | Nullable | Default | Storage | > > > > > > Compression | Stats target | Description > > > > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > > > > > c1 | integer | | not null | | plain | > > > > > > | | > > > > > > Indexes: > > > > > > "t1_pkey" PRIMARY KEY, btree (c1) > > > > > > Publications: > > > > > > "pub1" > > > > > > Not-null constraints: > > > > > > "t1_c1_not_null" NOT NULL "c1" > > > > > > Access method: heap > > > > > > postgres=# drop index t1_pkey; > > > > > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > > > > > t1 requires it > > > > > > HINT: You can drop constraint t1_pkey on table t1 instead. > > > > > > > > > > > > Here, the PK index is created as part for CREATE TABLE operation and > > > > > > pk_index is not allowed to be dropped independently. > > > > > > > > > > > > > Although making an object dependent on global/shared objects is > > > > > > > possible for certain types of shared objects [2], this is not our main > > > > > > > objective. > > > > > > > > > > > > > > > > > > > As per my understanding from the above example, we need something like > > > > > > that only for shared object subscription and (internally created) > > > > > > table. > > > > > > > > > > Yeah that seems to be exactly what we want, so I tried doing that by > > > > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > > > > > it is behaving as we want[2]. And while dropping the subscription or > > > > > altering CLT we can delete internal dependency so that CLT get dropped > > > > > automatically[3] > > > > > > > > > > I will send an updated patch after testing a few more scenarios and > > > > > fixing other pending issues. 
> > > > > > > > > > [1] > > > > > + ObjectAddressSet(myself, RelationRelationId, relid); > > > > > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > > > > > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > > > > > > > > > > > > > [2] > > > > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > > > > > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > > > > > because subscription sub requires it > > > > > HINT: You can drop subscription sub instead. > > > > > LOCATION: findDependentObjects, dependency.c:788 > > > > > postgres[670778]=# > > > > > > > > > > [3] > > > > > ObjectAddressSet(object, SubscriptionRelationId, subid); > > > > > performDeletion(&object, DROP_CASCADE > > > > > PERFORM_DELETION_INTERNAL | > > > > > PERFORM_DELETION_SKIP_ORIGINAL); > > > > > > > > > > > > > > > > > > Here is the patch which implements the dependency and fixes other > > > > comments from Shveta. > > > > > > Thanks for the changes, the new implementation based on dependency > > > creates a cycle while dumping: > > > ./pg_dump -d postgres -f dump1.txt -p 5433 > > > pg_dump: warning: could not resolve dependency loop among these items: > > > pg_dump: detail: TABLE conflict (ID 225 OID 16397) > > > pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396) > > > pg_dump: detail: POST-DATA BOUNDARY (ID 3491) > > > pg_dump: detail: TABLE DATA t1 (ID 3485 OID 16384) > > > pg_dump: detail: PRE-DATA BOUNDARY (ID 3490) > > > > > > This can be seen with a simple subscription with conflict_log_table. > > > This was working fine with the v11 version patch. > > > > The attached v13 patch includes the fix for this issue. In addition, > > it now raises an error when attempting to configure a conflict log > > table that belongs to a temporary schema or is not a permanent > > (persistent) relation. > > I have updated the patch and here are changes done > 1. 
Splitted into 2 patches, 0001- for catalog related changes > 0002-inserting conflict into the conflict table, Vignesh need to > rebase the dump and upgrade related patch on this latest changes > 2. Subscription option changed to conflict_log_destination=(log/table/all/'') > 3. For internal processing we will use ConflictLogDest enum whereas > for taking input or storing into catalog we will use string [1]. > 4. As suggested by Sawada San, if conflict_log_destination is 'table' > we log the information about conflict but don't log the tuple > details[3] > > Pending: > 1. tap test for conflict insertion > 2. Still need to work on caching related changes discussed at [2], so > currently we don't allow conflict log tables to be added to > publication at all and might change this behavior as discussed at [2] > and for that we will need to implement the caching. > 3. Need to add conflict insertion test and doc changes. > 4. Still need to check on the latest comments from Peter Smith. > > > [1] > typedef enum ConflictLogDest > { > CONFLICT_LOG_DEST_INVALID = 0, > CONFLICT_LOG_DEST_LOG, /* "log" (default) */ > CONFLICT_LOG_DEST_TABLE, /* "table" */ > CONFLICT_LOG_DEST_ALL /* "all" */ > } ConflictLogDest; Consider the following scenario. Initially, the subscription was configured with conflict_log_destination set to a table. 
As conflicts occurred, entries were generated and recorded in that table, for example:

postgres=# SELECT * FROM conflict_log_table_16399;
 relid | schemaname | relname | conflict_type | remote_xid | remote_commit_lsn |         remote_commit_ts         | remote_origin | replica_identity | remote_tuple | local_conflicts
-------+------------+---------+---------------+------------+-------------------+----------------------------------+---------------+------------------+--------------+--------------------------------------------------------------------------------------------------
 16384 | public     | t1      | insert_exists |        765 | 0/0178A718        | 2025-12-22 12:06:57.417789+05:30 | pg_16399      |                  | {"c1":1}     | {"{\"xid\":\"781\",\"commit_ts\":null,\"origin\":null,\"key\":{\"c1\":1},\"tuple\":{\"c1\":1}}"}
 16384 | public     | t1      | insert_exists |        765 | 0/0178A718        | 2025-12-22 12:06:57.417789+05:30 | pg_16399      |                  | {"c1":1}     | {"{\"xid\":\"781\",\"commit_ts\":null,\"origin\":null,\"key\":{\"c1\":1},\"tuple\":{\"c1\":1}}"}
(2 rows)

Subsequently, the conflict log destination was changed from table to log:

ALTER SUBSCRIPTION sub1 SET (conflict_log_destination = 'log');

As a result, the conflict log table is dropped, and there is no longer any way to access the previously recorded conflict entries. This effectively causes the loss of historical conflict data. It is unclear whether this behavior is desirable or expected. Should we consider a way to preserve the historical conflict data in this case?

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-22T09:39:38Z
On Sat, Dec 20, 2025 at 4:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I have updated the patch and here are changes done

Thank you for the patch. A few comments on 0001 alone:

1)
postgres=# create subscription sub1 connection ...' publication pub1 WITH(conflict_log_destination = 'table');
ERROR: could not generate conflict log table "conflict_log_table_16395"
DETAIL: Conflict log tables cannot be created in a temporary namespace.
HINT: Ensure your 'search_path' is set to permanent schema.

Based on such existing errors:
errmsg("cannot create relations in temporary schemas of other sessions")));
errmsg("cannot create temporary relation in non-temporary schema")));
errmsg("cannot create relations in temporary schemas of other sessions")));

Shall we tweak:
--temporary namespace --> temporary schema
--permanent --> non-temporary

2)
postgres=# drop schema shveta cascade;
NOTICE: drop cascades to subscription sub1
ERROR: global objects cannot be deleted by doDeletion

Is this expected? Is the user supposed to see this error?

3)
The ConflictLogDest enum starts from 0/INVALID while the mapping array ConflictLogDestLabels has values starting from index 1. The index 0 has no value. Thus IMO, wherever we access ConflictLogDestLabels, we should make a sanity check that the index accessed is not CONFLICT_LOG_DEST_INVALID, i.e. opts.logdest != CONFLICT_LOG_DEST_INVALID

4)
I find 'Labels' in ConflictLogDestLabels slightly odd. There could be other names for this variable such as ConflictLogDestValues, ConflictLogDestStrings or ConflictLogDestNames. See similar: ConflictTypeNames, SlotInvalidationCauses

5)
+ /*
+ * Strategy for logging replication conflicts:
+ * log - server log only,
+ * table - internal table only,
+ * all - both log and table.
+ */
+ text sublogdestination;

sublogdestination can be confused with the regular log_destination. Shall we rename it to subconflictlogdest?

6)
Should the \dRs+ command display the 'Conflict Log Table:' at the end? This would be similar to how \dRp+ shows 'Tables:', even though the relation IDs can already be obtained from pg_publication_rel. I think this would be a useful improvement.

7)
One observation, not sure if it needs any fix, please review and share thoughts.

--CLT created in default public schema present in search_path
create subscription sub1 connection '..' publication pub1 WITH(conflict_log_destination = 'table');

--Change search path
create schema sch1;
SET search_path=sch1, "$user";

After this, if I create a new sub with destination as 'table', the CLT is generated in sch1. But if I do below:

alter subscription sub1 set (conflict_log_destination='table');

It does not move the table to sch1. This is because conflict_log_destination is not changed; and as per current implementation, alter-sub becomes a no-op. But search_path is changed. So what should be the behaviour here?
--let the table be in the old schema, which is currently not in search_path (existing behaviour)?
--drop the table in the old schema and create a new one present in search_path?

I could not find a similar case in postgres to compare the behaviour.

If we do
alter subscription sub1 set (conflict_log_destination='log');
alter subscription sub1 set (conflict_log_destination='table');

Then it moves the table to a new schema, as internally setting the destination to 'log' drops the table.

thanks
Shveta
-
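Review comment 3 above can be illustrated with a minimal standalone sketch. The enum is the one posted in the patch; the layout of ConflictLogDestLabels (designated initializers starting at index 1) is an assumption inferred from the review comments, and conflict_log_dest_label() is a hypothetical accessor that does not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Enum as posted in the patch: 0 is reserved for "invalid". */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_INVALID = 0,
    CONFLICT_LOG_DEST_LOG,      /* "log" (default) */
    CONFLICT_LOG_DEST_TABLE,    /* "table" */
    CONFLICT_LOG_DEST_ALL       /* "all" */
} ConflictLogDest;

/*
 * Assumed layout of the label array: designated initializers starting at
 * index 1, so slot 0 (CONFLICT_LOG_DEST_INVALID) is implicitly NULL.
 */
static const char *const ConflictLogDestLabels[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

/*
 * Hypothetical accessor (not in the patch): refuse the INVALID slot and
 * anything past the end of the array before indexing it.
 */
static const char *
conflict_log_dest_label(ConflictLogDest dest)
{
    assert(dest != CONFLICT_LOG_DEST_INVALID);
    assert((size_t) dest <
           sizeof(ConflictLogDestLabels) / sizeof(ConflictLogDestLabels[0]));
    return ConflictLogDestLabels[dest];
}
```

Without such a check, passing the NULL stored in slot 0 on to a string-to-datum conversion would dereference a null pointer, which is presumably the hazard the comment is pointing at.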
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-22T10:24:57Z
On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have updated the patch and here are changes done > 1. Splitted into 2 patches, 0001- for catalog related changes > 0002-inserting conflict into the conflict table, Vignesh need to > rebase the dump and upgrade related patch on this latest changes > 2. Subscription option changed to conflict_log_destination=(log/table/all/'') > 3. For internal processing we will use ConflictLogDest enum whereas > for taking input or storing into catalog we will use string [1]. > 4. As suggested by Sawada San, if conflict_log_destination is 'table' > we log the information about conflict but don't log the tuple > details[3] Few comments: 1) when a conflict_log_destination is specified as log: create subscription sub1 connection 'dbname=postgres host=localhost port=5432' publication pub1 with ( conflict_log_destination='log'); postgres=# select subname, subconflictlogrelid,sublogdestination from pg_subscription where subname = 'sub4'; subname | subconflictlogrelid | sublogdestination ---------+---------------------+------------------- sub4 | 0 | log (1 row) Currently it displays as 0, instead we can show as NULL in this case 2) can we include displaying of conflict log table also in describe subscriptions: + /* Conflict log destination is supported in v19 and higher */ + if (pset.sversion >= 190000) + { + appendPQExpBuffer(&buf, + ", sublogdestination AS \"%s\"\n", + gettext_noop("Conflict log destination")); + } 3) Can we include pg_ in the conflict table to indicate it is an internally created table: +/* + * Format the standardized internal conflict log table name for a subscription + * + * Use the OID to prevent collisions during rename operations. + */ +void +GetConflictLogTableName(char *dest, Oid subid) +{ + snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid); +} 4) Can the table be deleted now with the dependency associated between the table and the subscription? 
+ conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); + + /* Conflict log table is dropped or not accessible. */ + if (conflictlogrel == NULL) + ereport(WARNING, + (errcode(ERRCODE_UNDEFINED_TABLE), + errmsg("conflict log table with OID %u does not exist", + conflictlogrelid))); + + return conflictlogrel; 5) Should this code be changed to just prepare the conflict log tuple here, validation and insertion can happen at start_apply if elevel >= ERROR to avoid ValidateConflictLogTable here as well as at start_apply function: + if (ValidateConflictLogTable(conflictlogrel)) + { + /* + * Prepare the conflict log tuple. If the error level is below + * ERROR, insert it immediately. Otherwise, defer the insertion to + * a new transaction after the current one aborts, ensuring the + * insertion of the log tuple is not rolled back. + */ + prepare_conflict_log_tuple(estate, + relinfo->ri_RelationDesc, + conflictlogrel, + type, + searchslot, + conflicttuples, + remoteslot); + if (elevel < ERROR) + InsertConflictLogTuple(conflictlogrel); + } + else + ereport(WARNING, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("conflict log table \"%s.%s\" structure changed, skipping insertion", + get_namespace_name(RelationGetNamespace(conflictlogrel)), + RelationGetRelationName(conflictlogrel))); to: prepare_conflict_log_tuple(estate, relinfo->ri_RelationDesc, conflictlogrel, type, searchslot, conflicttuples, remoteslot); if (elevel < ERROR) { if (ValidateConflictLogTable(conflictlogrel)) InsertConflictLogTuple(conflictlogrel); else ereport(WARNING, errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("conflict log table \"%s.%s\" structure changed, skipping insertion", get_namespace_name(RelationGetNamespace(conflictlogrel)), RelationGetRelationName(conflictlogrel))); } Regards, Vignesh -
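The control flow proposed in comment 5 (always prepare the tuple, but validate and insert only when the insertion actually happens) can be modelled with a small standalone sketch. Everything here is a toy stand-in: ToyConflictLog, toy_flush_pending() and toy_report_conflict() are hypothetical names, the table_valid flag stands in for ValidateConflictLogTable(), and the severity constants only preserve the ordering WARNING < ERROR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy severity levels; only their ordering matters here. */
enum { TOY_WARNING = 1, TOY_ERROR = 2 };

typedef struct ToyConflictLog
{
    bool        table_valid;    /* stands in for ValidateConflictLogTable() */
    const char *pending;        /* prepared tuple not yet inserted, or NULL */
    int         inserted;       /* rows written to the conflict log table */
    int         skipped;        /* rows dropped with a WARNING */
} ToyConflictLog;

/* Validate, then insert or skip the pending tuple, if there is one. */
static void
toy_flush_pending(ToyConflictLog *log)
{
    assert(log != NULL);
    if (log->pending == NULL)
        return;
    if (log->table_valid)
        log->inserted++;
    else
        log->skipped++;         /* "structure changed, skipping insertion" */
    log->pending = NULL;
}

/* Always prepare; insert immediately only when the conflict is not an ERROR. */
static void
toy_report_conflict(ToyConflictLog *log, int elevel, const char *tuple)
{
    assert(log != NULL);
    log->pending = tuple;
    if (elevel < TOY_ERROR)
        toy_flush_pending(log);
    /* else: the caller flushes from a new transaction after the abort */
}
```

In this model the deferred path reuses the same flush routine, so validation happens in exactly one place regardless of when the row is written.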
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-22T10:31:11Z
On Sat, Dec 20, 2025 at 4:50 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote:
> I have updated the patch and here are changes done
> 1. Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes
> 2. Subscription option changed to conflict_log_destination=(log/table/all/'')
> 3. For internal processing we will use ConflictLogDest enum whereas
> for taking input or storing into catalog we will use string [1].
> 4. As suggested by Sawada San, if conflict_log_destination is 'table'
> we log the information about conflict but don't log the tuple
> details[3]
>
> Pending:
> 1. tap test for conflict insertion

Done in V15

> 2. Still need to work on caching related changes discussed at [2], so
> currently we don't allow conflict log tables to be added to
> publication at all and might change this behavior as discussed at [2]
> and for that we will need to implement the caching.

Pending

> 3. Need to add conflict insertion test and doc changes.

Done

> 4. Still need to check on the latest comments from Peter Smith.

Done

While preparing to send the patch, I noticed some recent comments from Shveta and Vignesh, so I will analyze those in the next version. V15-0004 is Vignesh's patch, which is attached as it is; I am going to review it soon.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-22T10:34:05Z
On Mon, Dec 22, 2025 at 3:55 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Few comments:
> 1) when a conflict_log_destination is specified as log:
> create subscription sub1 connection 'dbname=postgres host=localhost
> port=5432' publication pub1 with ( conflict_log_destination='log');
> postgres=# select subname, subconflictlogrelid,sublogdestination from
> pg_subscription where subname = 'sub4';
> subname | subconflictlogrelid | sublogdestination
> ---------+---------------------+-------------------
> sub4 | 0 | log
> (1 row)
>
> Currently it displays as 0, instead we can show as NULL in this case

I also thought about it while reviewing, but I feel 0 makes more sense as it is 'relid'. This is how it is shown currently in other tables. See 'reltoastrelid':

postgres=# select relname, reltoastrelid from pg_class where relname='tab1';
 relname | reltoastrelid
---------+---------------
 tab1    |             0
(1 row)

>
> 3) Can we include pg_ in the conflict table to indicate it is an
> internally created table:
> +/*
> + * Format the standardized internal conflict log table name for a subscription
> + *
> + * Use the OID to prevent collisions during rename operations.
> + */
> +void
> +GetConflictLogTableName(char *dest, Oid subid)
> +{
> + snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid);
> +}
>

There is already a discussion about it in [1]

[1]: https://www.postgresql.org/message-id/CAA4eK1KE%3DtNHcN3Qp0FZVwDnt4rF2zwHy8NgAdG3oPqixdzOsA%40mail.gmail.com

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-22T15:41:03Z
On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote:

I think this needs more thought, others can be fixed.

> 2)
> postgres=# drop schema shveta cascade;
> NOTICE: drop cascades to subscription sub1
> ERROR: global objects cannot be deleted by doDeletion
>
> Is this expected? Is the user supposed to see this error?
>

See the code below. It says that if the object being dropped is the outermost object (i.e. we are dropping the table directly), then the DROP is disallowed because the object has an INTERNAL dependency on another object. OTOH, if the object is being dropped via a recursive drop (i.e. the table is being dropped while dropping the schema), then the object on which it has the INTERNAL dependency is also added to the deletion list and is later dropped via doDeletion. That is where we get the error, because a subscription is a global object.

I thought maybe we could handle an additional case: if the INTERNAL dependency is on a subscription, then disallow the drop irrespective of whether it is called directly or via a recursive drop. But then it would cause a problem even when we are trying to drop the table during subscription drop. We could handle that case as well via the 'flags' passed to findDependentObjects(), but it needs more investigation.

Seeing this complexity makes me wonder whether it is really worth maintaining this dependency. During subscription drop we have to call performDeletion externally anyway, because this dependency is local, so we are just disallowing the conflict table drop. However, ALTER TABLE is still allowed, so what are we really protecting by preventing the table drop? I think it could just be documented that if the user drops the table, conflicts will not be inserted anymore.

findDependentObjects()
{
...
switch (foundDep->deptype)
{
....
case DEPENDENCY_INTERNAL:
* 1. At the outermost recursion level, we must disallow the
* DROP. However, if the owning object is listed in
* pendingObjects, just release the caller's lock and return;
* we'll eventually complete the DROP when we reach that entry
* in the pending list.
}
}

[1]
postgres[1333899]=# select * from pg_depend where objid > 16410;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    1259 | 16420 |        0 |       2615 |    16410 |           0 | n
    1259 | 16420 |        0 |       6100 |    16419 |           0 | i
(4 rows)

16420 -> conflict_log_table_16419
16419 -> subscription
16410 -> schema s1

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-23T05:25:08Z
On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > I think this needs more thought, others can be fixed. > > > 2) > > postgres=# drop schema shveta cascade; > > NOTICE: drop cascades to subscription sub1 > > ERROR: global objects cannot be deleted by doDeletion > > > > Is this expected? Is the user supposed to see this error? > > > See below code, so this says if the object being dropped is the > outermost object (i.e. if we are dropping the table directly) then it > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > OTOH if the object is being dropped via recursive drop (i.e. the table > is being dropped while dropping the schema) then object on which it > has INTERNAL dependency will also be added to the deletion list and > later will be dropped via doDeletion and later we are getting error as > subscription is a global object. I thought maybe we can handle an > additional case that the INTERNAL DEPENDENCY, is on subscription the > disallow dropping it irrespective of whether it is being called > directly or via recursive drop but then it will give an issue even > when we are trying to drop table during subscription drop, we can make > handle this case as well via 'flags' passed in findDependentObjects() > but need more investigation. > > Seeing this complexity makes me think more on is it really worth it to > maintain this dependency? Because during subscription drop we anyway > have to call performDeletion externally because this dependency is > local so we are just disallowing the conflict table drop, however the > ALTER table is allowed so what we are really protecting by protecting > the table drop, I think it can be just documented that if user try to > drop the table then conflict will not be inserted anymore? > > findDependentObjects() > { > ... > switch (foundDep->deptype) > { > .... > case DEPENDENCY_INTERNAL: > * 1. 
At the outermost recursion level, we must disallow the > * DROP. However, if the owning object is listed in > * pendingObjects, just release the caller's lock and return; > * we'll eventually complete the DROP when we reach that entry > * in the pending list. > } > } > > [1] > postgres[1333899]=# select * from pg_depend where objid > 16410; > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > ---------+-------+----------+------------+----------+-------------+--------- > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > (4 rows) > > 16420 -> conflict_log_table_16419 > 16419 -> subscription > 16410 -> schema s1 > One approach could be to use something similar to PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive drops. The effect would be that 'DROP SCHEMA ... CASCADE' would proceed without error, i.e., it would drop the tables as well without including the subscription in the dependency list. But if we try to drop a table directly (e.g., DROP TABLE CLT), it will still result in: ERROR: cannot drop table because subscription sub1 requires it The behavior will resemble a dependency somewhere between type 'n' and type 'i'. That said, I’m not sure if this is worth the effort, even though it prevents direct drop of table, it still does not prevent table from being dropped as part of a schema drop. thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-23T06:11:24Z
On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> One approach could be to use something similar to
> PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive
> drops. The effect would be that 'DROP SCHEMA ... CASCADE' would
> proceed without error, i.e., it would drop the tables as well without
> including the subscription in the dependency list. But if we try to
> drop a table directly (e.g., DROP TABLE CLT), it will still result in:
> ERROR: cannot drop table because subscription sub1 requires it
>
> The behavior will resemble a dependency somewhere between type 'n' and
> type 'i'. That said, I’m not sure if this is worth the effort, even
> though it prevents direct drop of table, it still does not prevent
> table from being dropped as part of a schema drop.

Yeah, but that would be inconsistent behavior. Anyway, here is what I got with what I was proposing yesterday [1]. Basically, DROP SCHEMA and DROP TABLE now give the same behavior, as expected, and DROP SUBSCRIPTION internally drops the table, as we would want. That said, this needs more thought to see what else it might break.

postgres[1553010]=# CREATE SCHEMA s1;
postgres[1553010]=# SET search_path TO s1;
postgres[1553010]=# CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=5432' PUBLICATION pub WITH (conflict_log_destination = table);
postgres[1553010]=# \d
                  List of relations
 Schema |           Name           | Type  |    Owner
--------+--------------------------+-------+-------------
 s1     | conflict_log_table_16428 | table | dilipkumarb
(1 row)

postgres[1553010]=# DROP SCHEMA s1;
ERROR: 2BP01: cannot drop table conflict_log_table_16428 because subscription sub1 requires it
HINT: You can drop subscription sub1 instead.
LOCATION: findDependentObjects, dependency.c:843

postgres[1553010]=# DROP TABLE conflict_log_table_16428 ;
ERROR: 2BP01: cannot drop table conflict_log_table_16428 because subscription sub1 requires it
HINT: You can drop subscription sub1 instead.
LOCATION: findDependentObjects, dependency.c:843

postgres[1553010]=# DROP SUBSCRIPTION sub1;
NOTICE: 00000: dropped replication slot "pg_16428_sync_16385_7586930395971240479" on publisher
LOCATION: ReplicationSlotDropAtPubNode, subscriptioncmds.c:2469
NOTICE: 00000: dropped replication slot "sub1" on publisher
LOCATION: ReplicationSlotDropAtPubNode, subscriptioncmds.c:2469
DROP SUBSCRIPTION

[1]
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 7489bbd5fb3..14184d076d3 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -662,6 +662,11 @@ findDependentObjects(const ObjectAddress *object,
      * However, no inconsistency can result: since we're at outer
      * level, there is no object depending on this one.
      */
+    if (IsSharedRelation(otherObject.classId) && !(flags & PERFORM_DELETION_INTERNAL))
+    {
+        owningObject = otherObject;
+        break;
+    }
     if (stack == NULL)
     {
         if (pendingObjects &&

--
Regards,
Dilip Kumar
Google
-
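The outcomes in the session above can be summarised with a toy decision function. This is a deliberately simplified model and not the code in dependency.c: it ignores pendingObjects and extension dependencies, and the names ToyOutcome and toy_internal_dependency() are hypothetical. The "patched" argument switches on the extra check from the diff in [1], namely rejecting the drop whenever the owning object lives in a shared catalog and PERFORM_DELETION_INTERNAL is not set.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyOutcome
{
    TOY_REJECT_DROP,        /* "cannot drop X because Y requires it" */
    TOY_DROP_OWNER_TOO,     /* owning object is queued for deletion as well */
    TOY_CONTINUE            /* reached from the owning object; keep going */
} ToyOutcome;

/*
 * Decide what happens when the object being visited has an INTERNAL
 * dependency on an owning object.
 *
 *   patched         - apply the extra shared-object check from the diff
 *   owner_is_shared - owning object is in a shared catalog (a subscription)
 *   internal_flag   - PERFORM_DELETION_INTERNAL was passed by the caller
 *   outermost       - the visited object is the one named in the DROP
 *   from_owner      - we arrived here by recursing from the owning object
 */
static ToyOutcome
toy_internal_dependency(bool patched, bool owner_is_shared, bool internal_flag,
                        bool outermost, bool from_owner)
{
    if (patched && owner_is_shared && !internal_flag)
        return TOY_REJECT_DROP;
    if (outermost)
        return TOY_REJECT_DROP;
    if (from_owner)
        return TOY_CONTINUE;
    return TOY_DROP_OWNER_TOO;
}
```

Under this model, the unpatched DROP SCHEMA ... CASCADE case yields TOY_DROP_OWNER_TOO, which corresponds to the subscription being handed to doDeletion and the "global objects cannot be deleted" error reported earlier in the thread.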
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-23T06:15:48Z
On Sat, 20 Dec 2025 at 16:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Dec 20, 2025 at 3:17 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 16 Dec 2025 at 09:54, vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Sun, 14 Dec 2025 at 21:17, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Sun, Dec 14, 2025 at 3:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Fri, Dec 12, 2025 at 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > On Thu, Dec 11, 2025 at 7:49 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > I was considering the interdependence between the subscription and the > > > > > > > conflict log table (CLT). IMHO, it would be logical to establish the > > > > > > > subscription as dependent on the CLT. This way, if someone attempts to > > > > > > > drop the CLT, the system would recognize the dependency of the > > > > > > > subscription and prevent the drop unless the subscription is removed > > > > > > > first or the CASCADE option is used. > > > > > > > > > > > > > > However, while investigating this, I encountered an error [1] stating > > > > > > > that global objects are not supported in this context. This indicates > > > > > > > that global objects cannot be made dependent on local objects. > > > > > > > > > > > > > > > > > > > What we need here is an equivalent of DEPENDENCY_INTERNAL for database > > > > > > objects. 
For example, consider following case: > > > > > > postgres=# create table t1(c1 int primary key); > > > > > > CREATE TABLE > > > > > > postgres=# \d+ t1 > > > > > > Table "public.t1" > > > > > > Column | Type | Collation | Nullable | Default | Storage | > > > > > > Compression | Stats target | Description > > > > > > --------+---------+-----------+----------+---------+---------+-------------+--------------+------------- > > > > > > c1 | integer | | not null | | plain | > > > > > > | | > > > > > > Indexes: > > > > > > "t1_pkey" PRIMARY KEY, btree (c1) > > > > > > Publications: > > > > > > "pub1" > > > > > > Not-null constraints: > > > > > > "t1_c1_not_null" NOT NULL "c1" > > > > > > Access method: heap > > > > > > postgres=# drop index t1_pkey; > > > > > > ERROR: cannot drop index t1_pkey because constraint t1_pkey on table > > > > > > t1 requires it > > > > > > HINT: You can drop constraint t1_pkey on table t1 instead. > > > > > > > > > > > > Here, the PK index is created as part for CREATE TABLE operation and > > > > > > pk_index is not allowed to be dropped independently. > > > > > > > > > > > > > Although making an object dependent on global/shared objects is > > > > > > > possible for certain types of shared objects [2], this is not our main > > > > > > > objective. > > > > > > > > > > > > > > > > > > > As per my understanding from the above example, we need something like > > > > > > that only for shared object subscription and (internally created) > > > > > > table. > > > > > > > > > > Yeah that seems to be exactly what we want, so I tried doing that by > > > > > recording DEPENDENCY_INTERNAL dependency of CLT on subscription[1] and > > > > > it is behaving as we want[2]. And while dropping the subscription or > > > > > altering CLT we can delete internal dependency so that CLT get dropped > > > > > automatically[3] > > > > > > > > > > I will send an updated patch after testing a few more scenarios and > > > > > fixing other pending issues. 
> > > > > > > > > > [1] > > > > > + ObjectAddressSet(myself, RelationRelationId, relid); > > > > > + ObjectAddressSet(subaddr, SubscriptionRelationId, subid); > > > > > + recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL); > > > > > > > > > > > > > > > [2] > > > > > postgres[670778]=# DROP TABLE myschema.conflict_log_history2; > > > > > ERROR: 2BP01: cannot drop table myschema.conflict_log_history2 > > > > > because subscription sub requires it > > > > > HINT: You can drop subscription sub instead. > > > > > LOCATION: findDependentObjects, dependency.c:788 > > > > > postgres[670778]=# > > > > > > > > > > [3] > > > > > ObjectAddressSet(object, SubscriptionRelationId, subid); > > > > > performDeletion(&object, DROP_CASCADE > > > > > PERFORM_DELETION_INTERNAL | > > > > > PERFORM_DELETION_SKIP_ORIGINAL); > > > > > > > > > > > > > > > > > > Here is the patch which implements the dependency and fixes other > > > > comments from Shveta. > > > > > > Thanks for the changes, the new implementation based on dependency > > > creates a cycle while dumping: > > > ./pg_dump -d postgres -f dump1.txt -p 5433 > > > pg_dump: warning: could not resolve dependency loop among these items: > > > pg_dump: detail: TABLE conflict (ID 225 OID 16397) > > > pg_dump: detail: SUBSCRIPTION (ID 3484 OID 16396) > > > pg_dump: detail: POST-DATA BOUNDARY (ID 3491) > > > pg_dump: detail: TABLE DATA t1 (ID 3485 OID 16384) > > > pg_dump: detail: PRE-DATA BOUNDARY (ID 3490) > > > > > > This can be seen with a simple subscription with conflict_log_table. > > > This was working fine with the v11 version patch. > > > > The attached v13 patch includes the fix for this issue. In addition, > > it now raises an error when attempting to configure a conflict log > > table that belongs to a temporary schema or is not a permanent > > (persistent) relation. > > I have updated the patch and here are changes done > 1. 
Splitted into 2 patches, 0001- for catalog related changes
> 0002-inserting conflict into the conflict table, Vignesh need to
> rebase the dump and upgrade related patch on this latest changes

Here is a rebased version of the dump/upgrade patch based on the v15 version posted at [1].

After replacing conflict_log_table with conflict_log_destination, we don't specify a fully qualified table name directly. Instead, the conflict log behavior is controlled via conflict_log_destination (table, log, or all). Since pg_dump resets search_path, it must explicitly set the schema in which the conflict log table should be created or reused. To handle this, pg_dump temporarily sets and then restores search_path around the ALTER SUBSCRIPTION ... SET (conflict_log_destination ...) command, ensuring the conflict log table is resolved in the intended schema.

Additionally, in non-upgrade dump/restore scenarios the conflict log table is not dumped, because in that mode it does not make sense to link to the older conflict log table.

v15-0001 to v15-0004 are the same as the patches posted at [1]. The dump/upgrade changes are present in the v15-0005 patch.

[1] - https://www.postgresql.org/message-id/CAFiTN-uKn7mix8BkOOmJQ2cF5yKdfQUg2mX_w9vEC4787VZ_xQ%40mail.gmail.com

Regards,
Vignesh
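As a rough illustration of the set-then-restore idea described above, the following standalone sketch builds the SQL text that a dump could emit. It is not the code from the v15-0005 patch: toy_emit_conflict_log_dest() is a hypothetical helper, the exact statements are an assumption, and the identifiers are assumed to be already quoted (real pg_dump code would quote them itself). The restore step assumes the empty search_path that pg_dump establishes at the start of a dump.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the three statements into buf and return the
 * snprintf() result, so the caller can detect truncation.
 */
static int
toy_emit_conflict_log_dest(char *buf, size_t buflen, const char *schema,
                           const char *subname, const char *dest)
{
    assert(buf != NULL && schema != NULL && subname != NULL && dest != NULL);
    assert(strlen(schema) > 0);

    return snprintf(buf, buflen,
                    /* 1. point search_path at the intended schema */
                    "SET search_path = %s;\n"
                    /* 2. the table is created or reused in that schema */
                    "ALTER SUBSCRIPTION %s SET (conflict_log_destination = '%s');\n"
                    /* 3. go back to the empty search_path used by the dump */
                    "SELECT pg_catalog.set_config('search_path', '', false);\n",
                    schema, subname, dest);
}
```

The point of the ordering is that the ALTER SUBSCRIPTION command never names a schema itself, so the only way to steer where the conflict log table ends up is the search_path in effect when it runs.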
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-23T06:49:21Z
Hi Dilip. Here are some review comments after a first pass of patch v15-0001. ====== Commit Message 1. If user choose to log into the table the table will automatically created while creating the subscription with internal name i.e. conflict_log_table_$subid$. The table will be created in the current search path and table would be automatically dropped while dropping the subscription. English: /If user choose/ /the table the table/ /and table would/ ====== src/backend/commands/subscriptioncmds.c 2. +#define SUBOPT_CONFLICT_LOG_DESTINATION 0x00040000 For the values, you are using DEST instead of DESTINATION. You can do the same here to keep the macro name a bit shorter. ~~~ parse_subscription_options: 3. + dest = GetLogDestination(val); + + if (dest == CONFLICT_LOG_DEST_INVALID) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("unrecognized conflict_log_destination value: \"%s\"", val), + errhint("Valid values are \"log\", \"table\", and \"all\"."))); I don't think CONFLICT_LOG_DEST_INVALID should even exist as an enum value. Instead, the validation and the ereport(ERROR) should all be done within GetLogDestination function. So, it should only return valid values, else give an error. ~~~ CreateSubscription: 4. + /* Always set the destination, default will be log. */ + values[Anum_pg_subscription_sublogdestination - 1] = + CStringGetTextDatum(ConflictLogDestLabels[opts.logdest]); + + /* + * If the conflict log destination includes 'table', generate an internal + * name using the subscription OID and determine the target namespace based + * on the current search path. Store the namespace OID and the conflict log + * format in the pg_subscription catalog tuple., then physically create + * the table. + */ 4a. When referring to these parameter values, you should always consistently quote them. Currently, there is a mix of lots of formats. (e.g. log (unquoted), 'table' (single-quoted), "log" (double-quoted)). 
Pick one style, and make them all the same. Check for the same everywhere. ~ 4b. Typo "tuple.," ~~~ 5. + if (opts.logdest == CONFLICT_LOG_DEST_TABLE || + opts.logdest == CONFLICT_LOG_DEST_ALL) IIUC, you are effectively treating these parameter values like bits that can be OR-ed together. And if in the future a "list" is supported, then that's exactly what you will be doing. So, IMO, they should be defined that way. See a review comment later in this post. e.g. this condition would be written more like: if ((opts.logdest & CONFLICT_LOG_DEST_TABLE) != 0) or, using the macro if (IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE)) ~~~ AlterSubscription: 6. + if (opts.logdest != old_dest) + { + bool want_table = + (opts.logdest == CONFLICT_LOG_DEST_TABLE || + opts.logdest == CONFLICT_LOG_DEST_ALL); + bool has_oldtable = + (old_dest == CONFLICT_LOG_DEST_TABLE || + old_dest == CONFLICT_LOG_DEST_ALL); + This is more of the same kind of logic that convinces me the code should be using bitmasks. SUGGESTION bool want_table = IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE); bool has_oldtable = IsSet(olddest, CONFLICT_LOG_DEST_TABLE); ~~~ create_conflict_log_table: 7. +/* + * Create conflict log table. + * + * The subscription owner becomes the owner of this table and has all + * privileges on it. + */ +static Oid +create_conflict_log_table(Oid subid, char *subname, Oid namespaceId, + char *conflictrel) I felt something like 'relname' is a better name for the char * conflictrel param. It clearly is the name of the conflict relation because of the name of the function. ~~~ 8. + /* Add a comments for the conflict log table. */ + snprintf(comment, sizeof(comment), + "Conflict log table for subscription \"%s\"", subname); + CreateComments(relid, RelationRelationId, 0, comment); + 8a. typo /Add a comments/Add a comment/ ~ 8b. My (previous review) suggestion for adding a table comment/description made more sense when the CLT was some arbitrary name chosen by the user. 
But, now that the CLT is a name like "conflict_log_table_%u", the idea for a comment seems redundant. ~~~ 9. +/* + * Format the standardized internal conflict log table name for a subscription + * + * Use the OID to prevent collisions during rename operations. + */ +void +GetConflictLogTableName(char *dest, Oid subid) +{ + snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid); +} + 9a. To emphasise that this is an "internal" table, IMO there should be a "pg_" prefix for this table name. ~ 9b. Since it is internal anyway, why not make the tablename descriptive to clarify what that number means? e.g. "pg_conflict_log_table_for_subid_%u" BTW, since it is already a TABLE, then why is "table" even part of this name? Why not just "pg_conflict_log_for_subid_%u" ~~~ 10. +/* + * GetLogDestination + * + * Convert string to enum by comparing against standardized labels. + */ +ConflictLogDest +GetLogDestination(const char *dest) +{ + /* Empty string or NULL defaults to LOG. */ + if (dest == NULL || dest[0] == '\0') + return CONFLICT_LOG_DEST_LOG; + + for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++) + { + if (pg_strcasecmp(dest, ConflictLogDestLabels[i]) == 0) + return (ConflictLogDest) i; + } + + /* Unrecognized string. */ + return CONFLICT_LOG_DEST_INVALID; +} Mentioned previously: I think there should be no such thing as CONFLICT_LOG_DEST_INVALID. I also think this function should be responsible for the ereport(ERROR). ====== src/include/catalog/pg_subscription.h 11. + /* + * Strategy for logging replication conflicts: + * log - server log only, + * table - internal table only, + * all - both log and table. + */ + text sublogdestination; + SUGGEST 'subconflictlogdest' (see next review comment #12 for why) ~~~ 12. 
+ Oid conflictrelid; /* conflict log table Oid */ char *conninfo; /* Connection string to the publisher */ char *slotname; /* Name of the replication slot */ char *synccommit; /* Synchronous commit setting for worker */ List *publications; /* List of publication names to subscribe to */ char *origin; /* Only publish data originating from the * specified origin */ + char *logdestination; /* Conflict log destination */ } Subscription; These don't seem very good member names: Maybe 'conflictrelid' -> 'conflictlogrelid' (because it's rel of the log; not the conflict) Maybe 'logdestination' -> 'conflictlogdest' (because in future there might be other kinds of subscription logs) ====== src/include/replication/conflict.h 13. +typedef enum ConflictLogDest +{ + CONFLICT_LOG_DEST_INVALID = 0, + CONFLICT_LOG_DEST_LOG, /* "log" (default) */ + CONFLICT_LOG_DEST_TABLE, /* "table" */ + CONFLICT_LOG_DEST_ALL /* "all" */ +} ConflictLogDest; + I didn't like this enum much. Suggest removing CONFLICT_LOG_DEST_INVALID. And use bits for the other values. And you can still have a default enum if you want. SUGGESTION typedef enum ConflictLogDest { CONFLICT_LOG_DEST_LOG = 0x001, CONFLICT_LOG_DEST_TABLE = 0x010, CONFLICT_LOG_DEST_DEFAULT = CONFLICT_LOG_DEST_LOG, CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE, } ConflictLogDest; BTW, there are only a few values that the array won't exceed length 0x11, so I guess you can still keep your same designated initialiser for the dest labels. ====== Kind Regards, Peter Smith. Fujitsu Australia -
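The bit-flag layout suggested in review comments 5, 6 and 13 above can be sketched as a stand-alone C fragment. This is only an illustration under stated assumptions: the enum values are copied from the SUGGESTION in comment 13, the IsSet macro is written in the same shape as the one the review refers to, and ConflictTableAction / conflict_table_action() are hypothetical names invented here to show how the want_table / has_oldtable tests from comment 6 might drive an ALTER SUBSCRIPTION transition. None of it is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the SUGGESTION in review comment 13. */
typedef enum ConflictLogDest
{
	CONFLICT_LOG_DEST_LOG = 0x001,
	CONFLICT_LOG_DEST_TABLE = 0x010,
	CONFLICT_LOG_DEST_DEFAULT = CONFLICT_LOG_DEST_LOG,
	CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE
} ConflictLogDest;

/* True when every bit in 'bits' is set in 'val'. */
#define IsSet(val, bits)  (((val) & (bits)) == (bits))

/* Hypothetical: what to do with the conflict log table on ALTER. */
typedef enum ConflictTableAction
{
	CLT_NO_CHANGE,
	CLT_CREATE,
	CLT_DROP
} ConflictTableAction;

/* Hypothetical helper: compare old and new destination masks. */
ConflictTableAction
conflict_table_action(int old_dest, int new_dest)
{
	bool		want_table = IsSet(new_dest, CONFLICT_LOG_DEST_TABLE);
	bool		has_oldtable = IsSet(old_dest, CONFLICT_LOG_DEST_TABLE);

	/* Only the known destination bits may be present. */
	assert((old_dest & ~CONFLICT_LOG_DEST_ALL) == 0);
	assert((new_dest & ~CONFLICT_LOG_DEST_ALL) == 0);

	if (want_table && !has_oldtable)
		return CLT_CREATE;
	if (!want_table && has_oldtable)
		return CLT_DROP;
	return CLT_NO_CHANGE;
}
```

With flags, the "table" test is a single mask check that keeps working if more destinations are added later, whereas the equality chains in the patch would need a new comparison for every combination.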
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-23T10:03:56Z
On Mon, Dec 22, 2025 at 4:01 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Done in V15

Thanks for the patches. A few comments on v15-002 for the part I have reviewed so far:

1) Defined twice:

+#define MAX_LOCAL_CONFLICT_INFO_ATTRS 5

+#define MAX_LOCAL_CONFLICT_INFO_ATTRS \
+    (sizeof(LocalConflictSchema) / sizeof(LocalConflictSchema[0]))

2) GetConflictLogTableInfo:

+ *log_dest = GetLogDestination(MySubscription->logdestination);
+ conflictlogrelid = MySubscription->conflictrelid;
+
+ /* If destination is 'log' only, no table to open. */
+ if (*log_dest == CONFLICT_LOG_DEST_LOG)
+     return NULL;

We can get conflictlogrelid after the if-check for DEST_LOG.

3) In ReportApplyConflict(), we form err_detail by calling errdetail_apply_conflict(). But when dest is TABLE, we don't use err_detail. Shall we skip creating it for dest=TABLE case?

4) ReportApplyConflict():

+ /*
+  * Get both the conflict log destination and the opened conflict log
+  * relation for insertion.
+  */
+ conflictlogrel = GetConflictLogTableInfo(&dest);
+

We can move it after errdetail_apply_conflict(), closer to where we actually use it.

5) start_apply:

+ /* Open conflict log table and insert the tuple. */
+ conflictlogrel = GetConflictLogTableInfo(&dest);
+ if (ValidateConflictLogTable(conflictlogrel))
+     InsertConflictLogTuple(conflictlogrel);

We can have Assert here too before we call Validate:

Assert(dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL);

6) start_apply:

+ if (ValidateConflictLogTable(conflictlogrel))
+     InsertConflictLogTuple(conflictlogrel);
+ MyLogicalRepWorker->conflict_log_tuple = NULL;

InsertConflictLogTuple() already sets conflict_log_tuple to NULL. Above is not needed.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2025-12-23T11:48:34Z
On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > I think this needs more thought, others can be fixed. > > > > > 2) > > > postgres=# drop schema shveta cascade; > > > NOTICE: drop cascades to subscription sub1 > > > ERROR: global objects cannot be deleted by doDeletion > > > > > > Is this expected? Is the user supposed to see this error? > > > > > See below code, so this says if the object being dropped is the > > outermost object (i.e. if we are dropping the table directly) then it > > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > > OTOH if the object is being dropped via recursive drop (i.e. the table > > is being dropped while dropping the schema) then object on which it > > has INTERNAL dependency will also be added to the deletion list and > > later will be dropped via doDeletion and later we are getting error as > > subscription is a global object. I thought maybe we can handle an > > additional case that the INTERNAL DEPENDENCY, is on subscription the > > disallow dropping it irrespective of whether it is being called > > directly or via recursive drop but then it will give an issue even > > when we are trying to drop table during subscription drop, we can make > > handle this case as well via 'flags' passed in findDependentObjects() > > but need more investigation. > > > > Seeing this complexity makes me think more on is it really worth it to > > maintain this dependency? 
Because during subscription drop we anyway > > have to call performDeletion externally because this dependency is > > local so we are just disallowing the conflict table drop, however the > > ALTER table is allowed so what we are really protecting by protecting > > the table drop, I think it can be just documented that if user try to > > drop the table then conflict will not be inserted anymore? > > > > findDependentObjects() > > { > > ... > > switch (foundDep->deptype) > > { > > .... > > case DEPENDENCY_INTERNAL: > > * 1. At the outermost recursion level, we must disallow the > > * DROP. However, if the owning object is listed in > > * pendingObjects, just release the caller's lock and return; > > * we'll eventually complete the DROP when we reach that entry > > * in the pending list. > > } > > } > > > > [1] > > postgres[1333899]=# select * from pg_depend where objid > 16410; > > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > > ---------+-------+----------+------------+----------+-------------+--------- > > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > > (4 rows) > > > > 16420 -> conflict_log_table_16419 > > 16419 -> subscription > > 16410 -> schema s1 > > > > One approach could be to use something similar to > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would > proceed without error, i.e., it would drop the tables as well without > including the subscription in the dependency list. But if we try to > drop a table directly (e.g., DROP TABLE CLT), it will still result in: > ERROR: cannot drop table because subscription sub1 requires it > I think this way of allowing dropping the conflict table without caring for the parent object (subscription) is not a good idea. How about creating a dedicated schema, say pg_conflict for the purpose of storing conflict tables? 
This will be similar to the pg_toast schema for toast tables. So, like pg_toast, each database will have a pg_conflict schema. It prevents the "orphan" problem where a user accidentally drops the logging schema but the Subscription is still trying to write to it. pg_dump needs to ignore all system schemas EXCEPT pg_conflict. This ensures the history is preserved during migrations while still protecting the tables from accidental user deletion. About permissions, I think we need to set the schema permissions so that USAGE is public (so users can SELECT from their logs) but CREATE is restricted to the superuser/subscription owner. We may need to think some more about permissions.

I also tried to reason out if we can allow storing the conflict table in pg_catalog but here are a few reasons why it won't be a good idea. I think by default, pg_dump completely ignores the pg_catalog schema. It assumes pg_catalog contains static system definitions (like pg_class, pg_proc, etc.) that are re-generated by the initdb process, not user data. If we place a conflict table in pg_catalog, it will not be backed up. If a user runs pg_dump/pg_dumpall to migrate to a new server, their subscription definition will survive, but their entire history of conflict logs will vanish. Also from the permissions angle, if a user wants to write a custom PL/pgSQL function to "retry" conflicts, they might need to DELETE rows from the conflict table after fixing them. Granting DELETE permissions on a table inside pg_catalog is non-standard and often frowned upon by security auditors. It blurs the line between "System Internals" (immutable) and "User Data" (mutable).

So, in short a separate pg_conflict schema appears to be a better solution. Thoughts?

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-23T12:22:14Z
On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > I think this needs more thought, others can be fixed. > > > > > > > 2) > > > > postgres=# drop schema shveta cascade; > > > > NOTICE: drop cascades to subscription sub1 > > > > ERROR: global objects cannot be deleted by doDeletion > > > > > > > > Is this expected? Is the user supposed to see this error? > > > > > > > See below code, so this says if the object being dropped is the > > > outermost object (i.e. if we are dropping the table directly) then it > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > > > OTOH if the object is being dropped via recursive drop (i.e. the table > > > is being dropped while dropping the schema) then object on which it > > > has INTERNAL dependency will also be added to the deletion list and > > > later will be dropped via doDeletion and later we are getting error as > > > subscription is a global object. I thought maybe we can handle an > > > additional case that the INTERNAL DEPENDENCY, is on subscription the > > > disallow dropping it irrespective of whether it is being called > > > directly or via recursive drop but then it will give an issue even > > > when we are trying to drop table during subscription drop, we can make > > > handle this case as well via 'flags' passed in findDependentObjects() > > > but need more investigation. > > > > > > Seeing this complexity makes me think more on is it really worth it to > > > maintain this dependency? 
Because during subscription drop we anyway > > > have to call performDeletion externally because this dependency is > > > local so we are just disallowing the conflict table drop, however the > > > ALTER table is allowed so what we are really protecting by protecting > > > the table drop, I think it can be just documented that if user try to > > > drop the table then conflict will not be inserted anymore? > > > > > > findDependentObjects() > > > { > > > ... > > > switch (foundDep->deptype) > > > { > > > .... > > > case DEPENDENCY_INTERNAL: > > > * 1. At the outermost recursion level, we must disallow the > > > * DROP. However, if the owning object is listed in > > > * pendingObjects, just release the caller's lock and return; > > > * we'll eventually complete the DROP when we reach that entry > > > * in the pending list. > > > } > > > } > > > > > > [1] > > > postgres[1333899]=# select * from pg_depend where objid > 16410; > > > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > > > ---------+-------+----------+------------+----------+-------------+--------- > > > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > > > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > > > (4 rows) > > > > > > 16420 -> conflict_log_table_16419 > > > 16419 -> subscription > > > 16410 -> schema s1 > > > > > > > One approach could be to use something similar to > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would > > proceed without error, i.e., it would drop the tables as well without > > including the subscription in the dependency list. But if we try to > > drop a table directly (e.g., DROP TABLE CLT), it will still result in: > > ERROR: cannot drop table because subscription sub1 requires it > > > > I think this way of allowing dropping the conflict table without > caring for the parent object (subscription) is not a good idea. 
How > about creating a dedicated schema, say pg_conflict for the purpose of > storing conflict tables? This will be similar to the pg_toast schema > for toast tables. So, similar to that each database will have a > pg_conflict schema. It prevents the "orphan" problem where a user > accidentally drops the logging schema but the Subscription is still > trying to write to it. pg_dump needs to ignore all system schemas > EXCEPT pg_conflict. This ensures the history is preserved during > migrations while still protecting the tables from accidental user > deletion. About permissions, I think we need to set the schema > permissions so that USAGE is public (so users can SELECT from their > logs) but CREATE is restricted to the superuser/subscription owner. We > may need to think some more about permissions. > > I also tried to reason out if we can allow storing the conflict table > in pg_catalog but here are a few reasons why it won't be a good idea. > I think by default, pg_dump completely ignores the pg_catalog schema. > It assumes pg_catalog contains static system definitions (like > pg_class, pg_proc, etc.) that are re-generated by the initdb process, > not user data. If we place a conflict table in pg_catalog, it will not > be backed up. If a user runs pg_dump/all to migrate to a new server, > their subscription definition will survive, but their entire history > of conflict logs will vanish. Also from the permissions angle, If a > user wants to write a custom PL/pgSQL function to "retry" conflicts, > they might need to DELETE rows from the conflict table after fixing > them. Granting DELETE permissions on a table inside pg_catalog is > non-standard and often frowned upon by security auditors. It blurs the > line between "System Internals" (immutable) and "User Data" (mutable). > So, in short a separate pg_conflict schema appears to be a better solution. Yeah that makes sense. 
Although I haven't thought through all the cases where it could be a problem, in the meantime I tried prototyping this and it behaves the way we want.

postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ;
 relid | schemaname | relname |     conflict_type     | remote_xid | remote_commit_lsn |       remote_commit_ts        | remote_origin | replica_identity |  remote_tuple  | local_conflicts
-------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+----------------+------------------------------------------------------------------------------------------------------------------------------------
 16385 | public     | test    | update_origin_differs |        761 | 0/01760BD8        | 2025-12-23 11:08:30.583816+00 | pg_16406      | {"a":1}          | {"a":1,"b":20} | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"}
(1 row)

-- Case1: Alter is not allowed
postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406 ADD COLUMN a int;
ERROR: 42501: permission denied: "conflict_log_table_16406" is a system catalog
LOCATION: RangeVarCallbackForAlterRelation, tablecmds.c:19634

-- Case2: drop is not allowed
postgres[1651968]=# drop table pg_conflict.conflict_log_table_16406;
ERROR: 42501: permission denied: "conflict_log_table_16406" is a system catalog
LOCATION: RangeVarCallbackForDropRelation, tablecmds.c:1803

-- Case3: Drop subscription drops it internally
postgres[1651968]=# DROP SUBSCRIPTION sub ;
NOTICE: 00000: dropped replication slot "sub" on publisher
LOCATION: ReplicationSlotDropAtPubNode, subscriptioncmds.c:2470
DROP SUBSCRIPTION
postgres[1651968]=# \d pg_conflict.conflict_log_table_16406
Did not find any relation named "pg_conflict.conflict_log_table_16406".

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-24T07:41:57Z
On Tue, Dec 23, 2025 at 5:49 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Dilip. > > Here are some review comments after a first pass of patch v15-0001. > And, some more review comments for patch v15-0001. ====== src/backend/catalog/pg_subscription.c 1. + /* Always set the destination, default will be log. */ + values[Anum_pg_subscription_sublogdestination - 1] = + CStringGetTextDatum(ConflictLogDestLabels[opts.logdest]); + None of the other values[] assignments here have a comment talking about defaults, etc, so I don't think this needs one either. ====== src/backend/commands/subscriptioncmds.c CreateSubscription: 2. + { + char conflict_table_name[NAMEDATALEN]; + Oid namespaceId, logrelid; In similar code in AlterSubscription, this was just called 'relname'. Better to be consistent where possible. I think 'relname' would be fine here too. ~~~ 3. + else + { + /* Destination is "log"; no table is needed. */ + values[Anum_pg_subscription_subconflictlogrelid - 1] = + ObjectIdGetDatum(InvalidOid); + } I think it's better to say this using coded Asserts instead of just assertions in comments. e.g. /* There is no conflict log table */ Assert(opts.logdest == CONFLICT_LOG_DEST_LOG) values[...] = ObjectIdGetDatum(InvalidOid); ~~~ 4. + if (isTempNamespace(namespaceId)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not generate conflict log table \"%s\"", + conflictrel), + errdetail("Conflict log tables cannot be created in a temporary namespace."), + errhint("Ensure your 'search_path' is set to permanent schema."))); + + /* Report an error if the specified conflict log table already exists. 
*/ + if (OidIsValid(get_relname_relid(conflictrel, namespaceId))) + ereport(ERROR, + (errcode(ERRCODE_DUPLICATE_TABLE), + errmsg("could not generate conflict log table \"%s.%s\"", + get_namespace_name(namespaceId), conflictrel), + errdetail("A table with the internally generated name already exists."), + errhint("Drop the existing table or change your 'search_path' to use a different schema."))); I'm not sure about these messages: 4a. "could not generate conflict log table". - Why say "generate"? - We don't need to say "conflict log table" -- that's already in the detail SUGGESTION (something like) "could not create relation \"%s\"" ~ 4b. For the 2nd error, I think errmsg should look like below, same as any other duplicate table error. "relation \"%s.%s\" already exists" ~ 4c. + errdetail("A table with the internally generated name already exists."), I don't think this errdetail added anything useful. It already exists -- that's all you need to know. Why does it matter that the name was generated automatically? ~~~ GetLogDestination: 5. + for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++) + { + if (pg_strcasecmp(dest, ConflictLogDestLabels[i]) == 0) + return (ConflictLogDest) i; + } + + /* Unrecognized string. */ + return CONFLICT_LOG_DEST_INVALID; This code is making rash assumptions about the enums values being the same as ordinals. IMO it should be written like: if (strcmp(dest, "log") == 0) return CONFLICT_LOG_DEST_LOG; if (strcmp(dest, "table") == 0) return CONFLICT_LOG_DEST_TABLE; if (strcmp(dest, "all") == 0) return CONFLICT_LOG_DEST_ALL; /* Unrecognized dest. */ ereport(ERROR, ...); ~~~ IsConflictLogTable 6. 
+bool +IsConflictLogTable(Oid relid) +{ + Relation rel; If you enforce (as I've suggested elsewhere previously) a name convention that the CLT must have "pg_" prefix, then perhaps you can exit early from this function without having to scan all the OIDs, just by checking first that the RelationGetRelationName(rel) must start with "pg_". ====== src/test/regress/sql/subscription.sql 7. +-- fail - unrecognized format value /format/parameter/ ~~ 8. Some of these tests are grouped together like "ALTER: State transitions" and "Ensure drop table is not allowed, and DROP SUBSCRIPTION reaps the table" etc. These group boundaries should be identified more clearly with more substantial comments. e.g #-- ================================== #-- ALTER - state transition tests #-- ================================== ~~~ 9. The "pg_relation_is_publishable" seems misplaced because it is buried among the drop/reap tests. Maybe it should come before all that. ====== src/tools/pgindent/typedefs.list 10. What about "typedef enum ConflictLogDest" ====== Kind Regards, Peter Smith. Fujitsu Australia -
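The GetLogDestination rewrite proposed in comment 5 above (explicit string comparisons, no CONFLICT_LOG_DEST_INVALID value) can be sketched outside the backend as follows. This is a minimal illustration, not the patch: the function names try_parse_conflict_log_dest() and parse_conflict_log_dest() are hypothetical, fprintf() plus exit() stands in for ereport(ERROR), the enum values are taken from the earlier bitmask suggestion, and the NULL/empty-string default to "log" is carried over from the quoted patch code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum ConflictLogDest
{
	CONFLICT_LOG_DEST_LOG = 0x001,
	CONFLICT_LOG_DEST_TABLE = 0x010,
	CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE
} ConflictLogDest;

/*
 * Hypothetical helper: map a label to a destination with explicit
 * comparisons, so nothing depends on the enum's numeric values.
 * Returns false for an unrecognized label and leaves *out untouched.
 */
bool
try_parse_conflict_log_dest(const char *label, ConflictLogDest *out)
{
	assert(out != NULL);

	/* Empty string or NULL defaults to "log", as in the quoted patch. */
	if (label == NULL || label[0] == '\0')
		*out = CONFLICT_LOG_DEST_LOG;
	else if (strcmp(label, "log") == 0)
		*out = CONFLICT_LOG_DEST_LOG;
	else if (strcmp(label, "table") == 0)
		*out = CONFLICT_LOG_DEST_TABLE;
	else if (strcmp(label, "all") == 0)
		*out = CONFLICT_LOG_DEST_ALL;
	else
		return false;

	return true;
}

/*
 * Hypothetical wrapper: only ever returns a valid destination.
 * fprintf + exit is a stand-in for ereport(ERROR) in the backend.
 */
ConflictLogDest
parse_conflict_log_dest(const char *label)
{
	ConflictLogDest dest;

	if (!try_parse_conflict_log_dest(label, &dest))
	{
		fprintf(stderr,
				"unrecognized conflict_log_destination value: \"%s\"\n",
				label);
		fprintf(stderr,
				"Valid values are \"log\", \"table\", and \"all\".\n");
		exit(EXIT_FAILURE);
	}
	return dest;
}
```

Note that strcmp() makes the match case-sensitive, as written in the review suggestion, while the quoted patch code uses pg_strcasecmp(); which behaviour is wanted is a separate decision.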
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-24T10:32:15Z
On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > I think this needs more thought, others can be fixed. > > > > > > > > > 2) > > > > > postgres=# drop schema shveta cascade; > > > > > NOTICE: drop cascades to subscription sub1 > > > > > ERROR: global objects cannot be deleted by doDeletion > > > > > > > > > > Is this expected? Is the user supposed to see this error? > > > > > > > > > See below code, so this says if the object being dropped is the > > > > outermost object (i.e. if we are dropping the table directly) then it > > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > > > > OTOH if the object is being dropped via recursive drop (i.e. the table > > > > is being dropped while dropping the schema) then object on which it > > > > has INTERNAL dependency will also be added to the deletion list and > > > > later will be dropped via doDeletion and later we are getting error as > > > > subscription is a global object. I thought maybe we can handle an > > > > additional case that the INTERNAL DEPENDENCY, is on subscription the > > > > disallow dropping it irrespective of whether it is being called > > > > directly or via recursive drop but then it will give an issue even > > > > when we are trying to drop table during subscription drop, we can make > > > > handle this case as well via 'flags' passed in findDependentObjects() > > > > but need more investigation. > > > > > > > > Seeing this complexity makes me think more on is it really worth it to > > > > maintain this dependency? 
Because during subscription drop we anyway > > > > have to call performDeletion externally because this dependency is > > > > local so we are just disallowing the conflict table drop, however the > > > > ALTER table is allowed so what we are really protecting by protecting > > > > the table drop, I think it can be just documented that if user try to > > > > drop the table then conflict will not be inserted anymore? > > > > > > > > findDependentObjects() > > > > { > > > > ... > > > > switch (foundDep->deptype) > > > > { > > > > .... > > > > case DEPENDENCY_INTERNAL: > > > > * 1. At the outermost recursion level, we must disallow the > > > > * DROP. However, if the owning object is listed in > > > > * pendingObjects, just release the caller's lock and return; > > > > * we'll eventually complete the DROP when we reach that entry > > > > * in the pending list. > > > > } > > > > } > > > > > > > > [1] > > > > postgres[1333899]=# select * from pg_depend where objid > 16410; > > > > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > > > > ---------+-------+----------+------------+----------+-------------+--------- > > > > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > > > > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > > > > (4 rows) > > > > > > > > 16420 -> conflict_log_table_16419 > > > > 16419 -> subscription > > > > 16410 -> schema s1 > > > > > > > > > > One approach could be to use something similar to > > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive > > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would > > > proceed without error, i.e., it would drop the tables as well without > > > including the subscription in the dependency list. 
But if we try to > > > drop a table directly (e.g., DROP TABLE CLT), it will still result in: > > > ERROR: cannot drop table because subscription sub1 requires it > > > > > > > I think this way of allowing dropping the conflict table without > > caring for the parent object (subscription) is not a good idea. How > > about creating a dedicated schema, say pg_conflict for the purpose of > > storing conflict tables? This will be similar to the pg_toast schema > > for toast tables. So, similar to that each database will have a > > pg_conflict schema. It prevents the "orphan" problem where a user > > accidentally drops the logging schema but the Subscription is still > > trying to write to it. pg_dump needs to ignore all system schemas > > EXCEPT pg_conflict. This ensures the history is preserved during > > migrations while still protecting the tables from accidental user > > deletion. About permissions, I think we need to set the schema > > permissions so that USAGE is public (so users can SELECT from their > > logs) but CREATE is restricted to the superuser/subscription owner. We > > may need to think some more about permissions. > > > > I also tried to reason out if we can allow storing the conflict table > > in pg_catalog but here are a few reasons why it won't be a good idea. > > I think by default, pg_dump completely ignores the pg_catalog schema. > > It assumes pg_catalog contains static system definitions (like > > pg_class, pg_proc, etc.) that are re-generated by the initdb process, > > not user data. If we place a conflict table in pg_catalog, it will not > > be backed up. If a user runs pg_dump/all to migrate to a new server, > > their subscription definition will survive, but their entire history > > of conflict logs will vanish. Also from the permissions angle, If a > > user wants to write a custom PL/pgSQL function to "retry" conflicts, > > they might need to DELETE rows from the conflict table after fixing > > them. 
Granting DELETE permissions on a table inside pg_catalog is > > non-standard and often frowned upon by security auditors. It blurs the > > line between "System Internals" (immutable) and "User Data" (mutable). > > So, in short a separate pg_conflict schema appears to be a better solution. > > Yeah that makes sense. Although I haven't thought about all cases > whether it can be a problem anywhere, but meanwhile I tried > prototyping with this and it behaves what we want. > > postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ; > relid | schemaname | relname | conflict_type | remote_xid | > remote_commit_lsn | remote_commit_ts | remote_origin | > replica_identity | remote_tuple > | > local_conflicts > -------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+---------------- > +------------------------------------------------------------------------------------------------------------------------------------ > 16385 | public | test | update_origin_differs | 761 | > 0/01760BD8 | 2025-12-23 11:08:30.583816+00 | pg_16406 | > {"a":1} | {"a":1,"b":20} > | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"} > (1 row) > > -- Case1: Alter is not allowed > postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406 > ADD COLUMN a int; > ERROR: 42501: permission denied: "conflict_log_table_16406" is a system catalog > LOCATION: RangeVarCallbackForAlterRelation, tablecmds.c:19634 > How was this achieved? Did you modify IsSystemClass to behave similarly to IsToastClass? I tried to analyze whether there are alternative approaches. The possible options I see are: 1) heap_create_with_catalog() provides the boolean argument use_user_acl, which is meant to apply user-defined default privileges. 
In theory, we could predefine default ACLs for our schema and then invoke heap_create_with_catalog() with use_user_acl = true. But it’s not clear how to do this purely from internal code. We would need to mimic or reuse the logic behind SetDefaultACLsInSchemas.

2) Another option is to create the table using heap_create_with_catalog() with use_user_acl = false, and then explicitly update pg_class.relacl for that table, similar to what ExecGrant_Relation does when processing GRANT/REVOKE. But I couldn’t find any existing internal code paths (outside of the GRANT/REVOKE implementation itself) that do this kind of post-creation ACL manipulation.

~~

So overall, I feel changing IsSystemClass is the simpler way right now. Setting the ACL before, after, or during heap_create_with_catalog() is tricky; at least I could not find an easier way to do this, unless I have missed something.

Thoughts on possible approaches?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-24T11:59:15Z
On Fri, 19 Dec 2025 at 11:49, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 19, 2025 at 10:40 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Dec 19, 2025 at 9:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > 2. Do we want to support multi destination then providing string like
> > > 'conflict_log_destination = 'log,table,..' make more sense but then we
> > > would have to store as a string in catalog and parse it everytime we
> > > insert conflicts or alter subscription OTOH currently I have just
> > > support single option log/table/both which make things much easy
> > > because then in catalog we can store as a single char field and don't
> > > need any parsing. And since the input are taken as a string itself,
> > > even if in future we want to support more options like 'log,table,..'
> > > it would be backward compatible with old options.
> >
> > I feel, combination of options might be a good idea, similar to how
> > 'log_destination' provides. But it can be done in future versions and
> > the first draft can be a simple one.
> >
>
> Considering the future extension of storing conflict information in
> multiple places, it would be good to follow log_destination. Yes, it
> is more work now but I feel that will be future-proof.

The attached patch has the changes to specify conflict_log_destination with a combination of table, log and all. This is implemented in the v15-0006 patch; there is no change in the other patches, v15-0001 ... v15-0005, which are the same as those attached in [1].

[1] - https://www.postgresql.org/message-id/CALDaNm1zR1L2oq-LqYEcc8-wTZYjfJsiaTC_jQ8pGwbm0fv%2B3Q%40mail.gmail.com

Regards,
Vignesh
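To make the 'log,table' idea discussed above concrete, here is a small stand-alone sketch of turning a comma-separated destination string into a bitmask. It is an assumption-laden illustration and not the code in v15-0006: parse_conflict_log_dest_list() is a hypothetical name, the flag values come from the earlier review suggestion, and the choices to trim surrounding whitespace and to reject empty elements or an empty list are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Flag values as suggested earlier in the thread. */
#define CONFLICT_LOG_DEST_LOG	0x001
#define CONFLICT_LOG_DEST_TABLE	0x010
#define CONFLICT_LOG_DEST_ALL	(CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)

/*
 * Hypothetical parser: OR together the flag for each element of a
 * comma-separated list such as "log,table".  Returns false (leaving
 * *mask untouched) on an unknown element, an empty element, or an
 * empty list.
 */
bool
parse_conflict_log_dest_list(const char *list, int *mask)
{
	const char *p = list;
	int			result = 0;

	assert(list != NULL && mask != NULL);

	while (*p != '\0')
	{
		size_t		len = strcspn(p, ",");	/* length up to next comma */
		const char *start = p;
		size_t		n = len;

		/* Trim whitespace around the element (an assumption). */
		while (n > 0 && isspace((unsigned char) *start))
		{
			start++;
			n--;
		}
		while (n > 0 && isspace((unsigned char) start[n - 1]))
			n--;

		if (n == 3 && strncmp(start, "log", 3) == 0)
			result |= CONFLICT_LOG_DEST_LOG;
		else if (n == 5 && strncmp(start, "table", 5) == 0)
			result |= CONFLICT_LOG_DEST_TABLE;
		else if (n == 3 && strncmp(start, "all", 3) == 0)
			result |= CONFLICT_LOG_DEST_ALL;
		else
			return false;		/* unknown or empty element */

		p += len;
		if (*p == ',')
		{
			p++;
			if (*p == '\0')
				return false;	/* trailing comma */
		}
	}

	if (result == 0)
		return false;			/* empty list */

	*mask = result;
	return true;
}
```

Storing the resulting mask, rather than the raw string, would avoid re-parsing on every conflict insert, which was the concern raised in the quoted message; whether the catalog should hold the mask or the text is left open here.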
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2025-12-25T07:40:34Z
On Wed, Dec 24, 2025 at 4:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > I think this needs more thought, others can be fixed. > > > > > > > > > > > 2) > > > > > > postgres=# drop schema shveta cascade; > > > > > > NOTICE: drop cascades to subscription sub1 > > > > > > ERROR: global objects cannot be deleted by doDeletion > > > > > > > > > > > > Is this expected? Is the user supposed to see this error? > > > > > > > > > > > See below code, so this says if the object being dropped is the > > > > > outermost object (i.e. if we are dropping the table directly) then it > > > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > > > > > OTOH if the object is being dropped via recursive drop (i.e. the table > > > > > is being dropped while dropping the schema) then object on which it > > > > > has INTERNAL dependency will also be added to the deletion list and > > > > > later will be dropped via doDeletion and later we are getting error as > > > > > subscription is a global object. I thought maybe we can handle an > > > > > additional case that the INTERNAL DEPENDENCY, is on subscription the > > > > > disallow dropping it irrespective of whether it is being called > > > > > directly or via recursive drop but then it will give an issue even > > > > > when we are trying to drop table during subscription drop, we can make > > > > > handle this case as well via 'flags' passed in findDependentObjects() > > > > > but need more investigation. 
> > > > > > > > > > Seeing this complexity makes me think more on is it really worth it to > > > > > maintain this dependency? Because during subscription drop we anyway > > > > > have to call performDeletion externally because this dependency is > > > > > local so we are just disallowing the conflict table drop, however the > > > > > ALTER table is allowed so what we are really protecting by protecting > > > > > the table drop, I think it can be just documented that if user try to > > > > > drop the table then conflict will not be inserted anymore? > > > > > > > > > > findDependentObjects() > > > > > { > > > > > ... > > > > > switch (foundDep->deptype) > > > > > { > > > > > .... > > > > > case DEPENDENCY_INTERNAL: > > > > > * 1. At the outermost recursion level, we must disallow the > > > > > * DROP. However, if the owning object is listed in > > > > > * pendingObjects, just release the caller's lock and return; > > > > > * we'll eventually complete the DROP when we reach that entry > > > > > * in the pending list. > > > > > } > > > > > } > > > > > > > > > > [1] > > > > > postgres[1333899]=# select * from pg_depend where objid > 16410; > > > > > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > > > > > ---------+-------+----------+------------+----------+-------------+--------- > > > > > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > > > > > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > > > > > (4 rows) > > > > > > > > > > 16420 -> conflict_log_table_16419 > > > > > 16419 -> subscription > > > > > 16410 -> schema s1 > > > > > > > > > > > > > One approach could be to use something similar to > > > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive > > > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would > > > > proceed without error, i.e., it would drop the tables as well without > > > > including the subscription in the dependency list. 
But if we try to > > > > drop a table directly (e.g., DROP TABLE CLT), it will still result in: > > > > ERROR: cannot drop table because subscription sub1 requires it > > > > > > > > > > I think this way of allowing dropping the conflict table without > > > caring for the parent object (subscription) is not a good idea. How > > > about creating a dedicated schema, say pg_conflict for the purpose of > > > storing conflict tables? This will be similar to the pg_toast schema > > > for toast tables. So, similar to that each database will have a > > > pg_conflict schema. It prevents the "orphan" problem where a user > > > accidentally drops the logging schema but the Subscription is still > > > trying to write to it. pg_dump needs to ignore all system schemas > > > EXCEPT pg_conflict. This ensures the history is preserved during > > > migrations while still protecting the tables from accidental user > > > deletion. About permissions, I think we need to set the schema > > > permissions so that USAGE is public (so users can SELECT from their > > > logs) but CREATE is restricted to the superuser/subscription owner. We > > > may need to think some more about permissions. > > > > > > I also tried to reason out if we can allow storing the conflict table > > > in pg_catalog but here are a few reasons why it won't be a good idea. > > > I think by default, pg_dump completely ignores the pg_catalog schema. > > > It assumes pg_catalog contains static system definitions (like > > > pg_class, pg_proc, etc.) that are re-generated by the initdb process, > > > not user data. If we place a conflict table in pg_catalog, it will not > > > be backed up. If a user runs pg_dump/all to migrate to a new server, > > > their subscription definition will survive, but their entire history > > > of conflict logs will vanish. 
Also from the permissions angle, If a > > > user wants to write a custom PL/pgSQL function to "retry" conflicts, > > > they might need to DELETE rows from the conflict table after fixing > > > them. Granting DELETE permissions on a table inside pg_catalog is > > > non-standard and often frowned upon by security auditors. It blurs the > > > line between "System Internals" (immutable) and "User Data" (mutable). > > > So, in short a separate pg_conflict schema appears to be a better solution. > > > > Yeah that makes sense. Although I haven't thought about all cases > > whether it can be a problem anywhere, but meanwhile I tried > > prototyping with this and it behaves what we want. > > > > postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ; > > relid | schemaname | relname | conflict_type | remote_xid | > > remote_commit_lsn | remote_commit_ts | remote_origin | > > replica_identity | remote_tuple > > | > > local_conflicts > > -------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+---------------- > > +------------------------------------------------------------------------------------------------------------------------------------ > > 16385 | public | test | update_origin_differs | 761 | > > 0/01760BD8 | 2025-12-23 11:08:30.583816+00 | pg_16406 | > > {"a":1} | {"a":1,"b":20} > > | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"} > > (1 row) > > > > -- Case1: Alter is not allowed > > postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406 > > ADD COLUMN a int; > > ERROR: 42501: permission denied: "conflict_log_table_16406" is a system catalog > > LOCATION: RangeVarCallbackForAlterRelation, tablecmds.c:19634 > > > > How was this achieved? Did you modify IsSystemClass to behave > similarly to IsToastClass? 
Right > I tried to analyze whether there are alternative approaches. The > possible options I see are: > > 1) > heap_create_with_catalog() provides the boolean argument use_user_acl, > which is meant to apply user-defined default privileges. In theory, we > could predefine default ACLs for our schema and then invoke > heap_create_with_catalog() with use_user_acl = true. But it’s not > clear how to do this purely from internal code. We would need to mimic > or reuse the logic behind SetDefaultACLsInSchemas. > 2) > Another option is to create the table using heap_create_with_catalog() > with use_user_acl = false, and then explicitly update pg_class.relacl > for that table, similar to what ExecGrant_Relation does when > processing GRANT/REVOKE. But I couldn’t find any existing internal > code paths (outside of the GRANT/REVOKE implementation itself) that do > this kind of post-creation ACL manipulation. I haven't analyzed these options yet. I will do that, but not before Jan 3rd, as I will be away from my laptop for a week. > So overall, I feel changing IsSystemClass is the simpler way right > now. To set ACL before/after/during heap_create_with_catalog is a > tricky thing, at-least I could not find an easier way to do this, > unless I have missed something. > Thoughts on possible approaches? Here are the patches, which I have changed to use IsSystemClass(). Based on this, many other things changed: we no longer need to check for the temp schema, and the caller of create_conflict_log_table() no longer needs to find the creation schema or generate the relname, so that part has moved into create_conflict_log_table(). I have fixed most of the comments given by Peter and Shveta, although some are still open, e.g. the name of the conflict log table, which for now I have kept as conflict_log_table_<subid>. Other options are 1. pg_conflict_<subid> 2. conflict_log_<subid> 3. sub_conflict_log_<subid> I prefer 3, considering it says this table holds subscription conflict logs. Thoughts? Vignesh, your patches have to be rebased on the new version. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-26T15:27:57Z
On Thu, 25 Dec 2025 at 13:10, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > Vignesh, your patches have to be rebased on the new version. Here is a rebased version of the remaining patches. Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2025-12-29T06:02:38Z
On Thu, Dec 25, 2025 at 1:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Here is the patches I have changed by using IsSystemClass(), based on
> this many other things changed like we don't need to check for the
> temp schema and also the caller of create_conflict_log_table() now
> don't need to find the creation schema so it don't need to generate
> the relname so that part is also moved within
> create_conflict_log_table(). Fixed most of the comments given by
> Peter and Shveta, although some of them are still open e.g. the name
> of the conflict log table as of now I have kept as
> conflict_log_table_<subid> other options are
>
> 1. pg_conflict_<subid>
> 2. conflict_log_<subid>
> 3. sub_conflict_log_<subid>
>
> I prefer 3, considering it says this table holds subscription conflict
> logs. Thoughts?
>

I was checking how pg_toast does it. It creates tables with names:
"pg_toast_%u", relOid

We can do something similar, i.e., use pg_conflict as the schema name and pg_conflict_<subid> as the table name. Thoughts?

A few comments on 001:

1) It would be good to display the conflict table name in the \dRs command.

2)
postgres=# ALTER TABLE sch1.t3 set schema pg_toast;
ERROR: cannot move objects into or out of TOAST schema

But when we move a table to pg_conflict, it works. It should error out as well.

postgres=# ALTER TABLE sch1.t1 set schema pg_conflict;
ALTER TABLE

3) Shall we LOG CLT creation and drop during create/alter sub?

4) create_conflict_log_table()

+ /* Report an error if the specified conflict log table already exists. */
+ if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+     ereport(ERROR,
+             (errcode(ERRCODE_DUPLICATE_TABLE),
+              errmsg("relation \"%s.%s\" already exists",
+                     get_namespace_name(PG_CONFLICT_NAMESPACE), relname)));

I am unable to think of a valid user scenario in which the above would be hit. Do we need this as a user error, or will a simple Assert or internal error do?

5)
+ /*
+  * Establish an internal dependency between the conflict log table and the
+  * subscription. By using DEPENDENCY_INTERNAL, we ensure the table is
+  * automatically reaped when the subscription is dropped. This also
+  * prevents the table from being dropped independently unless the
+  * subscription itself is removed.
+  */
+ ObjectAddressSet(myself, RelationRelationId, relid);
+ ObjectAddressSet(subaddr, SubscriptionRelationId, subid);
+ recordDependencyOn(&myself, &subaddr, DEPENDENCY_INTERNAL);

Now that we have pg_conflict, which is treated similarly to a system catalog, I’m wondering whether we actually need to maintain this dependency to prevent the CLT table or schema from being dropped. Also, given that this currently goes against the convention that a shared object cannot be present in pg_depend, could DropSubscription() and AlterSubscription() instead handle dropping the table explicitly in the required scenarios?

6)
+ descr => 'reserved schema for conflict tables',

Shall we say 'reserved schema for subscription-specific conflict tables', or anything better that conveys it is subscription related?

thanks Shveta -
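Editor's sketch: the pg_toast-style naming suggested above ("pg_toast_%u" keyed by an OID) could look roughly like the following for a conflict log table keyed by subscription OID. This is a minimal sketch under stated assumptions: build_conflict_table_name() and NAME_BUF_LEN are hypothetical names, the "pg_conflict_" prefix is only one of the candidates discussed in the thread, and the 64-byte buffer mirrors PostgreSQL's default NAMEDATALEN.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mirrors PostgreSQL's default NAMEDATALEN (64 bytes including the NUL). */
#define NAME_BUF_LEN 64

/*
 * Format "pg_conflict_<subid>" into buf.
 * Returns the name length on success, or -1 if the buffer is too small
 * to hold the full name plus the terminating NUL.
 */
static int
build_conflict_table_name(char *buf, size_t buflen, unsigned int subid)
{
	int			n;

	assert(buf != NULL && buflen > 0);

	n = snprintf(buf, buflen, "pg_conflict_%u", subid);
	if (n < 0 || (size_t) n >= buflen)
		return -1;				/* formatting error or truncated name */

	return n;
}
```

A fixed prefix plus a 32-bit OID (at most 10 decimal digits) always fits in 64 bytes, so the length check only matters for callers that pass a smaller buffer.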
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-01T13:46:00Z
On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > 2. > > > > > +typedef enum ConflictLogDest > > > > > +{ > > > > > + /* Log conflicts to the server logs */ > > > > > + CONFLICT_LOG_DEST_LOG = 1 << 0, /* 0x01 */ > > > > > + > > > > > + /* Log conflicts to an internally managed conflict log table */ > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1, /* 0x02 */ > > > > > + > > > > > + /* Convenience bitmask for all supported destinations */ > > > > > + CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE) > > > > > +} ConflictLogDest; > > > > > + > > > > > +/* > > > > > + * Array mapping for converting internal enum to string. > > > > > + */ > > > > > +static const char *const ConflictLogDestNames[] = { > > > > > + [CONFLICT_LOG_DEST_LOG] = "log", > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table", > > > > > + [CONFLICT_LOG_DEST_ALL] = "all" > > > > > +}; > > > > > > > > > > Defining an array this way could be an Array size issue. Actually the > > > > > array has just three elements so the last element should be at > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will > > > > > be ConflictLogDestNames[3]. Can we define by referring the following > > > > > existing way: > > > > > > I was analyzing this because I remember we were initially using the > > > format you suggested and switched to the bit format to enable direct > > > bitwise operations elsewhere. I think Peter suggested that [1], and > > > the argument was that the bitwise operation is easy if we represent > > > them as a bit. Also, since we would not have too many options, the > > > array size shouldn't be an issue. 
But I understand your point: adding > > > more elements will cause the array size to grow very fast as this is > > > using sparse array. Let's see what others think about this, and then > > > we can decide whether to change it back? > > > > > > > The benefit of the current approach is that checking whether the > > destination is TABLE becomes straightforward: > > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE) > > > > if we go by regular enum values (simialr to XLogSource), then it will be: > > > > if (opts.logdest == CONFLICT_LOG_DEST_TABLE || > > opts.logdest == CONFLICT_LOG_DEST_ALL) > > Right > > > For ease of extending the enum and its corresponding text mappings, my > > personal preference is still the regular (non-bitwise) enum approach. > > Yeah, that's my personal preference too. But Peter had strong stand > on keeping as bitwise so that we can directly use > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations. > Since this array shouldn't have many options, a sparse array is not an > issue. So lets see what @Peter Smith has to say here and then we can > build a concensus on this. > > > But if we anticipate adding more destination options in the future > > that would be covered by ALL, checking for those in code could lead to > > growing chains of OR conditions, whereas the bitwise approach scales > > more cleanly in that respect. So I think the choice depends on what > > kinds of future extensions we expect. > > > > Do we have plans to add more options that would naturally fall under > > ALL? Or do we instead expect additions that are mutually exclusive; > > for example, splitting CONFLICT_LOG_DEST_LOG into something like > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may > > not make sense to group under ALL in the same way? > > Currently, I haven't considered which options would naturally fall > under "ALL." Perhaps if we plan targets other than logs and files, > those might also fall under "ALL." 
I have fixed all the reported comments except these four. 1. I'm changing the ConflictLogDest enum from a bitmap to an integer. I can revert this in the next version, but I want to see Peter's opinion first, as he suggested using a bitmap so that bitwise operators can be applied easily. 2. Changing how the conflict log table is displayed in \dRs+: Shveta and Amit agree on the suggested approach, and I will update that in the next version. 3. As Vignesh reported, we are still determining the best way to change the conflict log table's ownership when the subscription ownership changes. 4. pg_conflict is a catalog schema and, as Nisha reported, non-superusers aren't allowed to access the objects within it. Because of this, SELECT, DELETE, and TRUNCATE are disallowed even for the subscription owner if that owner is a non-superuser. I am working on the fix. Note: I have included the base patch for reporting the schema-qualified name, which is also being discussed in another thread. @vignesh C, you need to rebase your patch and might need to fix the table name, as we are now using `pg_conflict_log_<subid>` for the conflict log table. -- Regards, Dilip Kumar Google -
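Editor's sketch: the sparse-array concern raised in the quoted review can be reproduced in a few lines of standalone C. The enum and the name array are copied from the quoted patch fragment. The lengthof macro is defined locally here, and the helper functions (conflict_log_dest_names_len, conflict_log_dest_name, conflict_log_dest_from_name, logs_to_table) are hypothetical additions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum ConflictLogDest
{
	CONFLICT_LOG_DEST_LOG = 1 << 0,		/* 0x01 */
	CONFLICT_LOG_DEST_TABLE = 1 << 1,	/* 0x02 */
	CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Indexed by bitmask value, so slot 0 is left as an implicit NULL. */
static const char *const ConflictLogDestNames[] = {
	[CONFLICT_LOG_DEST_LOG] = "log",
	[CONFLICT_LOG_DEST_TABLE] = "table",
	[CONFLICT_LOG_DEST_ALL] = "all"
};

/* Local element-count macro for this sketch. */
#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* Number of slots in the name array, including unused ones. */
static size_t
conflict_log_dest_names_len(void)
{
	return lengthof(ConflictLogDestNames);
}

/* Bounds-checked lookup; returns NULL for values without a name. */
static const char *
conflict_log_dest_name(int dest)
{
	if (dest <= 0 || (size_t) dest >= lengthof(ConflictLogDestNames))
		return NULL;
	return ConflictLogDestNames[dest];
}

/* Reverse lookup; returns 0 if the name is unknown. */
static int
conflict_log_dest_from_name(const char *name)
{
	assert(name != NULL);

	for (size_t i = 0; i < lengthof(ConflictLogDestNames); i++)
	{
		/* Skip the holes that the sparse initializer leaves behind. */
		if (ConflictLogDestNames[i] != NULL &&
			strcmp(ConflictLogDestNames[i], name) == 0)
			return (int) i;
	}
	return 0;
}

/* The bitwise test that motivates the bitmask representation. */
static int
logs_to_table(int dest)
{
	return (dest & CONFLICT_LOG_DEST_TABLE) != 0;
}
```

With two flags the array has four slots for three names. A third flag at `1 << 2` would make ALL equal to 7 and the array eight slots long, which is the growth the reviewers point out; in exchange, logs_to_table() stays a single mask test no matter how many destinations ALL covers.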
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-02T09:10:02Z
On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > 4. pg_conflict is the catalog schema and as Nisha reported, > non-superusers aren't allowed to access the objects within it. Because > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > subscription owner if that owner is a non-superuser. I am working on > the fix. While analyzing this, I realized that the schema ACL check happens very early, in the analyze phase [1]. I'm not sure we can exempt the subscription owner from this check at that stage without implementing a hacky solution. Another option is to remove restrictions from the pg_conflict schema for all users and keep only table-level restrictions within that schema. I am exploring how to implement this.
[1]
#1 0x0000561b547713fe in aclcheck_error (aclerr=ACLCHECK_NO_PRIV, objtype=OBJECT_SCHEMA, objectname=0x561b8299a4d0 "pg_conflict") at aclchk.c:2813
#2 0x0000561b54790fe7 in LookupExplicitNamespace (nspname=0x561b8299a4d0 "pg_conflict", missing_ok=true) at namespace.c:3481
#3 0x0000561b5478ca48 in RangeVarGetRelidExtended (relation=0x561b8299a590, lockmode=1, flags=1, callback=0x0, callback_arg=0x0) at namespace.c:531
#4 0x0000561b54645779 in relation_openrv_extended (relation=0x561b8299a590, lockmode=1, missing_ok=true) at relation.c:186
#5 0x0000561b5470e7ba in table_openrv_extended (relation=0x561b8299a590, lockmode=1, missing_ok=true) at table.c:108
#6 0x0000561b548383a2 in parserOpenTable (pstate=0x561b8299a7e0, relation=0x561b8299a590, lockmode=1) at parse_relation.c:1433
-- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-04T05:48:37Z
On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > 4. pg_conflict is the catalog schema and as Nisha reported, > > non-superusers aren't allowed to access the objects within it. Because > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > > subscription owner if that owner is a non-superuser. I am working on > > the fix. > > While analyzing this, I realized that the schema ACL check happens > very early in analyze phase [1]. I'm not sure if we can bypass the > subscription owner from this check at that stage without implementing > a hacky solution. Another option is to remove restrictions from the > pg_conflict schema for all users and keep only table-level > restrictions within that schema. I am exploring how to implement this. Dilip, instead of granting permission (or removing restrictions) on the pg_conflict schema to all users, is there a way to grant USAGE on the schema only to the subscription owner when the conflict log table is created and when the owner is altered for the subscription? I think it should resolve the problem in a better way. Thoughts? Let me know if I am missing something. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T05:51:53Z
On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > 4. pg_conflict is the catalog schema and as Nisha reported, > > > non-superusers aren't allowed to access the objects within it. Because > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > > > subscription owner if that owner is a non-superuser. I am working on > > > the fix. > > > > While analyzing this, I realized that the schema ACL check happens > > very early in analyze phase [1]. I'm not sure if we can bypass the > > subscription owner from this check at that stage without implementing > > a hacky solution. Another option is to remove restrictions from the > > pg_conflict schema for all users and keep only table-level > > restrictions within that schema. I am exploring how to implement this. > > Dilip, instead of granting permission (or removing restrictions) on > the pg_conflict schema to all users, is there a way to grant USAGE on > the schema only to the subscription owner when the conflict log table > is created and when the owner is altered for the subscription? I think > it should resolve the problem in a better way. Thoughts? Let me know > if I am missing something. Yeah, I thought about that, but when you create a subscription you are connected as the subscription owner, who doesn't have the necessary permission to GRANT USAGE on the pg_conflict schema. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T09:06:49Z
On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> > wrote: > > > > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> > wrote: > > > > > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> > wrote: > > > > > > > > 4. pg_conflict is the catalog schema and as Nisha reported, > > > > non-superusers aren't allowed to access the objects within it. > Because > > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > > > > subscription owner if that owner is a non-superuser. I am working on > > > > the fix. > > > > > > While analyzing this, I realized that the schema ACL check happens > > > very early in analyze phase [1]. I'm not sure if we can bypass the > > > subscription owner from this check at that stage without implementing > > > a hacky solution. Another option is to remove restrictions from the > > > pg_conflict schema for all users and keep only table-level > > > restrictions within that schema. I am exploring how to implement this. > > > > Dilip, instead of granting permission (or removing restrictions) on > > the pg_conflict schema to all users, is there a way to grant USAGE on > > the schema only to the subscription owner when the conflict log table > > is created and when the owner is altered for the subscription? I think > > it should resolve the problem in a better way. Thoughts? Let me know > > if I am missing something. > > Yeah I thought about that but when you create a subscription, you > connected using the subscription owner user, who doesn't have the > necessary permission to GRANT usage on pg_conflict schema. After giving this more thought, I think we should be able to execute an internal GRANT function that does not check whether the user has permission to GRANT. -- Dilip
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-04T09:13:13Z
On Mon, May 4, 2026 at 2:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote: >> >> On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote: >> > >> > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: >> > > >> > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: >> > > > >> > > > 4. pg_conflict is the catalog schema and as Nisha reported, >> > > > non-superusers aren't allowed to access the objects within it. Because >> > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the >> > > > subscription owner if that owner is a non-superuser. I am working on >> > > > the fix. >> > > >> > > While analyzing this, I realized that the schema ACL check happens >> > > very early in analyze phase [1]. I'm not sure if we can bypass the >> > > subscription owner from this check at that stage without implementing >> > > a hacky solution. Another option is to remove restrictions from the >> > > pg_conflict schema for all users and keep only table-level >> > > restrictions within that schema. I am exploring how to implement this. >> > >> > Dilip, instead of granting permission (or removing restrictions) on >> > the pg_conflict schema to all users, is there a way to grant USAGE on >> > the schema only to the subscription owner when the conflict log table >> > is created and when the owner is altered for the subscription? I think >> > it should resolve the problem in a better way. Thoughts? Let me know >> > if I am missing something. >> >> Yeah I thought about that but when you create a subscription, you >> connected using the subscription owner user, who doesn't have the >> necessary permission to GRANT usage on pg_conflict schema. > > > After putting more thoughts I think we should be able to execute internal GRAN function which do not checks whether the user has permission to GRANT or not. 
> I have been trying to find an existing code example that does something similar, but could not find one. But if you think it is feasible and have found a way, then it is the reasonable solution here. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T09:19:08Z
On Mon, May 4, 2026 at 2:43 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, May 4, 2026 at 2:37 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, 4 May 2026 at 11:21 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote: > >> > >> On Mon, May 4, 2026 at 11:18 AM shveta malik <shveta.malik@gmail.com> wrote: > >> > > >> > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > >> > > > >> > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > >> > > > > >> > > > 4. pg_conflict is the catalog schema and as Nisha reported, > >> > > > non-superusers aren't allowed to access the objects within it. Because > >> > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > >> > > > subscription owner if that owner is a non-superuser. I am working on > >> > > > the fix. > >> > > > >> > > While analyzing this, I realized that the schema ACL check happens > >> > > very early in analyze phase [1]. I'm not sure if we can bypass the > >> > > subscription owner from this check at that stage without implementing > >> > > a hacky solution. Another option is to remove restrictions from the > >> > > pg_conflict schema for all users and keep only table-level > >> > > restrictions within that schema. I am exploring how to implement this. > >> > > >> > Dilip, instead of granting permission (or removing restrictions) on > >> > the pg_conflict schema to all users, is there a way to grant USAGE on > >> > the schema only to the subscription owner when the conflict log table > >> > is created and when the owner is altered for the subscription? I think > >> > it should resolve the problem in a better way. Thoughts? Let me know > >> > if I am missing something. > >> > >> Yeah I thought about that but when you create a subscription, you > >> connected using the subscription owner user, who doesn't have the > >> necessary permission to GRANT usage on pg_conflict schema. 
> > > > > > After putting more thoughts I think we should be able to execute internal GRAN function which do not checks whether the user has permission to GRANT or not. > > > > I have been trying to find an existing code example that does > somethign similar, but could not find one. But if you think it is > feasible and found a way, then it is the reasonable solution here. Even I am not sure but I am going to experiment with this by calling ExecGrantStmt_oids() while creating the subscription to see if we can come up with something reasonable. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-04T11:28:53Z
On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > 4. pg_conflict is the catalog schema and as Nisha reported, > > non-superusers aren't allowed to access the objects within it. Because > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > > subscription owner if that owner is a non-superuser. I am working on > > the fix. > > While analyzing this, I realized that the schema ACL check happens > very early in analyze phase [1]. I'm not sure if we can bypass the > subscription owner from this check at that stage without implementing > a hacky solution. Another option is to remove restrictions from the > pg_conflict schema for all users and keep only table-level > restrictions within that schema. I am exploring how to implement this. > How about if we grant usage privilege on pg_conflict schema to pg_create_subscription role and then allow only select, delete, truncate to table_owners on tables in pg_conflict schema? Internally the apply_worker can still make inserts to clt table in pg_conflict schema similar to what we do for toast tables. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-04T13:11:05Z
On Mon, May 4, 2026 at 4:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, May 2, 2026 at 2:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > 4. pg_conflict is the catalog schema and as Nisha reported, > > > non-superusers aren't allowed to access the objects within it. Because > > > of this, SELECT, DELETE, and TRUNCATE are disallowed even for the > > > subscription owner if that owner is a non-superuser. I am working on > > > the fix. > > > > While analyzing this, I realized that the schema ACL check happens > > very early in analyze phase [1]. I'm not sure if we can bypass the > > subscription owner from this check at that stage without implementing > > a hacky solution. Another option is to remove restrictions from the > > pg_conflict schema for all users and keep only table-level > > restrictions within that schema. I am exploring how to implement this. > > > > How about if we grant usage privilege on pg_conflict schema to > pg_create_subscription role and then allow only select, delete, > truncate to table_owners on tables in pg_conflict schema? Internally > the apply_worker can still make inserts to clt table in pg_conflict > schema similar to what we do for toast tables. I am still testing, but I quickly prototyped this approach and basic things seem to be working. 
<Test case Start>
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433
postgres[3614939]=# CREATE USER dilip LOGIN ;
GRANT pg_create_subscription TO dilip;
GRANT ALL ON DATABASE postgres TO dilip;
postgres[3614939]=# \q

-- Connect to nonsuper user--
dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433 -U dilip
postgres[3615002]=> CREATE SUBSCRIPTION regress_clt_perm_test CONNECTION 'dbname=regress_doesnotexist password=pass' PUBLICATION testpub WITH (connect = false, conflict_log_destination = 'table');
postgres[3615002]=> select * from pg_conflict.pg_conflict_log_164
pg_conflict.pg_conflict_log_16406  pg_conflict.pg_conflict_log_16412
postgres[3615002]=> select * from pg_conflict.pg_conflict_log_16412;
 relid | schemaname | relname | conflict_type | remote_xid | remote_commit_lsn | remote_commit_ts | remote_origin | replica_identity | remote_tuple | local_conflicts
-------+------------+---------+---------------+------------+-------------------+------------------+---------------+------------------+--------------+-----------------
(0 rows)

postgres[3615002]=> delete from pg_conflict.pg_conflict_log_16412;
DELETE 0
postgres[3615002]=> TRUNCATE pg_conflict.pg_conflict_log_16412;
TRUNCATE TABLE
postgres[3615002]=> \q

dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433
psql (19devel)
Type "help" for help.

--Create another user to test non subscription owner which has pg_create_subscription role granted do not have access on another subscription's conflict log tables
postgres[3615293]=# CREATE USER dilip1 LOGIN;
GRANT pg_create_subscription TO dilip1;
GRANT ALL ON DATABASE postgres TO dilip1;

dilipkumarb@dilipkumarb:~/PG/install$ psql -p 5433 -U dilip1
psql (19devel)
Type "help" for help.

postgres[3615370]=> select * from pg_conflict.pg_conflict_log_16412;
ERROR: 42501: permission denied for table pg_conflict_log_16412
LOCATION: aclcheck_error, aclchk.c:2813
postgres[3615370]=> delete from pg_conflict.pg_conflict_log_16412;
ERROR: 42501: permission denied for table pg_conflict_log_16412
LOCATION: aclcheck_error, aclchk.c:2813
<Test Case Ends>

PFA, poc patch for the same. -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-05T02:56:38Z
On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > PFA, poc patch for the same. > I know it is POC but I think you need more work to prevent manual inserts/updates on conflict tables. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-05T04:06:55Z
On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > PFA, poc patch for the same. I like the idea of PoC. It simplifies the implementation. > > > > I know it is POC but I think you need more work to prevent manual > inserts/updates on conflict tables. > I think CheckValidResultRel() handles it. postgres=# insert into pg_conflict.pg_conflict_16391 values (0); ERROR: cannot modify or insert data into conflict log table "pg_conflict_16391" DETAIL: Conflict log tables are system-managed and only support cleanup via DELETE or TRUNCATE thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-05T05:25:44Z
On Fri, May 1, 2026 at 7:16 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have fixed all the reported comments except these four. > 1. I'm changing the ConflictLogDest enum from bitmap to integer. I can > revert this in the next version but I want to see Peter's opinion > first, as he suggested using a bitmap to easily apply bitwise > operators. >
But that created an array size inconvenience. If you want to wait for more comments, I suggest you can keep it as a top-up patch immediately after the patch where it is introduced.

Other points:
* subscription’s lifecycle.
I saw the above funny character in 0002's commit message.

*
+
+ ereport(NOTICE,
+ (errmsg("created conflict log table pg_conflict.\"%s\" for subscription \"%s\"",
+ relname, subname)));

I think we can use a new function introduced by 0001 to get a qualified relname instead of doing it manually here.

-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-05T12:55:28Z
On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > PFA, poc patch for the same. > > I like the idea of PoC. It simplifies the implementation. > > > > > > > > I know it is POC but I think you need more work to prevent manual > > inserts/updates on conflict tables. > > > > I think CheckValidResultRel() handles it. > > postgres=# insert into pg_conflict.pg_conflict_16391 values (0); > ERROR: cannot modify or insert data into conflict log table "pg_conflict_16391" > DETAIL: Conflict log tables are system-managed and only support > cleanup via DELETE or TRUNCATE I think we can tweak this a bit: in pg_class_aclmask_ext() we can allow only TRUNCATE/DELETE on pg_conflict and block INSERT and UPDATE. Here is the modified version. Please let me know your thoughts. -- Regards, Dilip Kumar Google
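[Editor's sketch] The privilege filtering discussed in the message above could look roughly like the following. This is only a minimal, self-contained illustration under stated assumptions: it is not the attached patch and not the real pg_class_aclmask_ext() signature. The SKETCH_* bit names and the function sketch_conflict_log_aclmask() are hypothetical, superuser handling is ignored, and the apply worker's internal inserts are assumed to bypass this check entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical privilege bits, defined here only for this sketch. */
typedef uint32_t SketchAclMode;

#define SKETCH_ACL_INSERT	(1u << 0)
#define SKETCH_ACL_SELECT	(1u << 1)
#define SKETCH_ACL_UPDATE	(1u << 2)
#define SKETCH_ACL_DELETE	(1u << 3)
#define SKETCH_ACL_TRUNCATE	(1u << 4)
#define SKETCH_ACL_ALL \
	(SKETCH_ACL_INSERT | SKETCH_ACL_SELECT | SKETCH_ACL_UPDATE | \
	 SKETCH_ACL_DELETE | SKETCH_ACL_TRUNCATE)

/*
 * Hypothetical helper: return the subset of 'requested' privileges that a
 * user may exercise on a conflict log table.  Only the table owner gets
 * anything, and even the owner is limited to SELECT, DELETE and TRUNCATE;
 * INSERT and UPDATE are always filtered out.
 */
static SketchAclMode
sketch_conflict_log_aclmask(bool is_table_owner, SketchAclMode requested)
{
	const SketchAclMode owner_allowed =
		SKETCH_ACL_SELECT | SKETCH_ACL_DELETE | SKETCH_ACL_TRUNCATE;

	/* Callers are expected to pass only the bits defined above. */
	assert((requested & ~SKETCH_ACL_ALL) == 0);

	if (!is_table_owner)
		return 0;

	return requested & owner_allowed;
}
```

With this shape, a caller that asks for INSERT together with DELETE as the owner gets back only DELETE, and a non-owner gets back an empty mask, which matches the permission-denied results in the earlier test session.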
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-06T03:54:01Z
On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > 2. > > > > > > +typedef enum ConflictLogDest > > > > > > +{ > > > > > > + /* Log conflicts to the server logs */ > > > > > > + CONFLICT_LOG_DEST_LOG = 1 << 0, /* 0x01 */ > > > > > > + > > > > > > + /* Log conflicts to an internally managed conflict log table */ > > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1, /* 0x02 */ > > > > > > + > > > > > > + /* Convenience bitmask for all supported destinations */ > > > > > > + CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE) > > > > > > +} ConflictLogDest; > > > > > > + > > > > > > +/* > > > > > > + * Array mapping for converting internal enum to string. > > > > > > + */ > > > > > > +static const char *const ConflictLogDestNames[] = { > > > > > > + [CONFLICT_LOG_DEST_LOG] = "log", > > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table", > > > > > > + [CONFLICT_LOG_DEST_ALL] = "all" > > > > > > +}; > > > > > > > > > > > > Defining an array this way could be an Array size issue. Actually the > > > > > > array has just three elements so the last element should be at > > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will > > > > > > be ConflictLogDestNames[3]. Can we define by referring the following > > > > > > existing way: > > > > > > > > I was analyzing this because I remember we were initially using the > > > > format you suggested and switched to the bit format to enable direct > > > > bitwise operations elsewhere. I think Peter suggested that [1], and > > > > the argument was that the bitwise operation is easy if we represent > > > > them as a bit. 
Also, since we would not have too many options, the > > > > array size shouldn't be an issue. But I understand your point: adding > > > > more elements will cause the array size to grow very fast as this is > > > > using sparse array. Let's see what others think about this, and then > > > > we can decide whether to change it back? > > > > > > > > > > The benefit of the current approach is that checking whether the > > > destination is TABLE becomes straightforward: > > > > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE) > > > > > > if we go by regular enum values (simialr to XLogSource), then it will be: > > > > > > if (opts.logdest == CONFLICT_LOG_DEST_TABLE || > > > opts.logdest == CONFLICT_LOG_DEST_ALL) > > > > Right > > > > > For ease of extending the enum and its corresponding text mappings, my > > > personal preference is still the regular (non-bitwise) enum approach. > > > > Yeah, that's my personal preference too. But Peter had strong stand > > on keeping as bitwise so that we can directly use > > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations. > > Since this array shouldn't have many options, a sparse array is not an > > issue. So lets see what @Peter Smith has to say here and then we can > > build a concensus on this. > > > > > But if we anticipate adding more destination options in the future > > > that would be covered by ALL, checking for those in code could lead to > > > growing chains of OR conditions, whereas the bitwise approach scales > > > more cleanly in that respect. So I think the choice depends on what > > > kinds of future extensions we expect. > > > > > > Do we have plans to add more options that would naturally fall under > > > ALL? Or do we instead expect additions that are mutually exclusive; > > > for example, splitting CONFLICT_LOG_DEST_LOG into something like > > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may > > > not make sense to group under ALL in the same way? 
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.

Few comments:
1) Currently we allow renaming of the pg_conflict schema. This might be OK, as we allow it for other system schemas like pg_catalog and pg_toast as well.
postgres=# alter schema pg_conflict rename to test_conflict;
ALTER SCHEMA

While displaying the conflict table we will have to display the renamed schema name instead of hard coding the schema name:
postgres=# \dRs+
List of subscriptions
Name | Owner | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Password required | Run as owner? | Failover | Server | Retain dead tuples | Max retention duration | Retention active | Synchronous commit | Conninfo | Receiver timeout | Skip LSN | Description | Conflict log destination | Conflict log table
------+---------+---------+-------------+--------+-----------+------------------+------------------+--------+-------------------+---------------+----------+--------+--------------------+------------------------+------------------+--------------------+------------------------------------------+------------------+------------+-------------+--------------------------+-----------------------------------
sub1 | vignesh | t | {pub1,pub2} | f | parallel | d | f | any | t | f | f | | f | 0 | f | off | dbname=postgres host=localhost port=5432 | -1 | 0/00000000 | | table | pg_conflict.pg_conflict_log_16397
(2 rows)

postgres=# select * from pg_conflict.pg_conflict_log_16397;
ERROR: relation "pg_conflict.pg_conflict_log_16397" does not exist
LINE 1: select * from pg_conflict.pg_conflict_log_16397;

+ /* Conflict log destination is supported in v19 and higher */
+ if (pset.sversion >= 190000)
+ {
+ appendPQExpBuffer(&buf,
+ ", subconflictlogdest AS \"%s\"\n",
+ gettext_noop("Conflict log destination"));
+
+ appendPQExpBuffer(&buf,
+ ", (CASE WHEN subconflictlogdest IN ('table', 'all') "
+ " THEN 'pg_conflict.pg_conflict_log_' || oid "
+ " ELSE '-' END) AS \"%s\"\n",
+ gettext_noop("Conflict log table"));
+ }

2) We will have to use the renamed schema here instead of hard coding:
+ /*
+ * Check for an existing table with the sname name in the pg_conflict namespace.
+ * A collision should not occur under normal operation, but we must handle cases
+ * where a table has been created manually.
+ */
+ if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_TABLE),
+ errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
+ errhint("A table with the same name already exists. "
+ "To proceed, drop the existing table and retry.")));

3) Similarly here too:
+ /* Release tuple descriptor memory. */
+ FreeTupleDesc(tupdesc);
+
+ ereport(NOTICE,
+ (errmsg("created conflict log table pg_conflict.\"%s\" for subscription \"%s\"",
+ relname, subname)));

Regards, Vignesh
-
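[Editor's sketch] The sparse-array concern quoted at the top of the message above can be reproduced in isolation. The enum and the name array below are copied from the quoted patch hunk; everything else (DEST_NAMES_LEN, conflict_log_dest_name(), conflict_log_dest_from_name(), dest_includes_table()) is a hypothetical helper added only for illustration and is not part of the patch. The sketch shows that the designated initializers produce four slots with an unused NULL slot at index 0, and that the bitmask form allows a single membership test instead of a chain of equality checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bitmask form of the enum, as quoted from the patch. */
typedef enum ConflictLogDest
{
	CONFLICT_LOG_DEST_LOG = 1 << 0,		/* 0x01 */
	CONFLICT_LOG_DEST_TABLE = 1 << 1,	/* 0x02 */
	CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Indexed by enum value, so the slots used are 1, 2 and 3; slot 0 is NULL. */
static const char *const ConflictLogDestNames[] = {
	[CONFLICT_LOG_DEST_LOG] = "log",
	[CONFLICT_LOG_DEST_TABLE] = "table",
	[CONFLICT_LOG_DEST_ALL] = "all"
};

/* Hypothetical: number of slots in the array, including unused ones. */
#define DEST_NAMES_LEN \
	(sizeof(ConflictLogDestNames) / sizeof(ConflictLogDestNames[0]))

/* Hypothetical: bounds-checked lookup; returns NULL for holes such as slot 0. */
static const char *
conflict_log_dest_name(int dest)
{
	if (dest < 0 || (size_t) dest >= DEST_NAMES_LEN)
		return NULL;
	return ConflictLogDestNames[dest];
}

/* Hypothetical: reverse lookup that skips holes; returns 0 if not found. */
static int
conflict_log_dest_from_name(const char *name)
{
	assert(name != NULL);

	for (size_t i = 0; i < DEST_NAMES_LEN; i++)
	{
		if (ConflictLogDestNames[i] != NULL &&
			strcmp(ConflictLogDestNames[i], name) == 0)
			return (int) i;
	}
	return 0;
}

/* Hypothetical: with bit flags, "does this include TABLE?" is one test. */
static int
dest_includes_table(int dest)
{
	return (dest & CONFLICT_LOG_DEST_TABLE) != 0;
}
```

With three destinations the cost is one wasted slot. If a third single-bit destination were added at 1 << 2, an "all" entry would land at index 7 and the array would need eight slots for four names, which is the growth the thread is worried about.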
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T05:17:20Z
On Tue, May 5, 2026 at 6:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > PFA, poc patch for the same. > > > > I like the idea of PoC. It simplifies the implementation. > > > > > > > > > > > > I know it is POC but I think you need more work to prevent manual > > > inserts/updates on conflict tables. > > > > > > > I think CheckValidResultRel() handles it. > > > > postgres=# insert into pg_conflict.pg_conflict_16391 values (0); > > ERROR: cannot modify or insert data into conflict log table "pg_conflict_16391" > > DETAIL: Conflict log tables are system-managed and only support > > cleanup via DELETE or TRUNCATE > > I think we can tweak a bit and pg_class_aclmask_ext() we can only > allow truncate/delete on pg_conflict and block insert and update, here > is the modified version. Please let me know your thoughts. > BTW, I am still getting the same ERROR even after POC. See postgres=# insert into pg_conflict.pg_conflict_log_16402 values(NULL); ERROR: cannot modify or insert data into conflict log table "pg_conflict_log_16402" DETAIL: Conflict log tables are system-managed and only support cleanup via DELETE or TRUNCATE. Few other comments: * postgres=# create subscription sub1 connection 'dbname=postgres' publication pub1 WITH (conflict_log_destination='table'); NOTICE: created conflict log table pg_conflict."pg_conflict_log_16394" for subscription "sub1" NOTICE: created replication slot "sub1" on publisher CREATE SUBSCRIPTION To make the messages similar, isn't it better to use the following wording in the first message: "created conflict log table "pg_conflict.pg_conflict_log_16394" on subscriber? The part "subscription "sub1"" is clear from the command itself. 
* postgres=# drop subscription sub1; NOTICE: dropped replication slot "sub1" on publisher DROP SUBSCRIPTION Drop seems to have missed the NOTICE to implicitly drop the table. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-06T09:31:30Z
On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > Few comments: > 1) Currently we allow renaming of pg_conflict schema, this might be ok > as we allow other sysem schema like pg_catalog and pg_toast also. > postgres=# alter schema pg_conflict rename to test_conflict; > ALTER SCHEMA >
I agree that we allow renaming other schemas including pg_toast, but I am not sure if this is a consciously made decision; see BUG #18281 at [1]. I don't favour allowing renaming pg_conflict for 2 reasons:

1) Postgres explicitly blocks renaming schemas to a name starting with 'pg_', so if an admin accidentally renames 'pg_conflict' to something else, they are permanently locked out from renaming it back.

2) While the core worker might survive a rename via OID lookups, external scripts, extensions, and monitoring tools will likely hardcode the 'pg_conflict' string. If the schema is renamed, these tools will fail.

One such example of scripts breaking is present even in Postgres. I did the following, and most of the psql commands started failing after that due to the hard-coded pg_catalog name in them.

postgres=# alter schema pg_catalog rename to catalog_new;
ALTER SCHEMA
postgres=# \d catalog_new.*
ERROR: relation "pg_catalog.pg_class" does not exist
LINE 5: FROM pg_catalog.pg_class c

[1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org

thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-06T09:36:06Z
On Wed, May 6, 2026 at 10:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, May 5, 2026 at 6:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, May 5, 2026 at 9:37 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Tue, May 5, 2026 at 8:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, May 4, 2026 at 6:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > PFA, poc patch for the same. > > > > > > I like the idea of PoC. It simplifies the implementation. > > > > > > > > > > > > > > > > I know it is POC but I think you need more work to prevent manual > > > > inserts/updates on conflict tables. > > > > > > > > > > I think CheckValidResultRel() handles it. > > > > > > postgres=# insert into pg_conflict.pg_conflict_16391 values (0); > > > ERROR: cannot modify or insert data into conflict log table "pg_conflict_16391" > > > DETAIL: Conflict log tables are system-managed and only support > > > cleanup via DELETE or TRUNCATE > > > > I think we can tweak a bit and pg_class_aclmask_ext() we can only > > allow truncate/delete on pg_conflict and block insert and update, here > > is the modified version. Please let me know your thoughts. > > > > BTW, I am still getting the same ERROR even after POC. See > postgres=# insert into pg_conflict.pg_conflict_log_16402 values(NULL); > ERROR: cannot modify or insert data into conflict log table > "pg_conflict_log_16402" > DETAIL: Conflict log tables are system-managed and only support > cleanup via DELETE or TRUNCATE. I also see the same behaviour. 
~~

One observation for others to review:

As a non-superuser that does not have the 'pg_create_subscription' privilege:
postgres=> alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR: permission denied for schema pg_conflict
<seems correct, as access is denied at schema level itself>

As a non-superuser that has the 'pg_create_subscription' privilege, but does not own the respective sub:
postgres=> alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR: must be owner of table pg_conflict_16487
<Due to 'pg_create_subscription', it seems schema access is provided, so it goes on to check table access and gives the above error. Not sure about this error: even if the user were the owner, they still wouldn't be able to perform this operation>

As a non-superuser that has the 'pg_create_subscription' privilege and also owns the respective sub:
postgres=> alter table pg_conflict.pg_conflict_16498 add column i int;
ERROR: permission denied: "pg_conflict_16498" is a system catalog
<okay>

As a superuser, the error is the same irrespective of whether it actually owns that table or not:
postgres=# alter table pg_conflict.pg_conflict_16487 add column i int;
ERROR: permission denied: "pg_conflict_16487" is a system catalog
<okay>

For the second case, not a strong opinion, but could the better error be:
ERROR: permission denied: "pg_conflict_16487" is a system catalog?

I have not analyzed the code myself for this yet.

thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T10:31:41Z
On Wed, May 6, 2026 at 3:06 PM shveta malik <shveta.malik@gmail.com> wrote: > > As a non super-user which does not have 'pg_create_subscription' privelege: > postgres=> alter table pg_conflict.pg_conflict_16487 add column i int; > ERROR: permission denied for schema pg_conflict > <seems correct, as access is denied at schema level itself> > > As a non super-user which has 'pg_create_subscription' privelege, but > does not own the respective sub: > postgres=> alter table pg_conflict.pg_conflict_16487 add column i int; > ERROR: must be owner of table pg_conflict_16487 > <Due to 'pg_create_subscription', it seems schema access is provided, > so it goes to check table access now and gives above error. Not sure > about this error, even if the user were the owner, they still wouldn't > be able to perform this operation> > > As a non super-user which has 'pg_create_subscription' privilege and > also owns the respective sub: > postgres=> alter table pg_conflict.pg_conflict_16498 add column i int; > ERROR: permission denied: "pg_conflict_16498" is a system catalog > <okay> > > As a super-user, the error is same irrespective of fact whether it > actually owns that table or not: > postgres=# alter table pg_conflict.pg_conflict_16487 add column i int; > ERROR: permission denied: "pg_conflict_16487" is a system catalog > <okay> > > For second case, not a strong opinion, but can the better error be: > ERROR: permission denied: "pg_conflict_16487" is a system catalog? > > I have not analyzed code myself for this yet. > I analyzed this case and think that the current behavior is okay. As per RangeVarCallbackForAlterRelation(), we first ensure that the current user is either a table owner or superuser and then check actual permissions to perform the operations on the table. The same is true for the DROP case. I don't see the need to change it. Few cosmetic changes are attached in top-up patches. Dilip can include these in the next version, if he is okay with them. 
-- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-06T11:01:19Z
On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > Few comments: > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > as we allow other sysem schema like pg_catalog and pg_toast also. > > postgres=# alter schema pg_conflict rename to test_conflict; > > ALTER SCHEMA > > > > I agree that we allow renaming other schemas including pg_toast, but I > am not sure if this is consciously made decision, see BUG #18281 ast > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > 1) Because Postgres explicitly blocks renaming schemas to a name > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > something else, they are permanently locked out from renaming it back. > > 2) While the core worker might survive a rename via OID lookups; > external scripts, extensions, and monitoring tools will likely > hardcode the 'pg_conflict' string. If the schema is renamed, these > tools will fail. > I think we shouldn't go out of our way to disallow superusers from renaming the pg_conflict schema, similar to other cases. We can try to avoid hard-coding schema names where possible, but I am not sure we can guarantee that nothing related to the pg_conflict schema will break, as shown by you in the following similar case for pg_catalog. > One such example of scripts breaking is present event in Postgres. I > did the following, and most of psql commands started failing after > that due to hard-coded pg_catalog name in them. > > postgres=# alter schema pg_catalog rename to catalog_new; > ALTER SCHEMA > > postgres=# \d catalog_new.* > ERROR: relation "pg_catalog.pg_class" does not exist > LINE 5: FROM pg_catalog.pg_class c > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org > -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-06T11:25:44Z
On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > 2. > > > > > > +typedef enum ConflictLogDest > > > > > > +{ > > > > > > + /* Log conflicts to the server logs */ > > > > > > + CONFLICT_LOG_DEST_LOG = 1 << 0, /* 0x01 */ > > > > > > + > > > > > > + /* Log conflicts to an internally managed conflict log table */ > > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1, /* 0x02 */ > > > > > > + > > > > > > + /* Convenience bitmask for all supported destinations */ > > > > > > + CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE) > > > > > > +} ConflictLogDest; > > > > > > + > > > > > > +/* > > > > > > + * Array mapping for converting internal enum to string. > > > > > > + */ > > > > > > +static const char *const ConflictLogDestNames[] = { > > > > > > + [CONFLICT_LOG_DEST_LOG] = "log", > > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table", > > > > > > + [CONFLICT_LOG_DEST_ALL] = "all" > > > > > > +}; > > > > > > > > > > > > Defining an array this way could be an Array size issue. Actually the > > > > > > array has just three elements so the last element should be at > > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will > > > > > > be ConflictLogDestNames[3]. Can we define by referring the following > > > > > > existing way: > > > > > > > > I was analyzing this because I remember we were initially using the > > > > format you suggested and switched to the bit format to enable direct > > > > bitwise operations elsewhere. I think Peter suggested that [1], and > > > > the argument was that the bitwise operation is easy if we represent > > > > them as a bit. 
Also, since we would not have too many options, the > > > > array size shouldn't be an issue. But I understand your point: adding > > > > more elements will cause the array size to grow very fast as this is > > > > using sparse array. Let's see what others think about this, and then > > > > we can decide whether to change it back? > > > > > > > > > > The benefit of the current approach is that checking whether the > > > destination is TABLE becomes straightforward: > > > > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE) > > > > > > if we go by regular enum values (simialr to XLogSource), then it will be: > > > > > > if (opts.logdest == CONFLICT_LOG_DEST_TABLE || > > > opts.logdest == CONFLICT_LOG_DEST_ALL) > > > > Right > > > > > For ease of extending the enum and its corresponding text mappings, my > > > personal preference is still the regular (non-bitwise) enum approach. > > > > Yeah, that's my personal preference too. But Peter had strong stand > > on keeping as bitwise so that we can directly use > > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations. > > Since this array shouldn't have many options, a sparse array is not an > > issue. So lets see what @Peter Smith has to say here and then we can > > build a concensus on this. > > > > > But if we anticipate adding more destination options in the future > > > that would be covered by ALL, checking for those in code could lead to > > > growing chains of OR conditions, whereas the bitwise approach scales > > > more cleanly in that respect. So I think the choice depends on what > > > kinds of future extensions we expect. > > > > > > Do we have plans to add more options that would naturally fall under > > > ALL? Or do we instead expect additions that are mutually exclusive; > > > for example, splitting CONFLICT_LOG_DEST_LOG into something like > > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may > > > not make sense to group under ALL in the same way? 
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.

A few minor comments:

1) Now that we create the table in the pg_conflict system schema, where other users cannot create tables, is there any scenario in which this collision is still possible?

/*
 * Check for an existing table with the sname name in the pg_conflict namespace.
 * A collision should not occur under normal operation, but we must handle cases
 * where a table has been created manually.
 */
if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
    ereport(ERROR,
            (errcode(ERRCODE_DUPLICATE_TABLE),
             errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
             errhint("A table with the same name already exists. "
                     "To proceed, drop the existing table and retry.")));

2) I think table_open() raises an error on failure rather than returning NULL, so this check will never be hit:

+ conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
+ if (conflictlogrel == NULL)
+     elog(ERROR, "could not open conflict log table (OID %u)",
+          conflictlogrelid);

3) Typo: "sname" should be "same" here:

+ * Check for an existing table with the sname name in the pg_conflict namespace.
+ * A collision should not occur under normal operation, but we must handle cases

4) This include is not required:

@@ -37,6 +40,7 @@
 #include "commands/subscriptioncmds.h"
 #include "executor/executor.h"
 #include "foreign/foreign.h"
+#include "funcapi.h"

Regards,
Vignesh
-
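The array-size concern quoted in this message can be checked outside PostgreSQL. What follows is a minimal standalone C sketch, not code from the patch: the enum and the name array are copied from the quoted hunk, while `CONFLICT_LOG_DEST_SLOTS`, `conflict_log_dest_name()`, `dest_includes()` and `slots_for_flags()` are hypothetical helpers added only for illustration (`dest_includes()` stands in for the `IsSet()` test mentioned in the thread). It shows that the three names occupy four slots because the designated initializers index by enum value, and that a lookup array kept in this style would need 2^n slots for n flags.

```c
#include <assert.h>
#include <stddef.h>

/* Bit-flag destinations, as in the quoted patch hunk. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 1 << 0,     /* 0x01 */
    CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
    CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/* Designated initializers index by enum value, so slot 0 stays NULL. */
static const char *const ConflictLogDestNames[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

/* Hypothetical helper macro: number of slots the compiler allocates. */
#define CONFLICT_LOG_DEST_SLOTS \
    (sizeof(ConflictLogDestNames) / sizeof(ConflictLogDestNames[0]))

/* Three names, but the highest index is 3, so the array has four slots. */
static_assert(CONFLICT_LOG_DEST_SLOTS == 4, "three names occupy four slots");

/* Hypothetical lookup: returns NULL for values that have no name. */
const char *
conflict_log_dest_name(int dest)
{
    if (dest <= 0 || (size_t) dest >= CONFLICT_LOG_DEST_SLOTS)
        return NULL;
    return ConflictLogDestNames[dest];
}

/* Hypothetical stand-in for the IsSet() check: one test covers TABLE and ALL. */
int
dest_includes(int dest, int flag)
{
    return (dest & flag) != 0;
}

/*
 * Hypothetical: slots needed if ALL is the OR of nflags single-bit flags.
 * ALL then equals 2^nflags - 1, so the array needs 2^nflags slots.
 */
size_t
slots_for_flags(unsigned nflags)
{
    return (size_t) 1 << nflags;
}
```

Under these assumptions a third destination flag (1 << 2) folded into ALL would take the array from four slots to eight, which is the growth being weighed against the convenience of the single bitwise test.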
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-06T12:58:27Z
On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Few comments: > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > ALTER SCHEMA > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > am not sure if this is consciously made decision, see BUG #18281 ast > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > something else, they are permanently locked out from renaming it back. > > > > 2) While the core worker might survive a rename via OID lookups; > > external scripts, extensions, and monitoring tools will likely > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > tools will fail. > > > > I think we shouldn't go out of our way to disallow superusers to > rename pg_conflict schema similar to other cases. We can try to > prevent hard-coding schema names where possible but not sure we can > guarantee that nothing related to pg_conflict schema won't break as > shown by you in the following similar case for pg_conflict. > > > One such example of scripts breaking is present event in Postgres. I > > did the following, and most of psql commands started failing after > > that due to hard-coded pg_catalog name in them. 
> >
> > postgres=# alter schema pg_catalog rename to catalog_new;
> > ALTER SCHEMA
> >
> > postgres=# \d catalog_new.*
> > ERROR: relation "pg_catalog.pg_class" does not exist
> > LINE 5: FROM pg_catalog.pg_class c
> >
> > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org

I can see that the pg_toast and pg_catalog schema names are also hard-coded in a couple of places, e.g.:

listPartitionedTables()
{
    if (!pattern)
        appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
                             " AND n.nspname !~ '^pg_toast'\n"
                             " AND n.nspname <> 'information_schema'\n");
}

I will analyze all the places where we hard-code the name. I think we can easily avoid it in server-side code, but on the client side (e.g. describe) we would need to invent a way to identify the schema name, or store it somewhere such as pg_subscription. I don't think we should go that route.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-06T14:04:57Z
On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Few comments: > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > > ALTER SCHEMA > > > > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > > am not sure if this is consciously made decision, see BUG #18281 ast > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > > something else, they are permanently locked out from renaming it back. > > > > > > 2) While the core worker might survive a rename via OID lookups; > > > external scripts, extensions, and monitoring tools will likely > > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > > tools will fail. > > > > > > > I think we shouldn't go out of our way to disallow superusers to > > rename pg_conflict schema similar to other cases. We can try to > > prevent hard-coding schema names where possible but not sure we can > > guarantee that nothing related to pg_conflict schema won't break as > > shown by you in the following similar case for pg_conflict. > > > > > One such example of scripts breaking is present event in Postgres. I > > > did the following, and most of psql commands started failing after > > > that due to hard-coded pg_catalog name in them. 
> > >
> > > postgres=# alter schema pg_catalog rename to catalog_new;
> > > ALTER SCHEMA
> > >
> > > postgres=# \d catalog_new.*
> > > ERROR: relation "pg_catalog.pg_class" does not exist
> > > LINE 5: FROM pg_catalog.pg_class c
> > >
> > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org
>
> I can see pg_toast and pg_catalog schema also hard coded in couple of
> places e.g.
>
> listPartitionedTables()
> {
> if (!pattern)
> appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n"
> " AND n.nspname !~ '^pg_toast'\n"
> " AND n.nspname <> 'information_schema'\n");
> }
>
> I will analyze which all places we are hardcoding, I think on server
> side code we can easily avoid but from client side e.g. describe we
> might need to invent a way to identify the schema name, or we might
> have to store it somewhere in pg_subscription etc, I don't think we
> should go that route.

Here is the updated patch set.

Open comments:
1. Analyze and avoid hard-coding the 'pg_conflict' schema name wherever possible.
2. Change the way we display the conflict log table (CLT) in \dRs+.
3. Transfer the CLT ownership when the subscription ownership changes. (Note: I have coded a POC for this but am still checking whether it works in all cases.)

I will send the revised version by the end of this week, after fixing these open comments as well.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-07T02:55:42Z
On Wed, May 6, 2026 at 7:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > Few comments: > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > > > ALTER SCHEMA > > > > > > > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > > > am not sure if this is consciously made decision, see BUG #18281 ast > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > > > something else, they are permanently locked out from renaming it back. > > > > > > > > 2) While the core worker might survive a rename via OID lookups; > > > > external scripts, extensions, and monitoring tools will likely > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > > > tools will fail. > > > > > > > > > > I think we shouldn't go out of our way to disallow superusers to > > > rename pg_conflict schema similar to other cases. We can try to > > > prevent hard-coding schema names where possible but not sure we can > > > guarantee that nothing related to pg_conflict schema won't break as > > > shown by you in the following similar case for pg_conflict. > > > > > > > One such example of scripts breaking is present event in Postgres. 
I > > > > did the following, and most of psql commands started failing after > > > > that due to hard-coded pg_catalog name in them. > > > > > > > > postgres=# alter schema pg_catalog rename to catalog_new; > > > > ALTER SCHEMA > > > > > > > > postgres=# \d catalog_new.* > > > > ERROR: relation "pg_catalog.pg_class" does not exist > > > > LINE 5: FROM pg_catalog.pg_class c > > > > > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org > > > > I can see pg_toast and pg_catalog schema also hard coded in couple of > > places e.g. > > > > listPartitionedTables() > > { > > if (!pattern) > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n" > > " AND n.nspname !~ '^pg_toast'\n" > > " AND n.nspname <> 'information_schema'\n"); > > } > > > > I will analyze which all places we are hardcoding, I think on server > > side code we can easily avoid but from client side e.g. describe we > > might need to invent a way to identify the schema name, or we might > > have to store it somewhere in pg_subscription etc, I don't think we > > should go that route. > > Here is updated patch set > > Open comments: > 1. Analyze and avoid hardcoding the 'pg_conflict' schema name wherever possible > 2. change the way we display clt in \dRs+ > 3. Transfer the clt ownership when subscription ownership has change > (Note: I have coded a poc for this but still checking whether it works > in all cases) > > I will send the revised version by end of this week after fixing these > open comments as well. 
So for the ownership change, this simple change [1] is working fine. But there is another issue: currently we can assign subscription ownership to any user, even one that doesn't have pg_create_subscription. Maybe that is fine, as it is not creating the subscription, but now the question is how to manage the permissions on the conflict log table; see the test below [2].

[1]
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index a2de57e17b4..c9fac56714e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
     form->subowner = newOwnerId;
     CatalogTupleUpdate(rel, &tup->t_self, tup);

+    /* Update owner of the conflict log table if it exists */
+    if (OidIsValid(form->subconflictlogrelid))
+        ATExecChangeOwner(form->subconflictlogrelid, newOwnerId, true, AccessExclusiveLock);
+
     /* Update owner dependency reference */
     changeDependencyOnOwner(SubscriptionRelationId,
                             form->oid,

[2]
-- Test to show that the ownership of the table is changed. However, this user will now have access issues on the pg_conflict_log table, as this user does not have the pg_create_subscription role. I haven't yet checked whether the problems are limited to CLT access or whether other subscription management is affected as well.

postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE relname = 'pg_conflict_log_16406';
        relname        | relowner
-----------------------+----------
 pg_conflict_log_16406 |       10
(1 row)

postgres[557253]=# CREATE USER test;
CREATE ROLE
postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
ALTER SUBSCRIPTION
postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE relname = 'pg_conflict_log_16406';
        relname        | relowner
-----------------------+----------
 pg_conflict_log_16406 |    16410
(1 row)

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-07T03:26:36Z
On Wed, May 6, 2026 at 4:55 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, 1 May 2026 at 19:16, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > 2. > > > > > > > +typedef enum ConflictLogDest > > > > > > > +{ > > > > > > > + /* Log conflicts to the server logs */ > > > > > > > + CONFLICT_LOG_DEST_LOG = 1 << 0, /* 0x01 */ > > > > > > > + > > > > > > > + /* Log conflicts to an internally managed conflict log table */ > > > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1, /* 0x02 */ > > > > > > > + > > > > > > > + /* Convenience bitmask for all supported destinations */ > > > > > > > + CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE) > > > > > > > +} ConflictLogDest; > > > > > > > + > > > > > > > +/* > > > > > > > + * Array mapping for converting internal enum to string. > > > > > > > + */ > > > > > > > +static const char *const ConflictLogDestNames[] = { > > > > > > > + [CONFLICT_LOG_DEST_LOG] = "log", > > > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table", > > > > > > > + [CONFLICT_LOG_DEST_ALL] = "all" > > > > > > > +}; > > > > > > > > > > > > > > Defining an array this way could be an Array size issue. Actually the > > > > > > > array has just three elements so the last element should be at > > > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will > > > > > > > be ConflictLogDestNames[3]. 
Can we define by referring the following > > > > > > > existing way: > > > > > > > > > > I was analyzing this because I remember we were initially using the > > > > > format you suggested and switched to the bit format to enable direct > > > > > bitwise operations elsewhere. I think Peter suggested that [1], and > > > > > the argument was that the bitwise operation is easy if we represent > > > > > them as a bit. Also, since we would not have too many options, the > > > > > array size shouldn't be an issue. But I understand your point: adding > > > > > more elements will cause the array size to grow very fast as this is > > > > > using sparse array. Let's see what others think about this, and then > > > > > we can decide whether to change it back? > > > > > > > > > > > > > The benefit of the current approach is that checking whether the > > > > destination is TABLE becomes straightforward: > > > > > > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE) > > > > > > > > if we go by regular enum values (simialr to XLogSource), then it will be: > > > > > > > > if (opts.logdest == CONFLICT_LOG_DEST_TABLE || > > > > opts.logdest == CONFLICT_LOG_DEST_ALL) > > > > > > Right > > > > > > > For ease of extending the enum and its corresponding text mappings, my > > > > personal preference is still the regular (non-bitwise) enum approach. > > > > > > Yeah, that's my personal preference too. But Peter had strong stand > > > on keeping as bitwise so that we can directly use > > > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations. > > > Since this array shouldn't have many options, a sparse array is not an > > > issue. So lets see what @Peter Smith has to say here and then we can > > > build a concensus on this. 
> > >
> > > > But if we anticipate adding more destination options in the future
> > > > that would be covered by ALL, checking for those in code could lead to
> > > > growing chains of OR conditions, whereas the bitwise approach scales
> > > > more cleanly in that respect. So I think the choice depends on what
> > > > kinds of future extensions we expect.
> > > >
> > > > Do we have plans to add more options that would naturally fall under
> > > > ALL? Or do we instead expect additions that are mutually exclusive;
> > > > for example, splitting CONFLICT_LOG_DEST_LOG into something like
> > > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may
> > > > not make sense to group under ALL in the same way?
> > >
> > > Currently, I haven't considered which options would naturally fall
> > > under "ALL." Perhaps if we plan targets other than logs and files,
> > > those might also fall under "ALL."
> >
> > I have fixed all the reported comments except these four.
>
> Few minor comments:
> 1) Now that we create the table in pg_conflict system schema where
> other users cannot create the table, is there a scenario where this is
> possible?
> /*
> * Check for an existing table with the sname name in the
> pg_conflict namespace.
> * A collision should not occur under normal operation, but we
> must handle cases
> * where a table has been created manually.
> */
> if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
> ereport(ERROR,
> (errcode(ERRCODE_DUPLICATE_TABLE),
> errmsg("conflict log table pg_conflict.\"%s\" already
> exists", relname),
> errhint("A table with the same name already exists. "
> "To proceed, drop the existing table and retry.")));
>

It is possible to hit it with allow_system_table_mods=on. See issue1 raised by Nisha in [1].

[1]: https://www.postgresql.org/message-id/CABdArM6jpLnzC5O%3DX48RpFXRmAr5WOSHJtw0ebT%2B7Wmb-WdfvQ%40mail.gmail.com

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-07T04:31:37Z
On Thu, May 7, 2026 at 8:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, May 6, 2026 at 7:34 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > Few comments: > > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > > > > ALTER SCHEMA > > > > > > > > > > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > > > > am not sure if this is consciously made decision, see BUG #18281 ast > > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > > > > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > > > > something else, they are permanently locked out from renaming it back. > > > > > > > > > > 2) While the core worker might survive a rename via OID lookups; > > > > > external scripts, extensions, and monitoring tools will likely > > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > > > > tools will fail. > > > > > > > > > > > > > I think we shouldn't go out of our way to disallow superusers to > > > > rename pg_conflict schema similar to other cases. We can try to > > > > prevent hard-coding schema names where possible but not sure we can > > > > guarantee that nothing related to pg_conflict schema won't break as > > > > shown by you in the following similar case for pg_conflict. 
> > > > > > > > > One such example of scripts breaking is present event in Postgres. I > > > > > did the following, and most of psql commands started failing after > > > > > that due to hard-coded pg_catalog name in them. > > > > > > > > > > postgres=# alter schema pg_catalog rename to catalog_new; > > > > > ALTER SCHEMA > > > > > > > > > > postgres=# \d catalog_new.* > > > > > ERROR: relation "pg_catalog.pg_class" does not exist > > > > > LINE 5: FROM pg_catalog.pg_class c > > > > > > > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org > > > > > > I can see pg_toast and pg_catalog schema also hard coded in couple of > > > places e.g. > > > > > > listPartitionedTables() > > > { > > > if (!pattern) > > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n" > > > " AND n.nspname !~ '^pg_toast'\n" > > > " AND n.nspname <> 'information_schema'\n"); > > > } > > > > > > I will analyze which all places we are hardcoding, I think on server > > > side code we can easily avoid but from client side e.g. describe we > > > might need to invent a way to identify the schema name, or we might > > > have to store it somewhere in pg_subscription etc, I don't think we > > > should go that route. > > > > Here is updated patch set > > > > Open comments: > > 1. Analyze and avoid hardcoding the 'pg_conflict' schema name wherever possible > > 2. change the way we display clt in \dRs+ > > 3. Transfer the clt ownership when subscription ownership has change > > (Note: I have coded a poc for this but still checking whether it works > > in all cases) > > > > I will send the revised version by end of this week after fixing these > > open comments as well. 
> > So for the ownership change, this simple change[1] is working fine, > but there is another issue that currently we can assign subscription > nownership to any user even that doesn't have pg_create_subscription > maybe that should be fine as it is not creating the subscription but > now question is how to manage the permission on the conflict log table > see below test[2] > > > [1[] > diff --git a/src/backend/commands/subscriptioncmds.c > b/src/backend/commands/subscriptioncmds.c > index a2de57e17b4..c9fac56714e 100644 > --- a/src/backend/commands/subscriptioncmds.c > +++ b/src/backend/commands/subscriptioncmds.c > @@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel, > HeapTuple tup, Oid newOwnerId) > form->subowner = newOwnerId; > CatalogTupleUpdate(rel, &tup->t_self, tup); > + /* Update owner of the conflict log table if it exists */ > + if (OidIsValid(form->subconflictlogrelid)) > + ATExecChangeOwner(form->subconflictlogrelid, > newOwnerId, true, AccessExclusiveLock); > + > /* Update owner dependency reference */ > changeDependencyOnOwner(SubscriptionRelationId, > form->oid, > > [2] > -- test to show the ownership is getting changed for the table, but > now this user will have access issue on the pg_conflict_log table as > this user do not have pg_create_subscription role, I haven't yet > checked whether the problems are only related to clt access or there > would be issue for other subcription management as well. 
>
> postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> relname = 'pg_conflict_log_16406';
> relname | relowner
> -----------------------+----------
> pg_conflict_log_16406 | 10
> (1 row)
>
> postgres[557253]=# CREATE USER test;
> CREATE ROLE
> postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
> ALTER SUBSCRIPTION
> postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE
> relname = 'pg_conflict_log_16406';
> relname | relowner
> -----------------------+----------
> pg_conflict_log_16406 | 16410
> (1 row)
>

During my testing, I initially found it strange that ALTER SUBSCRIPTION ... OWNER TO is allowed for a user without pg_create_subscription. But that is the base/HEAD behaviour. Now, coming to our use case around it:

postgres=# create user user1;
CREATE ROLE
postgres=# ALTER SUBSCRIPTION sub1 OWNER TO user1;
ALTER SUBSCRIPTION
postgres=# SELECT relowner::regrole FROM pg_class WHERE relname = 'pg_conflict_log_16392';
 relowner
----------
 user1

As Dilip stated, user1 owns the table but cannot access or truncate it.
postgres=> select * from pg_conflict.pg_conflict_log_16392;
ERROR: permission denied for schema pg_conflict

postgres=> truncate pg_conflict.pg_conflict_log_16392;
ERROR: permission denied for schema pg_conflict

It looks weird at first, but I think we have exactly the same behaviour for TOAST tables:

--as superuser:
postgres=# CREATE TABLE user_data (id int, big_text text);
CREATE TABLE

postgres=# SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'user_data';
      reltoastrelid
-------------------------
 pg_toast.pg_toast_16399

postgres=# SELECT * FROM pg_toast.pg_toast_16399;
 chunk_id | chunk_seq | chunk_data
----------+-----------+------------
(0 rows)

postgres=# alter table user_data owner to user1;
ALTER TABLE

--toast table ownership got changed:
postgres=# \dt+ pg_toast.pg_toast_16399
  Schema  |      Name      |    Type     | Owner |
----------+----------------+-------------+-------+-
 pg_toast | pg_toast_16399 | TOAST table | user1 |

As user1:
postgres=> SELECT * FROM pg_toast.pg_toast_16399;
ERROR: permission denied for schema pg_toast

So the behaviour is similar to our case. IMO, the best we can do is document it well, something like:

Note: Conflict log tables reside in the restricted pg_conflict schema. To query or truncate these logs, a user must be a superuser or have the pg_create_subscription privilege. A subscription owner lacking these privileges will not be able to access or purge conflict log tables.

Thoughts?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-07T06:46:34Z
On Thu, May 7, 2026 at 10:01 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, May 7, 2026 at 8:26 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > So for the ownership change, this simple change[1] is working fine,
> > but there is another issue that currently we can assign subscription
> > ownership to any user even that doesn't have pg_create_subscription
> > maybe that should be fine as it is not creating the subscription but
> > now question is how to manage the permission on the conflict log table
> > see below test[2]
> >
> > [1]
> > diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
> > index a2de57e17b4..c9fac56714e 100644
> > --- a/src/backend/commands/subscriptioncmds.c
> > +++ b/src/backend/commands/subscriptioncmds.c
> > @@ -2718,6 +2718,10 @@ AlterSubscriptionOwner_internal(Relation rel, HeapTuple tup, Oid newOwnerId)
> >   form->subowner = newOwnerId;
> >   CatalogTupleUpdate(rel, &tup->t_self, tup);
> > + /* Update owner of the conflict log table if it exists */
> > + if (OidIsValid(form->subconflictlogrelid))
> > +     ATExecChangeOwner(form->subconflictlogrelid, newOwnerId, true, AccessExclusiveLock);
> > +
> >   /* Update owner dependency reference */
> >   changeDependencyOnOwner(SubscriptionRelationId,
> >                           form->oid,
> >
> > [2]
> > -- test to show the ownership is getting changed for the table, but
> > now this user will have access issue on the pg_conflict_log table as
> > this user do not have pg_create_subscription role, I haven't yet
> > checked whether the problems are only related to clt access or there
> > would be issue for other subscription management as well.
> >
> > postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE relname = 'pg_conflict_log_16406';
> >         relname        | relowner
> > -----------------------+----------
> >  pg_conflict_log_16406 |       10
> > (1 row)
> >
> > postgres[557253]=# CREATE USER test;
> > CREATE ROLE
> > postgres[557253]=# ALTER SUBSCRIPTION sub OWNER TO test;
> > ALTER SUBSCRIPTION
> > postgres[557253]=# SELECT relname, relowner FROM pg_class WHERE relname = 'pg_conflict_log_16406';
> >         relname        | relowner
> > -----------------------+----------
> >  pg_conflict_log_16406 |    16410
> > (1 row)
> >
>
> During my testing, I initially found it strange that user without
> pg_create_subscription is allowed to perform ALTER Sub. But that is
> base/head behaviour. Now coming to our use-case around it.
>
> postgres=# create user user1;
> CREATE ROLE
> postgres=# ALTER SUBSCRIPTION sub1 OWNER TO user1;
> ALTER SUBSCRIPTION
> postgres=# SELECT relowner::regrole FROM pg_class WHERE relname = 'pg_conflict_log_16392';
>  relowner
> ----------
>  user1
>
> As Dilip stated, user1 owns the table but cannot access or truncate it.
>
> postgres=> select * from pg_conflict.pg_conflict_log_16392;
> ERROR: permission denied for schema pg_conflict
>
> postgres=> truncate pg_conflict.pg_conflict_log_16392;
> ERROR: permission denied for schema pg_conflict
>
> It looks weird at first, but I think we have exact same behaviour for
> toast table:
>
> --as superuser:
> postgres=# CREATE TABLE user_data (id int, big_text text);
> CREATE TABLE
>
> postgres=# SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'user_data';
>       reltoastrelid
> -------------------------
>  pg_toast.pg_toast_16399
>
> postgres=# SELECT * FROM pg_toast.pg_toast_16399;
>  chunk_id | chunk_seq | chunk_data
> ----------+-----------+------------
> (0 rows)
>
> postgres=# alter table user_data owner to user1;
> ALTER TABLE
>
> --toast table ownership got changed:
> postgres=# \dt+ pg_toast.pg_toast_16399
>   Schema  |      Name      |    Type     | Owner |
> ----------+----------------+-------------+-------+-
>  pg_toast | pg_toast_16399 | TOAST table | user1 |
>
> As user1:
> postgres=> SELECT * FROM pg_toast.pg_toast_16399;
> ERROR: permission denied for schema pg_toast
>
> So behaviour is similar to our case.
>

I am not sure the case is the same for CLT tables. Won't it be risky to
allow changing the owner of a subscription to a user who doesn't have the
pg_create_subscription privilege? Because now the background worker will
be able to insert into the CLT table, whereas for regular tables it will
still use the table owner's privilege (who originally created the table),
as run_as_owner is false. So, shouldn't we disallow changing to an owner
who doesn't have the pg_create_subscription privilege when a CLT table is
associated with a subscription, similar to what we do for the SERVER case?
(See the comment "If the subscription uses a server, check that the new
owner has USAGE..." in AlterSubscriptionOwner_internal().)

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-07T08:59:46Z
On Wed, 6 May 2026 at 19:35, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > Few comments: > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > > > ALTER SCHEMA > > > > > > > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > > > am not sure if this is consciously made decision, see BUG #18281 ast > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > > > something else, they are permanently locked out from renaming it back. > > > > > > > > 2) While the core worker might survive a rename via OID lookups; > > > > external scripts, extensions, and monitoring tools will likely > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > > > tools will fail. > > > > > > > > > > I think we shouldn't go out of our way to disallow superusers to > > > rename pg_conflict schema similar to other cases. We can try to > > > prevent hard-coding schema names where possible but not sure we can > > > guarantee that nothing related to pg_conflict schema won't break as > > > shown by you in the following similar case for pg_conflict. > > > > > > > One such example of scripts breaking is present event in Postgres. 
I > > > > did the following, and most of psql commands started failing after > > > > that due to hard-coded pg_catalog name in them. > > > > > > > > postgres=# alter schema pg_catalog rename to catalog_new; > > > > ALTER SCHEMA > > > > > > > > postgres=# \d catalog_new.* > > > > ERROR: relation "pg_catalog.pg_class" does not exist > > > > LINE 5: FROM pg_catalog.pg_class c > > > > > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org > > > > I can see pg_toast and pg_catalog schema also hard coded in couple of > > places e.g. > > > > listPartitionedTables() > > { > > if (!pattern) > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n" > > " AND n.nspname !~ '^pg_toast'\n" > > " AND n.nspname <> 'information_schema'\n"); > > } > > > > I will analyze which all places we are hardcoding, I think on server > > side code we can easily avoid but from client side e.g. describe we > > might need to invent a way to identify the schema name, or we might > > have to store it somewhere in pg_subscription etc, I don't think we > > should go that route. 
>
> Here is updated patch set

Thanks for the updated patches. The v30 patch set posted has a few issues.

There is an assertion failure at [1]:
TRAP: failed Assert("conflictlogrel != NULL"), File: "../src/backend/replication/logical/conflict.c", Line: 195, PID: 59658
0xb3a472 <ExceptionalCondition+0x82> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x9433b8 <ReportApplyConflict+0x13c8> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x7d91fb <CheckAndReportConflict+0x2cb> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x7d8b4b <ExecSimpleRelationInsert+0x10b> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x96525b <apply_dispatch+0x23eb> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x966150 <start_apply+0x310> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x967010 <run_apply_worker+0x290> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x966d6d <ApplyWorkerMain+0x1d> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x90ff0c <BackgroundWorkerMain+0x1cc> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x914a25 <postmaster_child_launch+0x145> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x917b77 <maybe_start_bgworkers+0x1d7> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x9198f5 <ServerLoop+0x1c65> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x917156 <PostmasterMain+0x1116> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres
0x83c16d <main+0x48d> at /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres

There are the following warnings at [2]:
[14:55:07.472] conflict.c:187:6: error: variable 'log_dest_clt' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
[14:55:07.472]   187 |     if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
[14:55:07.472]       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[14:55:07.472] conflict.c:193:6: note: uninitialized use occurs here
[14:55:07.472]   193 |     if (log_dest_clt)
[14:55:07.472]       |         ^~~~~~~~~~~~
[14:55:07.472] conflict.c:187:2: note: remove the 'if' if its condition is always true
[14:55:07.472]   187 |     if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
[14:55:07.472]       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[14:55:07.472]   188 |         log_dest_clt = true;
[14:55:07.472] conflict.c:176:21: note: initialize the variable 'log_dest_clt' to silence this warning
[14:55:07.472]   176 |     bool log_dest_clt;
[14:55:07.472]       |                      ^
[14:55:07.472]       |                       = false
[14:55:07.472] 1 error generated.

[1] - https://api.cirrus-ci.com/v1/artifact/task/5630092254117888/testrun/build/testrun/subscription/026_stats/log/026_stats_subscriber.log
[2] - https://cirrus-ci.com/task/5770829742473216

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-07T10:45:38Z
On Thu, 7 May 2026 at 14:29, vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 6 May 2026 at 19:35, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, May 6, 2026 at 6:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Wed, May 6, 2026 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, May 6, 2026 at 3:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Wed, May 6, 2026 at 9:24 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > Few comments: > > > > > > 1) Currently we allow renaming of pg_conflict schema, this might be ok > > > > > > as we allow other sysem schema like pg_catalog and pg_toast also. > > > > > > postgres=# alter schema pg_conflict rename to test_conflict; > > > > > > ALTER SCHEMA > > > > > > > > > > > > > > > > I agree that we allow renaming other schemas including pg_toast, but I > > > > > am not sure if this is consciously made decision, see BUG #18281 ast > > > > > [1]. I don't favour allowing renaming pg_conflict for 2 reasons: > > > > > > > > > > 1) Because Postgres explicitly blocks renaming schemas to a name > > > > > starting with 'pg_'. If an admin accidentally renames 'pg_conflict' to > > > > > something else, they are permanently locked out from renaming it back. > > > > > > > > > > 2) While the core worker might survive a rename via OID lookups; > > > > > external scripts, extensions, and monitoring tools will likely > > > > > hardcode the 'pg_conflict' string. If the schema is renamed, these > > > > > tools will fail. > > > > > > > > > > > > > I think we shouldn't go out of our way to disallow superusers to > > > > rename pg_conflict schema similar to other cases. We can try to > > > > prevent hard-coding schema names where possible but not sure we can > > > > guarantee that nothing related to pg_conflict schema won't break as > > > > shown by you in the following similar case for pg_conflict. 
> > > > > > > > > One such example of scripts breaking is present event in Postgres. I > > > > > did the following, and most of psql commands started failing after > > > > > that due to hard-coded pg_catalog name in them. > > > > > > > > > > postgres=# alter schema pg_catalog rename to catalog_new; > > > > > ALTER SCHEMA > > > > > > > > > > postgres=# \d catalog_new.* > > > > > ERROR: relation "pg_catalog.pg_class" does not exist > > > > > LINE 5: FROM pg_catalog.pg_class c > > > > > > > > > > [1]: https://www.postgresql.org/message-id/flat/18281-5b1b6c5991d345aa%40postgresql.org > > > > > > I can see pg_toast and pg_catalog schema also hard coded in couple of > > > places e.g. > > > > > > listPartitionedTables() > > > { > > > if (!pattern) > > > appendPQExpBufferStr(&buf, " AND n.nspname <> 'pg_catalog'\n" > > > " AND n.nspname !~ '^pg_toast'\n" > > > " AND n.nspname <> 'information_schema'\n"); > > > } > > > > > > I will analyze which all places we are hardcoding, I think on server > > > side code we can easily avoid but from client side e.g. describe we > > > might need to invent a way to identify the schema name, or we might > > > have to store it somewhere in pg_subscription etc, I don't think we > > > should go that route. 
> > > > Here is updated patch set > > Thanks for the updated patches, the v30 version patch posted has few issues: > There is an assert at [1]: > TRAP: failed Assert("conflictlogrel != NULL"), File: > "../src/backend/replication/logical/conflict.c", Line: 195, PID: 59658 > 0xb3a472 <ExceptionalCondition+0x82> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x9433b8 <ReportApplyConflict+0x13c8> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x7d91fb <CheckAndReportConflict+0x2cb> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x7d8b4b <ExecSimpleRelationInsert+0x10b> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x96525b <apply_dispatch+0x23eb> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x966150 <start_apply+0x310> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x967010 <run_apply_worker+0x290> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x966d6d <ApplyWorkerMain+0x1d> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x90ff0c <BackgroundWorkerMain+0x1cc> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x914a25 <postmaster_child_launch+0x145> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x917b77 <maybe_start_bgworkers+0x1d7> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x9198f5 <ServerLoop+0x1c65> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x917156 <PostmasterMain+0x1116> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > 0x83c16d <main+0x48d> at > /tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres > > There are the following warnings at [2]: > [14:55:07.472] conflict.c:187:6: error: variable 'log_dest_clt' is > used uninitialized whenever 'if' condition is false > 
[-Werror,-Wsometimes-uninitialized]
> [14:55:07.472]   187 |     if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
> [14:55:07.472]       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [14:55:07.472] conflict.c:193:6: note: uninitialized use occurs here
> [14:55:07.472]   193 |     if (log_dest_clt)
> [14:55:07.472]       |         ^~~~~~~~~~~~
> [14:55:07.472] conflict.c:187:2: note: remove the 'if' if its condition is always true
> [14:55:07.472]   187 |     if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
> [14:55:07.472]       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [14:55:07.472]   188 |         log_dest_clt = true;
> [14:55:07.472] conflict.c:176:21: note: initialize the variable 'log_dest_clt' to silence this warning
> [14:55:07.472]   176 |     bool log_dest_clt;
> [14:55:07.472]       |                      ^
> [14:55:07.472]       |                       = false
> [14:55:07.472] 1 error generated.
>
> [1] - https://api.cirrus-ci.com/v1/artifact/task/5630092254117888/testrun/build/testrun/subscription/026_stats/log/026_stats_subscriber.log
> [2] - https://cirrus-ci.com/task/5770829742473216

In the code below, log_dest_clt is declared without initialization and is
later assigned only for specific dest values. This leaves a bug when dest
is set to CONFLICT_LOG_DEST_LOG: in that case, log_dest_clt retains an
indeterminate stack value. Because log_dest_clt is uninitialized, it may
evaluate to true depending on the garbage value present on the stack. That
can incorrectly enter the CLT insertion path and trigger the assertion
failure.

...
@@ -131,30 +170,92 @@ ReportApplyConflict(EState *estate, ResultRelInfo *relinfo, int elevel,
 ConflictType type, TupleTableSlot *searchslot,
 TupleTableSlot *remoteslot, List *conflicttuples)
 {
- Relation localrel = relinfo->ri_RelationDesc;
- StringInfoData err_detail;
+ Relation localrel = relinfo->ri_RelationDesc;
+ ConflictLogDest dest;
+ Relation conflictlogrel;
+ bool log_dest_clt;
+ bool log_dest_logfile;
...
...
- pgstat_report_subscription_conflict(MySubscription->oid, type);
+ if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL)
+     log_dest_clt = true;
+ if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL)
+     log_dest_logfile = true;
- ereport(elevel,
-     errcode_apply_conflict(type),
-     errmsg("conflict detected on relation \"%s.%s\": conflict=%s",
-         get_namespace_name(RelationGetNamespace(localrel)),
-         RelationGetRelationName(localrel),
-         ConflictTypeNames[type]),
-     errdetail_internal("%s", err_detail.data));
+ /* Insert to table if requested. */
+ if (log_dest_clt)
+ {
+     Assert(conflictlogrel != NULL);
...

The attached v31 version fixes this issue by initializing the variable.
It is also rebased, and it includes the rebased version of the 'Preserve
conflict log destination and subscription OID for subscriptions' patch,
which is present in the 0005 patch.

Regards,
Vignesh
-
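The failure mode described in the message above can be reduced to a small standalone C sketch. This is only an illustration under assumptions, not the patch code: LogDest, LogTargets, resolve_log_targets and check_log_targets are hypothetical names standing in for the ConflictLogDest handling in ReportApplyConflict(). The point it shows is that when a flag is only ever assigned true on some branches, it needs an explicit false starting value so that the "server log only" case cannot reach the table-insert path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's ConflictLogDest values. */
typedef enum
{
    DEST_LOG,   /* write conflicts to the server log only */
    DEST_TABLE, /* write conflicts to the conflict log table only */
    DEST_ALL    /* write conflicts to both */
} LogDest;

/* The two flags that were left uninitialized in the reported code. */
typedef struct
{
    bool to_table;
    bool to_logfile;
} LogTargets;

/* Map a destination to the pair of flags, starting from a defined state. */
LogTargets
resolve_log_targets(LogDest dest)
{
    /* Both flags start as false, so no branch leaves them indeterminate. */
    LogTargets targets = {false, false};

    if (dest == DEST_TABLE || dest == DEST_ALL)
        targets.to_table = true;
    if (dest == DEST_LOG || dest == DEST_ALL)
        targets.to_logfile = true;

    return targets;
}

/* Exercise every destination, including the one that exposed the bug. */
void
check_log_targets(void)
{
    assert(!resolve_log_targets(DEST_LOG).to_table);
    assert(resolve_log_targets(DEST_LOG).to_logfile);
    assert(resolve_log_targets(DEST_TABLE).to_table);
    assert(!resolve_log_targets(DEST_TABLE).to_logfile);
    assert(resolve_log_targets(DEST_ALL).to_table);
    assert(resolve_log_targets(DEST_ALL).to_logfile);
}
```

Calling check_log_targets() from a test program runs all six checks. Deriving both flags in one place from a defined starting state is a design choice of this sketch; the thread itself only states that v31 initializes the variable.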
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-07T11:53:58Z
> The attached v31 version has the changes to fix this issue by
> initializing the variable.
> This also has the rebased version along with the rebased version of
> the 'Preserve conflict log destination and subscription OID for
> subscriptions' patch which is present in the 0005 patch.

Thanks for the patches. Please find a few comments on the patches 002 to 004:

1) I noticed that if a non-superuser creates the subscription, but a
superuser later runs:
ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
then the conflict table ends up being owned by the superuser instead of
the subscription owner. Although the apply worker would be able to insert
into the CLT, the subscription owner cannot access its associated
conflict log table.

I think this happens because the heap_create_with_catalog() call uses
GetUserId(). That is fine during CREATE SUBSCRIPTION, but during ALTER
SUBSCRIPTION it causes the table to be created under the ALTER command
executor’s ownership instead of the subscription owner's.

Since only the subscription owner or a superuser can run ALTER
SUBSCRIPTION, should we always create the table with the subscription
owner as the owner?

2) In GetConflictLogDestAndTable():
+ conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
+ if (conflictlogrel == NULL)
+     elog(ERROR, "could not open conflict log table (OID %u)",
+          conflictlogrelid);
+
+ return conflictlogrel;

I think the "if (conflictlogrel == NULL)" check is unreachable. The
table_open()->relation_open() will error out if it fails to open the
relation.

3) Minor typo in create_conflict_log_table() comments:
+ /*
+ * Check for an existing table with the sname name in the pg_conflict namespace.
+ * A collision should not occur under normal operation, but we must handle cases
+ * where a table has been created manually.
+ */
==> double space in "A collision should not"

4) The document patch-0004 is still referring to the old name
"pg_conflict_<subid>"; it should be "pg_conflict_log_<subid>".
-- Thanks, Nisha
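
Comment 2 above can be illustrated with a standalone C sketch. It is a simplified model, not PostgreSQL code: Rel, open_rel_or_exit, get_conflict_log_rel and check_conflict_log_rel are hypothetical names, and exit() stands in for the way relation_open() raises an error rather than returning. When the open routine either returns a valid pointer or does not return at all, a NULL check in the caller can never fire.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-in for an opened relation. */
typedef struct
{
    unsigned int oid;
} Rel;

/* The only relation this model knows about. */
static Rel conflict_log_rel = {16392};

/*
 * Models an open routine that reports failure itself: it either returns a
 * valid pointer or terminates, so it never hands NULL back to the caller.
 */
Rel *
open_rel_or_exit(unsigned int oid)
{
    if (oid != conflict_log_rel.oid)
    {
        fprintf(stderr, "could not open relation with OID %u\n", oid);
        exit(EXIT_FAILURE);
    }
    return &conflict_log_rel;
}

/* Caller without the dead "if (rel == NULL)" branch. */
Rel *
get_conflict_log_rel(unsigned int oid)
{
    return open_rel_or_exit(oid);
}

/* A successful open always yields a usable pointer. */
void
check_conflict_log_rel(void)
{
    Rel *rel = get_conflict_log_rel(16392);

    assert(rel != NULL);
    assert(rel->oid == 16392);
}
```

The assertions in check_conflict_log_rel() only run on the success path; the failure path ends the program inside open_rel_or_exit(), which is why the caller has nothing left to check.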
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-08T02:58:17Z
On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> > The attached v31 version has the changes to fix this issue by
> > initializing the variable.
> > This also has the rebased version along with the rebased version of
> > the 'Preserve conflict log destination and subscription OID for
> > subscriptions' patch which is present in the 0005 patch.
>
> Thanks for the patches, please find a few comments on the patches 002 to 004:
>
> 1) I noticed that if a non-superuser creates the subscription, but a
> superuser later runs:
> ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> then the conflict table ends up being owned by the superuser instead
> of the subscription owner. Though, apply_worker would be able to
> insert into the CLT, but the subscription owner cannot access its
> associated conflict log table,
>
> I think this happens because the heap_create_with_catalog() call uses
> GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> SUBSCRIPTION, it causes the table to be created under the ALTER
> command executor’s ownership instead of the subscription owner.
>
> Since only the subscription owner or a superuser can run ALTER
> SUBSCRIPTION, should we always create the table with the subscription
> owner as the owner?

Yeah that makes sense.

> 2) In GetConflictLogDestAndTable():
> + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> + if (conflictlogrel == NULL)
> +     elog(ERROR, "could not open conflict log table (OID %u)",
> +          conflictlogrelid);
> +
> + return conflictlogrel;
>
> I think the "if (conflictlogrel == NULL)" check is unreachable. The
> table_open()->relation_open() will error-out if it fails to open the
> relation.

Yeah, that's a valid point.

> 3) Minor typo in create_conflict_log_table() comments:
> + /*
> + * Check for an existing table with the sname name in the pg_conflict namespace.
> + * A collision should not occur under normal operation, but we must handle cases
> + * where a table has been created manually.
> + */
> ==> double space in "A collision should not"
>
> 4) The document patch-0004 is still referring to the old name
> "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".

I will fix these in next version.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-08T12:09:53Z
On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote:
> >
> > > The attached v31 version has the changes to fix this issue by
> > > initializing the variable.
> > > This also has the rebased version along with the rebased version of
> > > the 'Preserve conflict log destination and subscription OID for
> > > subscriptions' patch which is present in the 0005 patch.
> >
> > Thanks for the patches, please find a few comments on the patches 002 to 004:
> >
> > 1) I noticed that if a non-superuser creates the subscription, but a
> > superuser later runs:
> > ALTER SUBSCRIPTION ... SET (conflict_log_table = all)
> > then the conflict table ends up being owned by the superuser instead
> > of the subscription owner. Though, apply_worker would be able to
> > insert into the CLT, but the subscription owner cannot access its
> > associated conflict log table,
> >
> > I think this happens because the heap_create_with_catalog() call uses
> > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER
> > SUBSCRIPTION, it causes the table to be created under the ALTER
> > command executor’s ownership instead of the subscription owner.
> >
> > Since only the subscription owner or a superuser can run ALTER
> > SUBSCRIPTION, should we always create the table with the subscription
> > owner as the owner?
>
> Yeah that makes sense.
>
> > 2) In GetConflictLogDestAndTable():
> > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock);
> > + if (conflictlogrel == NULL)
> > +     elog(ERROR, "could not open conflict log table (OID %u)",
> > +          conflictlogrelid);
> > +
> > + return conflictlogrel;
> >
> > I think the "if (conflictlogrel == NULL)" check is unreachable. The
> > table_open()->relation_open() will error-out if it fails to open the
> > relation.
>
> Yeah, that's a valid point.
>
> > 3) Minor typo in create_conflict_log_table() comments:
> > + /*
> > + * Check for an existing table with the sname name in the pg_conflict namespace.
> > + * A collision should not occur under normal operation, but we must handle cases
> > + * where a table has been created manually.
> > + */
> > ==> double space in "A collision should not"
> >
> > 4) The document patch-0004 is still referring to the old name
> > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
>
> I will fix these in next version.
>

This fixes all 4 comments Nisha reported. And 0002 is an add-on patch to
allow ownership transfer. I haven't yet changed the CLT display with
\dRs+ reported by Shveta. I have a work-in-progress patch, but I couldn't
get it to work. I will try to debug that tomorrow or next week, whenever
I get time.

Open Items:
- Add comments explaining the reasoning for the ownership change
- Change CLT display
- Test cases for ownership change, truncation, deletion, and select
  from a non-superuser owner of subscriber.

@vignesh C Your patch needs to be rebased.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-11T06:21:19Z
On Fri, May 8, 2026 at 5:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > The attached v31 version has the changes to fix this issue by > > > > initializing the variable. > > > > This also has the rebased version along with the rebased version of > > > > the 'Preserve conflict log destination and subscription OID for > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > superuser later runs: > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > then the conflict table ends up being owned by the superuser instead > > > of the subscription owner. Though, apply_worker would be able to > > > insert into the CLT, but the subscription owner cannot access its > > > associated conflict log table, > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > command executor’s ownership instead of the subscription owner. > > > > > > Since only the subscription owner or a superuser can run ALTER > > > SUBSCRIPTION, should we always create the table with the subscription > > > owner as the owner? > > > > Yeah that makes sense. > > > > > 2) In GetConflictLogDestAndTable(): > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > + if (conflictlogrel == NULL) > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > + conflictlogrelid); > > > + > > > + return conflictlogrel; > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. 
The
> > > table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict namespace.
> > > + * A collision should not occur under normal operation, but we must handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported. And 0002 is an add-on patch
> to allow ownership transfer. I haven't yet changed the clt display
> with \dRs+ reported by shveta. I have a work-in-progress patch, but
> I couldn't get it to work. I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> - Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.
>
> @vignesh C Your patch needs to be rebased.
>

Few comments on 001:

1)
+ /*
+ * Check for an existing table with the sname name in the pg_conflict namespace.
+ * A collision should not occur under normal operation, but we must handle cases
+ * where a table has been created manually.
+ */

We can extend the comment to mention 'allow_system_table_mods'; otherwise
it may be difficult to understand how a table could be created in
pg_conflict.

Suggestion:
...has been created manually when allow_system_table_mods is ON.

2)
+ /* Create conflict log table. */
+ relid = heap_create_with_catalog(relname,
+                                  PG_CONFLICT_NAMESPACE,

After this call, it would be good to have a sanity check on relid before
we start using it:
Assert(relid != InvalidOid);

3)
Currently the structure of the CLT is:

+const ConflictLogColumnDef ConflictLogSchema[] = {
+ { .attname = "relid", .atttypid = OIDOID },
+ { .attname = "schemaname", .atttypid = TEXTOID },
+ { .attname = "relname", .atttypid = TEXTOID },
+ { .attname = "conflict_type", .atttypid = TEXTOID },
+ { .attname = "remote_xid", .atttypid = XIDOID },
+ { .attname = "remote_commit_lsn",.atttypid = LSNOID },
+ { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID },
+ { .attname = "remote_origin", .atttypid = TEXTOID },
+ { .attname = "replica_identity", .atttypid = JSONOID },
+ { .attname = "remote_tuple", .atttypid = JSONOID },
+ { .attname = "local_conflicts", .atttypid = JSONARRAYOID }
+};

So if a user has to delete a conflict from the CLT after resolving it,
what is the user-friendly way to do it? IMO, it will be cumbersome (and
perhaps error-prone) to write a query with remote_commit_lsn,
remote_commit_ts, remote_xid etc. in the WHERE clause.

Do you (or others) think we should add a log_id column (perhaps a bigint
GENERATED ALWAYS AS IDENTITY)? This provides a simple, unique identifier
so the user can easily target a single row (WHERE log_id = 105) or purge
a batch of old conflicts (WHERE log_id < 1000).

4)
When querying pg_subscription, I noticed that the two CLT-related fields
(subconflictlogrelid and subconflictlogdest) are positioned far apart,
making them difficult to track and relate. Do you think we should have
both next to each other? If we do that, it will mean 'subconflictlogdest'
coming before 'subconninfo', but it should be fine (IMO), as it will be
right next to 'subconflictlogrelid'.

postgres=# select * from pg_subscription;
  oid  | subdbid | subskiplsn | subname | subowner | subenabled | subbinary | substream | subtwophasestate | subdisableonerr | subpasswordrequired | subrunasowner | subfailover | subretaindeadtuples | submaxretention | subretentionactive | subserver | subconflictlogrelid | subconninfo | subslotname | subsynccommit | subwalrcvtimeout | subpublications | subconflictlogdest | suborigin
-------+---------+------------+---------+----------+------------+-----------+-----------+------------------+-----------------+---------------------+---------------+-------------+---------------------+-----------------+--------------------+-----------+---------------------+-------------+-------------+---------------+------------------+-----------------+--------------------+-----------
 16387 | 5 | 0/00000000 | sub1 | 10 | t | f | p | d | f | t | f | f | f | 0 | f | 0 | 16388 | dbname=postgres host=localhost user=shveta port=5433 | sub1 | off | -1 | {pub1} | all

5)
+-- verify subconflictlogdest is 'log' and relid is 0 (InvalidOid) for default case

We can mention 'subconflictlogrelid' instead of 'relid'.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-11T09:07:46Z
Please see the test below:

CREATE USER user1 LOGIN;
ALTER subscription sub1 owner to user1;

--Now, as expected, user1 is able to access, delete or truncate:
postgres=> select count(*) from pg_conflict.pg_conflict_log_16387;
0

postgres=> delete from pg_conflict.pg_conflict_log_16387;
DELETE 0

--When user1 tries to do an insert, it gets an error:
postgres=> insert into pg_conflict.pg_conflict_log_16387 values (0);
ERROR: permission denied for table pg_conflict_log_16387

While the superuser gets:
postgres=# insert into pg_conflict.pg_conflict_log_16387 values (0);
ERROR: cannot modify or insert data into conflict log table "pg_conflict_log_16387"
DETAIL: Conflict log tables are system-managed and only support cleanup via DELETE or TRUNCATE.

-----

The error for user1 seems less intuitive, as user1 owns
pg_conflict_log_16387. Shouldn't a non-superuser who owns the CLT see the
same error as the superuser gets? I think the error is due to the recent
changes made in pg_class_aclmask_ext(). What do others think here?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-11T09:29:22Z
On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > The attached v31 version has the changes to fix this issue by > > > > initializing the variable. > > > > This also has the rebased version along with the rebased version of > > > > the 'Preserve conflict log destination and subscription OID for > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > superuser later runs: > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > then the conflict table ends up being owned by the superuser instead > > > of the subscription owner. Though, apply_worker would be able to > > > insert into the CLT, but the subscription owner cannot access its > > > associated conflict log table, > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > command executor’s ownership instead of the subscription owner. > > > > > > Since only the subscription owner or a superuser can run ALTER > > > SUBSCRIPTION, should we always create the table with the subscription > > > owner as the owner? > > > > Yeah that makes sense. > > > > > 2) In GetConflictLogDestAndTable(): > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > + if (conflictlogrel == NULL) > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > + conflictlogrelid); > > > + > > > + return conflictlogrel; > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. 
> > > The table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict
> > > namespace.
> > > + * A collision should not occur under normal operation, but we must
> > > handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported. And 0002 is an add-on patch
> to allow ownership transfer. I haven't yet changed the clt display
> with \dRs+ reported by shveta. I have a work-in-progress patch, but
> I couldn't get it to work. I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> - Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.
>
> @vignesh C Your patch needs to be rebased.
>

Hi Dilip,

I started reviewing the patches. Here are minor comments for 0001 patch:

1. If allow_system_table_mods=on, we can add/drop columns of conflict log tables, but the same is prohibited for pg_toast and other catalog tables. For the other system tables we get the following errors:

postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
ERROR: ALTER action DROP COLUMN cannot be performed on relation "pg_toast_16413"
DETAIL: This operation is not supported for TOAST tables.
postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
ERROR: cannot drop column pubname of table pg_publication because it is required by the database system
postgres=# ALTER TABLE pg_description DROP COLUMN description;
ERROR: cannot drop column description of table pg_description because it is required by the database system

postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
ALTER TABLE

Should we prohibit it for conflict log tables as well?

2. Should we also have a 'dropped conflict log table' NOTICE when the subscription is dropped?

postgres=# CREATE SUBSCRIPTION sub1 connection 'dbname=postgres host=localhost port=5432' publication pub1 WITH (conflict_log_destination = 'TABLE');
NOTICE: created conflict log table "pg_conflict.pg_conflict_log_16394" for subscription "sub1"
NOTICE: created replication slot "sub1" on publisher
CREATE SUBSCRIPTION
postgres=# drop subscription sub1;
NOTICE: dropped replication slot "sub1" on publisher
DROP SUBSCRIPTION

3. Typo:
+ /*
+ * Check for an existing table with the sname name in the pg_conflict namespace.

sname -> same

Thanks,
Shlok Kyal
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-11T10:43:52Z
On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > The attached v31 version has the changes to fix this issue by > > > > initializing the variable. > > > > This also has the rebased version along with the rebased version of > > > > the 'Preserve conflict log destination and subscription OID for > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > superuser later runs: > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > then the conflict table ends up being owned by the superuser instead > > > of the subscription owner. Though, apply_worker would be able to > > > insert into the CLT, but the subscription owner cannot access its > > > associated conflict log table, > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > command executor’s ownership instead of the subscription owner. > > > > > > Since only the subscription owner or a superuser can run ALTER > > > SUBSCRIPTION, should we always create the table with the subscription > > > owner as the owner? > > > > Yeah that makes sense. > > > > > 2) In GetConflictLogDestAndTable(): > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > + if (conflictlogrel == NULL) > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > + conflictlogrelid); > > > + > > > + return conflictlogrel; > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. 
> > > The table_open()->relation_open() will error-out if it fails to open the
> > > relation.
> >
> > Yeah, that's a valid point.
> >
> > > 3) Minor typo in create_conflict_log_table() comments:
> > > + /*
> > > + * Check for an existing table with the sname name in the pg_conflict
> > > namespace.
> > > + * A collision should not occur under normal operation, but we must
> > > handle cases
> > > + * where a table has been created manually.
> > > + */
> > > ==> double space in "A collision should not"
> > >
> > > 4) The document patch-0004 is still referring to the old name
> > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>".
> >
> > I will fix these in next version.
> >
>
> This fixes all 4 comments Nisha reported. And 0002 is an add-on patch
> to allow ownership transfer. I haven't yet changed the clt display
> with \dRs+ reported by shveta. I have a work-in-progress patch, but
> I couldn't get it to work. I will try to debug that tomorrow or next
> week whenever I get time.
>
> Open Items:
> - Add comments explaining the reasoning for the ownership change
> - change clt display
> - Test cases for ownership change, truncation, deletion, and select
> from a non-superuser owner of subscriber.

The attached patch addresses the remaining open items and is provided separately as patch 0005. @Dilip Kumar, if the changes look good to you, please merge them into the corresponding patch.

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-12T06:00:54Z
On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote: > > Few comments on 001: > 3) > Currently the structure of CLT is: > > +const ConflictLogColumnDef ConflictLogSchema[] = { > + { .attname = "relid", .atttypid = OIDOID }, > + { .attname = "schemaname", .atttypid = TEXTOID }, > + { .attname = "relname", .atttypid = TEXTOID }, > + { .attname = "conflict_type", .atttypid = TEXTOID }, > + { .attname = "remote_xid", .atttypid = XIDOID }, > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > + { .attname = "remote_origin", .atttypid = TEXTOID }, > + { .attname = "replica_identity", .atttypid = JSONOID }, > + { .attname = "remote_tuple", .atttypid = JSONOID }, > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > +}; > > So if user has to delete a conflict from CLT after resolving it, then > what is the user-friendly way to do it? IMO, it will be cumbersome > (and perhaps error-prone) to write a query with remote_commit_lsn, > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others) > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS > AS IDENTITY). This provides a simple, unique identifier so the user > can easily target a single row (WHERE log_id = 105) or purge a batch > of old conflicts (WHERE log_id < 1000). I agree with this. I could think of a few other possible approaches as well. 
The following options seem possible to make row identification/deletion easier:

a) Use existing remote_commit_ts
ex:
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts = '2026-05-12 10:25:46.483899+05:30';
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts < now() - interval '100 minutes';

b) Use existing system column ctid
ex:
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';

c) Add a dedicated identifier conflict_id column as Shveta said
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;

d) Add a local conflict_logged_at timestamp
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at = '2026-05-12 10:25:46.483899+05:30';
DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at < now() - interval '100 minutes';

I'm not sure which approach would be best here. Thoughts?

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-12T06:29:58Z
On Tue, May 12, 2026 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote: > > > > Few comments on 001: > > 3) > > Currently the structure of CLT is: > > > > +const ConflictLogColumnDef ConflictLogSchema[] = { > > + { .attname = "relid", .atttypid = OIDOID }, > > + { .attname = "schemaname", .atttypid = TEXTOID }, > > + { .attname = "relname", .atttypid = TEXTOID }, > > + { .attname = "conflict_type", .atttypid = TEXTOID }, > > + { .attname = "remote_xid", .atttypid = XIDOID }, > > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > > + { .attname = "remote_origin", .atttypid = TEXTOID }, > > + { .attname = "replica_identity", .atttypid = JSONOID }, > > + { .attname = "remote_tuple", .atttypid = JSONOID }, > > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > > +}; > > > > So if user has to delete a conflict from CLT after resolving it, then > > what is the user-friendly way to do it? IMO, it will be cumbersome > > (and perhaps error-prone) to write a query with remote_commit_lsn, > > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others) > > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS > > AS IDENTITY). This provides a simple, unique identifier so the user > > can easily target a single row (WHERE log_id = 105) or purge a batch > > of old conflicts (WHERE log_id < 1000). > > I agree with this. I could think of a few other possible approaches as well. 
> The following options seem possible to make row identification/deletion easier:
> a) Use existing remote_commit_ts
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
> '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
> now() - interval '100 minutes';
> b) Use existing system column ctid
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
> c) Add a dedicated identifier conflict_id column as Shveta said
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
> d) Add a local conflict_logged_at timestamp
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> = '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> < now() - interval '100 minutes';
>

I like c and d. IMO, approach 'a' is cumbersome to write queries with. Approach 'b' may not be known to all.

I had earlier suggested a timestamp column (pt 3 at [1]) to record conflict-occurrence time (mainly a 'conflict_logged_at' column) in the CLT, but the idea was kept on hold awaiting more feedback. Now we can revisit this.

I feel 'conflict_logged_at' could be more beneficial because, going forward (based on feedback), we may range-partition this table on that field, which may form the basis of historical data purging. I also suggested this in [2] (see 'That said, irrespective of what we decide'). Such a field could be the basis of a purging mechanism.

[1]: https://www.postgresql.org/message-id/CAJpy0uCMDqcWGepcTwFPH%2BhTDjD8b72KnbL-S%2Bd-qd7ChomOyQ%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAJpy0uAfRZa4axLV_e4gvVdmunb8BOVx%2BYr%3DXecECAVD0KnD%3DA%40mail.gmail.com

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-12T06:52:50Z
On Mon, May 11, 2026 at 2:59 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > > The attached v31 version has the changes to fix this issue by > > > > > initializing the variable. > > > > > This also has the rebased version along with the rebased version of > > > > > the 'Preserve conflict log destination and subscription OID for > > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > > superuser later runs: > > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > > then the conflict table ends up being owned by the superuser instead > > > > of the subscription owner. Though, apply_worker would be able to > > > > insert into the CLT, but the subscription owner cannot access its > > > > associated conflict log table, > > > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > > command executor’s ownership instead of the subscription owner. > > > > > > > > Since only the subscription owner or a superuser can run ALTER > > > > SUBSCRIPTION, should we always create the table with the subscription > > > > owner as the owner? > > > > > > Yeah that makes sense. 
> > > > > > > 2) In GetConflictLogDestAndTable(): > > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > > + if (conflictlogrel == NULL) > > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > > + conflictlogrelid); > > > > + > > > > + return conflictlogrel; > > > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The > > > > table_open()->relation_open() will error-out if it fails to open the > > > > relation. > > > > > > Yeah, that's a valid point. > > > > > > > 3) Minor typo in create_conflict_log_table() comments: > > > > + /* > > > > + * Check for an existing table with the sname name in the pg_conflict > > > > namespace. > > > > + * A collision should not occur under normal operation, but we must > > > > handle cases > > > > + * where a table has been created manually. > > > > + */ > > > > ==> double space in "A collision should not" > > > > > > > > 4) The document patch-0004 is still referring to the old name > > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>". > > > > > > I will fix these in next version. > > > > > > > This fixes all 4 comments Nisha reported. And 0002 is an add-on patch > > to allow ownership transfer. I haven't yet changed the clt display > > witjh \dRs+ reported by shveta. I have a work-in-progress patch, but > > I couldn't get it to work. I will try to debug that tomorrow or next > > week whenever I get time. > > > > Open Items: > > - Add comments explaining the reasoning for the ownership change > > - change clt display > > - Test cases for ownership change, truncation, deletion, and select > > from a non-superuser owner of subscriber. > > > > @vignesh C Your patch needs to be rebased. > > > Hi Dilip, > > I started reviewing the patches. > Here are minor comments for 0001 patch: > > 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables > But the same for pg_toast or other catalog tables are prohibited. 
> Also for other system tables we are getting following error.
>
> postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq;
> ERROR: ALTER action DROP COLUMN cannot be performed on relation
> "pg_toast_16413"
>
> DETAIL: This operation is not supported for TOAST tables.
> postgres=# ALTER TABLE pg_publication DROP COLUMN pubname;
> ERROR: cannot drop column pubname of table pg_publication because it
> is required by the database system
> postgres=# ALTER TABLE pg_description DROP COLUMN description;
> ERROR: cannot drop column description of table pg_description because
> it is required by the database system
>
> postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname;
> ALTER TABLE
>
> Should we prohibit it for conflict log tables as well?
>

Good catch Shlok, yes it should be restricted IMO.

Another thing I found was that we can attach a CLT as a partition of another table, and then add it indirectly to a publication.

Test:
-------------------------
CREATE TABLE public.conflict_parent (LIKE pg_conflict.pg_conflict_log_16387 INCLUDING ALL) PARTITION BY LIST (conflict_type);
ALTER TABLE public.conflict_parent ATTACH PARTITION pg_conflict.pg_conflict_log_16387 FOR VALUES IN ('insert_exists');
CREATE publication pub1 FOR TABLE public.conflict_parent WITH(PUBLISH_VIA_PARTITION_ROOT =false);

postgres=# select * from pg_publication_tables;
pubname | schemaname | tablename
---------+-------------+-----------------------+------------
pub1 | pg_conflict | pg_conflict_log_16387
---------------------------

For a toast table, by contrast, the 'LIKE' operation fails:

postgres=# CREATE TABLE public.fake_toast_parent ( LIKE pg_toast.pg_toast_16459 INCLUDING ALL) PARTITION BY LIST (chunk_seq);
ERROR: relation "pg_toast_16459" is invalid in LIKE clause
LINE 1: CREATE TABLE public.fake_toast_parent ( LIKE pg_toast.pg_toa...
                                                     ^
DETAIL: This operation is not supported for TOAST tables.

~~

Trying it differently, attaching it as a partition also fails.

postgres=# CREATE TABLE public.fake_toast_parent ( chunk_id oid, chunk_seq int4, chunk_data bytea) PARTITION BY LIST (chunk_seq);
CREATE TABLE
postgres=# ALTER TABLE public.fake_toast_parent ATTACH PARTITION pg_toast.pg_toast_16459 FOR VALUES IN (1);
ERROR: ALTER action ATTACH PARTITION cannot be performed on relation "pg_toast_16459"
DETAIL: This operation is not supported for TOAST tables.

~~

I have tried the above tests with allow_system_table_mods=on.

So a toast table does not support 'LIKE'. It also does not support being attached as a partition to another table. IMO, we need the same restrictions for the CLT. Thoughts?

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-12T09:19:15Z
On Tue, May 12, 2026 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote: > > > > Few comments on 001: > > 3) > > Currently the structure of CLT is: > > > > +const ConflictLogColumnDef ConflictLogSchema[] = { > > + { .attname = "relid", .atttypid = OIDOID }, > > + { .attname = "schemaname", .atttypid = TEXTOID }, > > + { .attname = "relname", .atttypid = TEXTOID }, > > + { .attname = "conflict_type", .atttypid = TEXTOID }, > > + { .attname = "remote_xid", .atttypid = XIDOID }, > > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > > + { .attname = "remote_origin", .atttypid = TEXTOID }, > > + { .attname = "replica_identity", .atttypid = JSONOID }, > > + { .attname = "remote_tuple", .atttypid = JSONOID }, > > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > > +}; > > > > So if user has to delete a conflict from CLT after resolving it, then > > what is the user-friendly way to do it? IMO, it will be cumbersome > > (and perhaps error-prone) to write a query with remote_commit_lsn, > > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others) > > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS > > AS IDENTITY). This provides a simple, unique identifier so the user > > can easily target a single row (WHERE log_id = 105) or purge a batch > > of old conflicts (WHERE log_id < 1000). > > I agree with this. I could think of a few other possible approaches as well. 
> The following options seem possible to make row identification/deletion easier:
> a) Use existing remote_commit_ts
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts =
> '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE remote_commit_ts <
> now() - interval '100 minutes';
> b) Use existing system column ctid
> ex:
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE ctid = '(0,1)';
> c) Add a dedicated identifier conflict_id column as Shveta said
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id = 42;
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_id < 100;
> d) Add a local conflict_logged_at timestamp
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> = '2026-05-12 10:25:46.483899+05:30';
> DELETE FROM pg_conflict.pg_conflict_log_16400 WHERE conflict_logged_at
> < now() - interval '100 minutes';
>

We can use approach (c), as that sounds easier for manual conflict resolution. Though, I feel that in practice different fields could be used for removal; say, when transactions are interleaved, one may prefer to remove rows based on remote_xid or remote_lsn.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-12T09:21:15Z
On Tue, May 12, 2026 at 12:00 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I had earlier suggested a timestamp column (pt 3 at [1]) to record
> conflict-occurence time (mainly 'conflict_logged_at' column) in CLT
> but the idea was kept on hold awaiting more feedback. Now we can
> revisit this.
>
> I feel 'conflict_logged_at' could be more beneficial because, going
> forward (based on feedback), we may range-partition this table on that
> field which may form as basis of historical data purge. I also
> suggested this in [2] (see 'That said, irrespective of what we
> decide') . Such a field could be basis of purging mechanism.
>

Fair enough. We can extend the table with this field after more discussion, so it will be better to pick up this discussion once the base feature is committed.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-12T09:26:25Z
On Tue, May 12, 2026 at 2:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 12, 2026 at 12:00 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > I had earlier suggested a timestamp column (pt 3 at [1]) to record
> > conflict-occurence time (mainly 'conflict_logged_at' column) in CLT
> > but the idea was kept on hold awaiting more feedback. Now we can
> > revisit this.
> >
> > I feel 'conflict_logged_at' could be more beneficial because, going
> > forward (based on feedback), we may range-partition this table on that
> > field which may form as basis of historical data purge. I also
> > suggested this in [2] (see 'That said, irrespective of what we
> > decide') . Such a field could be basis of purging mechanism.
> >
>
> Fair enough. We can extend the table with this field after more
> discussion, so it will be better to pick up this discussion once the
> base feature is committed.
>

Okay. Works for me.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-13T06:07:19Z
On Fri, May 1, 2026 at 11:46 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 30, 2026 at 10:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Wed, Apr 29, 2026 at 12:34 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, Apr 29, 2026 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Tue, Apr 28, 2026 at 7:53 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > 2. > > > > > > +typedef enum ConflictLogDest > > > > > > +{ > > > > > > + /* Log conflicts to the server logs */ > > > > > > + CONFLICT_LOG_DEST_LOG = 1 << 0, /* 0x01 */ > > > > > > + > > > > > > + /* Log conflicts to an internally managed conflict log table */ > > > > > > + CONFLICT_LOG_DEST_TABLE = 1 << 1, /* 0x02 */ > > > > > > + > > > > > > + /* Convenience bitmask for all supported destinations */ > > > > > > + CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE) > > > > > > +} ConflictLogDest; > > > > > > + > > > > > > +/* > > > > > > + * Array mapping for converting internal enum to string. > > > > > > + */ > > > > > > +static const char *const ConflictLogDestNames[] = { > > > > > > + [CONFLICT_LOG_DEST_LOG] = "log", > > > > > > + [CONFLICT_LOG_DEST_TABLE] = "table", > > > > > > + [CONFLICT_LOG_DEST_ALL] = "all" > > > > > > +}; > > > > > > > > > > > > Defining an array this way could be an Array size issue. Actually the > > > > > > array has just three elements so the last element should be at > > > > > > ConflictLogDestNames[2] but if we go by the above definition, it will > > > > > > be ConflictLogDestNames[3]. Can we define by referring the following > > > > > > existing way: > > > > > > > > I was analyzing this because I remember we were initially using the > > > > format you suggested and switched to the bit format to enable direct > > > > bitwise operations elsewhere. I think Peter suggested that [1], and > > > > the argument was that the bitwise operation is easy if we represent > > > > them as a bit. 
Also, since we would not have too many options, the > > > > array size shouldn't be an issue. But I understand your point: adding > > > > more elements will cause the array size to grow very fast as this is > > > > using sparse array. Let's see what others think about this, and then > > > > we can decide whether to change it back? > > > > > > > > > > The benefit of the current approach is that checking whether the > > > destination is TABLE becomes straightforward: > > > > > > IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE) > > > > > > if we go by regular enum values (simialr to XLogSource), then it will be: > > > > > > if (opts.logdest == CONFLICT_LOG_DEST_TABLE || > > > opts.logdest == CONFLICT_LOG_DEST_ALL) > > > > Right > > > > > For ease of extending the enum and its corresponding text mappings, my > > > personal preference is still the regular (non-bitwise) enum approach. > > > > Yeah, that's my personal preference too. But Peter had strong stand > > on keeping as bitwise so that we can directly use > > IsSet(opts.conflictLogDest, CONFLICT_LOG_DEST_TABLE) operations. > > Since this array shouldn't have many options, a sparse array is not an > > issue. So lets see what @Peter Smith has to say here and then we can > > build a concensus on this. > > > > > But if we anticipate adding more destination options in the future > > > that would be covered by ALL, checking for those in code could lead to > > > growing chains of OR conditions, whereas the bitwise approach scales > > > more cleanly in that respect. So I think the choice depends on what > > > kinds of future extensions we expect. > > > > > > Do we have plans to add more options that would naturally fall under > > > ALL? Or do we instead expect additions that are mutually exclusive; > > > for example, splitting CONFLICT_LOG_DEST_LOG into something like > > > CONFLICT_LOG_DEST_JSON_LOG and CONFLICT_LOG_DEST_TEXT_LOG, which may > > > not make sense to group under ALL in the same way? 
> >
> > Currently, I haven't considered which options would naturally fall
> > under "ALL." Perhaps if we plan targets other than logs and files,
> > those might also fall under "ALL."
>
> I have fixed all the reported comments except these four.
> 1. I'm changing the ConflictLogDest enum from bitmap to integer. I can
> revert this in the next version but I want to see Peter's opinion
> first, as he suggested using a bitmap to easily apply bitwise
> operators.
>

Sorry for the delay in responding. I have been away.

Yes, I recall thinking bitmaps were a tidy way of checking if a CLT was required, just by:

IsSet(opts.conflictlogdest,CONFLICT_LOG_DEST_TABLE)

IMO, "all" is not really a discrete target value... it meant more like "a combination of all the other ones". That is why bitmaps felt like a better fit to me.

Of course, then you will have the (not very) sparse designated-initializer array of names that some people objected to:

+static const char *const ConflictLogDestNames[] = {
+ [CONFLICT_LOG_DEST_LOG] = "log",
+ [CONFLICT_LOG_DEST_TABLE] = "table",
+ [CONFLICT_LOG_DEST_ALL] = "all"
+};

TBH, I did not think the sparse array posed any real problem because even if there were 5 target values (which is way more than I could imagine it growing to) that would still only be a sparse array of 2^5 elements which seemed hardly worth worrying about.

Anyway, it is fine by me if you want to revert to a plain enum. The code of CreateSubscription/AlterSubscription becomes a bit clunkier now having to check CONFLICT_LOG_DEST_ALL, but it's OK.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
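For readers following the bitmask-versus-plain-enum trade-off discussed above, here is a minimal standalone sketch in C. It is not the patch code: the ConflictLogDest enum and the ConflictLogDestNames array are copied from the snippets quoted in this thread, while the IsSet() macro is a local stand-in, and the bitmask_* helpers and the PlainConflictLogDest names are hypothetical, introduced only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the IsSet() check quoted in the thread (assumption). */
#define IsSet(val, bits) (((val) & (bits)) == (bits))

/* Bitmask form, as quoted in the thread. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG = 1 << 0,     /* 0x01 */
    CONFLICT_LOG_DEST_TABLE = 1 << 1,   /* 0x02 */
    CONFLICT_LOG_DEST_ALL = (CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE)
} ConflictLogDest;

/*
 * Designated initializers index the array by enum value, so slot 0 is
 * left NULL and the array has four slots for three names.
 */
static const char *const ConflictLogDestNames[] = {
    [CONFLICT_LOG_DEST_LOG] = "log",
    [CONFLICT_LOG_DEST_TABLE] = "table",
    [CONFLICT_LOG_DEST_ALL] = "all"
};

#define CONFLICT_LOG_DEST_NAMES_LEN \
    (sizeof(ConflictLogDestNames) / sizeof(ConflictLogDestNames[0]))

/* Hypothetical helper: a single bit test covers both "table" and "all". */
static bool
bitmask_dest_needs_table(int dest)
{
    return IsSet(dest, CONFLICT_LOG_DEST_TABLE);
}

/* Hypothetical helper: bounds-checked name lookup in the sparse array. */
static const char *
bitmask_dest_name(int dest)
{
    if (dest <= 0 || (size_t) dest >= CONFLICT_LOG_DEST_NAMES_LEN)
        return NULL;
    return ConflictLogDestNames[dest];
}

/* Plain (non-bitwise) enum alternative; all names here are hypothetical. */
typedef enum PlainConflictLogDest
{
    PLAIN_DEST_LOG,
    PLAIN_DEST_TABLE,
    PLAIN_DEST_ALL
} PlainConflictLogDest;

/* Dense array: three names, three slots. */
static const char *const PlainConflictLogDestNames[] = {
    [PLAIN_DEST_LOG] = "log",
    [PLAIN_DEST_TABLE] = "table",
    [PLAIN_DEST_ALL] = "all"
};

/* The table check needs an explicit OR that grows with each new value. */
static bool
plain_dest_needs_table(PlainConflictLogDest dest)
{
    return dest == PLAIN_DEST_TABLE || dest == PLAIN_DEST_ALL;
}
```

With two destination bits the name array needs four slots; with five independent bits it would need 2^5 = 32, which is the size mentioned above. The plain enum keeps the array dense but moves the cost into the OR chain in plain_dest_needs_table().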
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-13T06:12:54Z
Hi Dilip/Vignesh.

Some review comments for v33-0001.

======
src/backend/catalog/aclchk.c

pg_class_aclmask_ext:

1.
  if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) &&
- IsSystemClass(table_oid, classForm) &&
- classForm->relkind != RELKIND_VIEW &&
+ IsConflictClass(classForm) &&
  !superuser_arg(roleid))
- mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
+ else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) &&
+ IsSystemClass(table_oid, classForm) &&
+ classForm->relkind != RELKIND_VIEW &&
+ !superuser_arg(roleid))
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);

The new patched code seems a bit repetitive. How about refactoring like below and putting the comments where they belong.

if (!superuser_arg(roleid))
{
    if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
    {
        if (IsSystemClass(table_oid, classForm) &&
            classForm->relkind != RELKIND_VIEW)
        {
            /*
             * Deny anyone permission to update a system catalog unless
             * pg_authid.rolsuper is set.
             *
             * As of 7.4 we have some updatable system views; those shouldn't be
             * protected in this way. Assume the view rules can take care of
             * themselves. ACL_USAGE is if we ever have system sequences.
             */
            mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
        }
        else if (IsConflictClass(classForm))
        {
            /*
             * For conflict log tables, we allow non-superusers to perform DELETE
             * and TRUNCATE for maintenance, while still restricting INSERT,
             * UPDATE, and USAGE.
             */
            mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
        }
    }
}
else
{
    /* Superusers bypass all permission-checking. */
    ReleaseSysCache(tuple);
    return mask;
}

======
src/backend/catalog/catalog.c

IsConflictClass:

2.
+/*
+ * IsConflictClass - Check if the given pg_class tuple belongs to the conflict
+ * namespace.
+ */

This function comment looks different from all the nearby ones where the function name appears on a line by itself.

======
src/backend/catalog/heap.c

heap_create:

3.
  if (!allow_system_table_mods &&
  ((IsCatalogNamespace(relnamespace) && relkind != RELKIND_INDEX) ||
- IsToastNamespace(relnamespace)) &&
+ IsToastNamespace(relnamespace) ||
+ IsConflictNamespace(relnamespace)) &&

Is this code correct? It seems like it is conveniently re-using a similar error, which is not quite appropriate.

The comment refers to creating relations in pg_catalog.
The errdetail refers to "System catalog modifications"

But, the CLT is neither in pg_catalog schema, nor is it a system catalog.

======
src/backend/catalog/namespace.c

CheckSetNamespace:

4.
- * We complain if either the old or new namespaces is a temporary schema
- * (or temporary toast schema), or if either the old or new namespaces is the
- * TOAST schema.
+ * We complain if either the old or new namespaces is a temporary schema,
+ * temporary toast schema, the TOAST schema, or the CONFLICT schema.

TOAST is uppercase because it is an acronym, but I see no reason why "CONFLICT" is uppercase. Maybe replace that with pg_conflict.

~~~

5.
+
+ /* similarly for CONFLICT schema */
+ if (nspOid == PG_CONFLICT_NAMESPACE || oldNspOid == PG_CONFLICT_NAMESPACE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot move objects into or out of CONFLICT schema")));

Ditto for the uppercase "CONFLICT" in the comment and in the errmsg. Say pg_conflict.

======
src/backend/catalog/pg_publication.c

6.
+
+ /* Can't be conflict log table */
+ if (IsConflictNamespace(RelationGetNamespace(targetrel)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg(errormsg, relname),
+ errdetail("This operation is not supported for conflict log tables.")));

I felt this code is quite similar to the "Can't be system table" check, so it might be better to move it to be adjacent to that.

======
src/backend/commands/subscriptioncmds.c

CreateSubscription:

7.
+ /* Always set the destination, default will be 'log'. */
+ values[Anum_pg_subscription_subconflictlogdest - 1] =
+ CStringGetTextDatum(ConflictLogDestNames[opts.conflictlogdest]);

None of the other values[] assignments here have comments talking about defaults etc, so why is this one different?

~~~

8.
Despite some of these just being static, I am beginning to think that the "conflict" specific CLT code might be more appropriate to be put in conflict.c, along with the CLT schema etc.

e.g. functions like:
- create_conflict_log_table_tupdesc
- create_conflict_log_table
- GetLogDestination

~~~

create_conflict_log_table:

9.
+ snprintf(relname, NAMEDATALEN, "pg_conflict_log_%u", subid);

Would it be more helpful if the generated table name describes what that %u means?

e.g. "pg_conflict_log_for_subid_%u"

~~~

10.
+ /*
+ * Check for an existing table with the sname name in the pg_conflict namespace.
+ * A collision should not occur under normal operation, but we must handle cases
+ * where a table has been created manually.
+ */
+ if (OidIsValid(get_relname_relid(relname, PG_CONFLICT_NAMESPACE)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_TABLE),
+ errmsg("conflict log table pg_conflict.\"%s\" already exists", relname),
+ errhint("A table with the same name already exists. "
+ "To proceed, drop the existing table and retry.")));
+

10a.
Typo /sname name/same name/

~

10b.
That 1st sentence of the errhint seems unnecessary because it is saying the same as the errmsg.

======
src/backend/executor/execMain.c

11.
+
+ /*
+ * Conflict log tables are managed by the system to record logical
+ * replication conflicts. We allow DELETE and TRUNCATE to permit users to
+ * manually prune these logs, but manual data insertion or modification
+ * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the
+ * system-generated logs.
+ * + * Since TRUNCATE is handled as a separate utility command, we only need + * to explicitly permit CMD_DELETE here. + */ + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && + operation != CMD_DELETE) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("cannot modify or insert data into conflict log table \"%s\"", + RelationGetRelationName(resultRel)), + errdetail("Conflict log tables are system-managed and only support cleanup via DELETE or TRUNCATE."))); It somehow feels backwards to check "operation != CMD_DELETE", with the obscure comment that TRUNCATE is handled elsewhere. How about just check if "(operation == CMD_INSERT || operation == CMD_UPDATE || operation == CMD_MERGE)". ~~~ 12. + + /* + * Conflict log tables are managed by the system to record logical + * replication conflicts. We do not allow locking rows in CONFLICT + * relations. + */ + if (IsConflictNamespace(RelationGetNamespace(rel))) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("cannot lock rows in conflict log table \"%s\"", + RelationGetRelationName(rel)))); I was not sure what was meant by "CONFLICT relations.". Does it mean "... relations in the pg_conflict schema.". Anyway, is there any value to that 2nd sentence because it is much the same text as the errmsg. ====== src/backend/replication/logical/conflict.c 13. 
+const char *const ConflictLogDestNames[] = { + [CONFLICT_LOG_DEST_LOG] = "log", + [CONFLICT_LOG_DEST_TABLE] = "table", + [CONFLICT_LOG_DEST_ALL] = "all" +}; + +const ConflictLogColumnDef ConflictLogSchema[] = { + { .attname = "relid", .atttypid = OIDOID }, + { .attname = "schemaname", .atttypid = TEXTOID }, + { .attname = "relname", .atttypid = TEXTOID }, + { .attname = "conflict_type", .atttypid = TEXTOID }, + { .attname = "remote_xid", .atttypid = XIDOID }, + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, + { .attname = "remote_origin", .atttypid = TEXTOID }, + { .attname = "replica_identity", .atttypid = JSONOID }, + { .attname = "remote_tuple", .atttypid = JSONOID }, + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } +}; 13a. Both these arrays could benefit from some comments. ~ 13b. In the ConflictLogSchema, would it be better to keep all those "remote_" columns grouped together, instead of being broken by "replica_identity"? ~ 13c. TBH, I preferred the code how it used to be -- where all the CLT constants and structs and enums and schemas were kept together. Now they are split across conflict.h and conflict.c, making it harder to read as well as introducing the need for static asserts that were not needed before. (Keeping everything together might become easier if the CLT stuff is all colocated in conflict.c per comment #8) ====== Kind Regards, Peter Smith. Fujitsu Australia -
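The refactoring proposed in review comment 1 above (one superuser test, one mask test, then a branch per relation class) can be modelled outside the server. This is only a sketch under stated assumptions: the `SK_` bit values, the type name, and `restrict_mask_sketch` are hypothetical stand-ins, not PostgreSQL's `AclMode` definitions or the patch's code, and it covers only the step that strips bits from the requested mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the server's ACL bits; values are illustrative. */
typedef uint32_t AclModeSketch;

#define SK_ACL_INSERT   (1u << 0)
#define SK_ACL_UPDATE   (1u << 1)
#define SK_ACL_DELETE   (1u << 2)
#define SK_ACL_TRUNCATE (1u << 3)
#define SK_ACL_USAGE    (1u << 4)
#define SK_ACL_SELECT   (1u << 5)

#define SK_ACL_WRITE_BITS \
    (SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_DELETE | SK_ACL_TRUNCATE | SK_ACL_USAGE)

/*
 * Return the requested mask with the bits a role may never be granted
 * removed. The booleans stand in for superuser_arg(), IsSystemClass(),
 * the relkind test, and IsConflictClass() from the quoted diff.
 */
static AclModeSketch
restrict_mask_sketch(AclModeSketch mask, bool is_superuser,
                     bool is_system_class, bool is_view,
                     bool is_conflict_class)
{
    /* Superusers bypass the restrictions entirely. */
    if (is_superuser)
        return mask;

    /* Nothing to strip unless a write-type privilege was requested. */
    if (!(mask & SK_ACL_WRITE_BITS))
        return mask;

    if (is_system_class && !is_view)
    {
        /* System catalogs: deny every write-type privilege. */
        mask &= ~SK_ACL_WRITE_BITS;
    }
    else if (is_conflict_class)
    {
        /* Conflict log tables: keep DELETE and TRUNCATE for maintenance. */
        mask &= ~(SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_USAGE);
    }

    return mask;
}
```

Written this way, each class of relation gets exactly one branch and one comment, which is the readability point the review comment makes; whether the real function can return early for superusers in the same way depends on the surrounding code in aclchk.c.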
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-13T07:11:18Z
Hi Dilip/Vignesh, I was looking at patch v33-0002. Shouldn't there be some accompanying tests in this patch to verify that altering ownership works as expected when the subscription has a CLT? ====== Kind Regards, Peter Smith. Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-14T01:54:19Z
Hi Dilip/Vignesh. Some review comments for v33-0004 (docs). ====== doc/src/sgml/logical-replication.sgml (29.2. Subscription) 1. Perhaps the "conflict log table" should be using <firstterm> SGML markup the first time it gets mentioned? ~~~ (29.8. Conflicts) 2. <para> - The log format for logical replication conflicts is as follows: + The <link linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link> + parameter automatically creates a dedicated conflict log table. This table is created in the dedicated + <literal>pg_conflict</literal> namespace. The name of the conflict log table + is <literal>pg_conflict_log_<subid></literal>. The predefined schema of this table is + detailed in + <xref linkend="logical-replication-conflict-log-schema"/>. + </para> 2a. It's not really correct to say that it "automatically creates a dedicated conflict log table.", because that sounds like it will always happen. SUGGESTION The conflict_log_destination parameter can be set to automatically create a dedicated conflict log table. ~ 2b. Also it seems overkill to say the word "dedicated" multiple times. Maybe remove the 2nd one. ~~~ 3. + <para> + The conflicting row data, including the incoming remote row (<literal>remote_tuple</literal>) + and the associated local conflict details (<literal>local_conflicts</literal>), is stored in + <type>JSON</type> formats, for flexible querying and analysis. + </para> + Comma typo: /formats, for/formats for/ ~~~ (29.9. Restrictions) 4. + + <listitem> + <para> + Conflict log tables (see <link linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link> parameter) + are never published, even when using FOR ALL TABLES in a publication. + </para> + </listitem> The "FOR ALL TABLES" should have SGML <literal> markup. ====== doc/src/sgml/ref/create_subscription.sgml (conflict_log_destination (enum)) 5. 
+ <para> + If post-mortem analysis may be needed, back up the conflict log table before + removing the subscription. + </para> 5a. My AI tool says that the "post-mortem analysis" wording is a bit overkill for online documentation: SUGGESTION If conflict history may be needed later, back up... ~ 5b. That note only says about "removing the subscription", but AFAIK the user will also need to do backup if changing from "table/all" to "log". Should that also be mentioned? It might make this caution a bit repetitive -- Maybe it is simply easier to reword this sentence like: SUGGESTION If conflict history may be needed later, be sure to back up the conflict log table before it gets removed. ====== GENERAL -- add new subsections 6. Apart from those minor review comments above, I felt that the current single "29.8. Conflicts" section should be broken into subsections for readability and for easier referencing. I propose that it should look like this: 29.8. Conflicts 29.8.1. Conflict logging 29.8.2. Table-based logging 29.8.3. File-based logging 29.8.4. Notes PSA a POC patch where I've done this restructuring. It looks much better to me. See what you think. Most of the original patch wording is unchanged. Some xrefs are added on the CREATE SUBSCRIPTION page. ====== Kind Regards, Peter Smith. Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-14T07:15:08Z
On Mon, 11 May 2026 at 11:51, shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, May 8, 2026 at 5:40 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > > The attached v31 version has the changes to fix this issue by > > > > > initializing the variable. > > > > > This also has the rebased version along with the rebased version of > > > > > the 'Preserve conflict log destination and subscription OID for > > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > > superuser later runs: > > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > > then the conflict table ends up being owned by the superuser instead > > > > of the subscription owner. Though, apply_worker would be able to > > > > insert into the CLT, but the subscription owner cannot access its > > > > associated conflict log table, > > > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > > command executor’s ownership instead of the subscription owner. > > > > > > > > Since only the subscription owner or a superuser can run ALTER > > > > SUBSCRIPTION, should we always create the table with the subscription > > > > owner as the owner? > > > > > > Yeah that makes sense. 
> > > > > > > 2) In GetConflictLogDestAndTable(): > > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > > + if (conflictlogrel == NULL) > > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > > + conflictlogrelid); > > > > + > > > > + return conflictlogrel; > > > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The > > > > table_open()->relation_open() will error-out if it fails to open the > > > > relation. > > > > > > Yeah, that's a valid point. > > > > > > > 3) Minor typo in create_conflict_log_table() comments: > > > > + /* > > > > + * Check for an existing table with the sname name in the pg_conflict > > > > namespace. > > > > + * A collision should not occur under normal operation, but we must > > > > handle cases > > > > + * where a table has been created manually. > > > > + */ > > > > ==> double space in "A collision should not" > > > > > > > > 4) The document patch-0004 is still referring to the old name > > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>". > > > > > > I will fix these in next version. > > > > > > > This fixes all 4 comments Nisha reported. And 0002 is an add-on patch > > to allow ownership transfer. I haven't yet changed the clt display > > with \dRs+ reported by shveta. I have a work-in-progress patch, but > > I couldn't get it to work. I will try to debug that tomorrow or next > > week whenever I get time. > > > > Open Items: > > - Add comments explaining the reasoning for the ownership change > > - change clt display > > - Test cases for ownership change, truncation, deletion, and select > > from a non-superuser owner of subscriber. > > > > @vignesh C Your patch needs to be rebased. 
> > > > Few comments on 001: > > 3) > Currently the structure of CLT is: > > +const ConflictLogColumnDef ConflictLogSchema[] = { > + { .attname = "relid", .atttypid = OIDOID }, > + { .attname = "schemaname", .atttypid = TEXTOID }, > + { .attname = "relname", .atttypid = TEXTOID }, > + { .attname = "conflict_type", .atttypid = TEXTOID }, > + { .attname = "remote_xid", .atttypid = XIDOID }, > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > + { .attname = "remote_origin", .atttypid = TEXTOID }, > + { .attname = "replica_identity", .atttypid = JSONOID }, > + { .attname = "remote_tuple", .atttypid = JSONOID }, > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > +}; > > So if user has to delete a conflict from CLT after resolving it, then > what is the user-friendly way to do it? IMO, it will be cumbersome > (and perhaps error-prone) to write a query with remote_commit_lsn, > remote_commit_ts, remote_xid etc in WHERE clause. Do you (or others) > think we shall add a log_id column (perhaps a bigint GENERATED ALWAYS > AS IDENTITY). This provides a simple, unique identifier so the user > can easily target a single row (WHERE log_id = 105) or purge a batch > of old conflicts (WHERE log_id < 1000). I have fixed the other comments except this one. I will think more about this and reply separately. The attached patch has the changes for the rest of the comments. The patch also addresses comments from [1]. [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0%3DRtaJCGg%3DVYcwLGGCpqax%3DzKJgNbB0Hw%40mail.gmail.com Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-14T07:20:49Z
Hi Dilip/Vignesh. Some review comments for patch v33-0003. ====== Commit Message 1. SELECT remote_xid, relname, remote_origin, local_conflicts[1] ->> 'xid' AS local_xid, local_conflicts[1] ->> 'tuple' AS local_tuple FROM myschema.conflict_log_history2; ~ Shouldn't this example be querying the pg_conflict schema (not myschema), for a CLT name like pg_conflict_log_1234? ====== src/backend/replication/logical/conflict.c 2. +/* Schema for the elements within the 'local_conflicts' JSON array */ +static const ConflictLogColumnDef LocalConflictSchema[] = +{ + { .attname = "xid", .atttypid = XIDOID }, + { .attname = "commit_ts", .atttypid = TIMESTAMPTZOID }, + { .attname = "origin", .atttypid = TEXTOID }, + { .attname = "key", .atttypid = JSONOID }, + { .attname = "tuple", .atttypid = JSONOID } +}; I think this all belongs directly beneath the ConflictLogSchema[] where 'local_conflicts' was defined. ~~~ 3. +#define MAX_LOCAL_CONFLICT_INFO_ATTRS lengthof(LocalConflictSchema) "MAX_" doesn't really seem appropriate as a prefix because this is not some upper limit; it is just a number. A better name is "NUM_LOCAL_CONFLICT_ATTRS". ~~ Ditto for the other "MAX_CONFLICT_ATTR_NUM" of patch 0001. How about "NUM_CONFLICT_ATTRS". ~~~ RequestApplyConflict: 4. + if (dest == CONFLICT_LOG_DEST_TABLE || dest == CONFLICT_LOG_DEST_ALL) + log_dest_clt = true; + if (dest == CONFLICT_LOG_DEST_LOG || dest == CONFLICT_LOG_DEST_ALL) + log_dest_logfile = true; This code could be improved by introducing some macros to hide all the checking. There was also similar code in patch 0001 where such macros would have been helpful. SUGGESTION log_dest_clt = CONFLICTS_LOGGED_TO_TABLE(dest); log_dest_logfile = CONFLICTS_LOGGED_TO_FILE(dest); ~~~ 5. 
+ ereport(elevel, + errcode_apply_conflict(type), + errmsg("conflict detected on relation \"%s.%s\": conflict=%s", + get_namespace_name(RelationGetNamespace(localrel)), + RelationGetRelationName(localrel), + ConflictTypeNames[type]), + errdetail("Conflict details are logged to the conflict log table: %s", + RelationGetRelationName(conflictlogrel))); I think there is some recently committed function for getting fully-qualified relation names that this error can make use of. ~~~ 6. + /* Standard reporting with full internal details. */ + ereport(elevel, + errcode_apply_conflict(type), + errmsg("conflict detected on relation \"%s.%s\": conflict=%s", + get_namespace_name(RelationGetNamespace(localrel)), + RelationGetRelationName(localrel), + ConflictTypeNames[type]), + errdetail_internal("%s", err_detail.data)); Ditto. I think there is some recently committed function for getting fully-qualified relation names that this error can make use of. ~~~ GetConflictLogDestAndTable: 7. + /* Quick exit if a conflict log table was not requested. */ + if (*log_dest == CONFLICT_LOG_DEST_LOG) + return NULL; It would be more intuitive to use that new macro here that I suggested in a previous review comment. SUGGESTION if (!CONFLICTS_LOGGED_TO_TABLE(*log_dest)) return NULL; ~~~ InsertConflictLogTuple: 8. + int options = HEAP_INSERT_NO_LOGICAL; This variable seems unnecessary. Easier to just pass HEAP_INSERT_NO_LOGICAL as a function parameter. ====== src/backend/replication/logical/worker.c start_apply: 9. + /* Open conflict log table and insert the tuple. */ + conflictlogrel = GetConflictLogDestAndTable(&dest); + Assert(dest != CONFLICT_LOG_DEST_LOG); + InsertConflictLogTuple(conflictlogrel); + table_close(conflictlogrel, RowExclusiveLock); Another place where using the suggested new macro would be more intuitive. SUGGESTION Assert(CONFLICTS_LOGGED_TO_TABLE(dest)); ====== src/test/subscription/t/035_conflicts.pl 10. 
+# Verify the contents of the Conflict Log Table (CLT) +# This section ensures that the clt contains the expected +# type and specific key data. /clt/CLT/ ====== Kind Regards, Peter Smith. Fujitsu Australia -
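Review comments 4, 7 and 9 above all lean on two predicate macros, `CONFLICTS_LOGGED_TO_TABLE` and `CONFLICTS_LOGGED_TO_FILE`, without spelling out their bodies. One plausible definition is sketched below. The macro names come from the review; the bodies, the `SK_`-prefixed enum, and its ordering are assumptions made for illustration and may differ from what the patch ends up using.

```c
#include <stdbool.h>

/*
 * Assumed stand-in for the patch's destination enum. The thread names the
 * three destinations ("log", "table", "all"); the numeric values are a guess.
 */
typedef enum ConflictLogDestSketch
{
    SK_CONFLICT_LOG_DEST_LOG,
    SK_CONFLICT_LOG_DEST_TABLE,
    SK_CONFLICT_LOG_DEST_ALL
} ConflictLogDestSketch;

/* True when conflicts are written to the conflict log table. */
#define CONFLICTS_LOGGED_TO_TABLE(dest) \
    ((dest) == SK_CONFLICT_LOG_DEST_TABLE || (dest) == SK_CONFLICT_LOG_DEST_ALL)

/* True when conflicts are written to the server log file. */
#define CONFLICTS_LOGGED_TO_FILE(dest) \
    ((dest) == SK_CONFLICT_LOG_DEST_LOG || (dest) == SK_CONFLICT_LOG_DEST_ALL)
```

With these in place, the quick-exit test in comment 7 reads `if (!CONFLICTS_LOGGED_TO_TABLE(*log_dest)) return NULL;` and keeps working if another table-writing destination is added later. Each macro evaluates its argument twice, so it should only be given side-effect-free expressions.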
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-14T08:52:11Z
On Mon, May 11, 2026 at 4:14 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, 8 May 2026 at 17:40, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 8, 2026 at 8:28 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, May 7, 2026 at 5:24 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > > The attached v31 version has the changes to fix this issue by > > > > > initializing the variable. > > > > > This also has the rebased version along with the rebased version of > > > > > the 'Preserve conflict log destination and subscription OID for > > > > > subscriptions' patch which is present in the 0005 patch. > > > > > > > > Thanks for the patches, please find a few comments on the patches 002 to 004: > > > > > > > > 1) I noticed that if a non-superuser creates the subscription, but a > > > > superuser later runs: > > > > ALTER SUBSCRIPTION ... SET (conflict_log_table = all) > > > > then the conflict table ends up being owned by the superuser instead > > > > of the subscription owner. Though, apply_worker would be able to > > > > insert into the CLT, but the subscription owner cannot access its > > > > associated conflict log table, > > > > > > > > I think this happens because the heap_create_with_catalog() call uses > > > > GetUserId(). It is fine during CREATE SUBSCRIPTION, but during ALTER > > > > SUBSCRIPTION, it causes the table to be created under the ALTER > > > > command executor’s ownership instead of the subscription owner. > > > > > > > > Since only the subscription owner or a superuser can run ALTER > > > > SUBSCRIPTION, should we always create the table with the subscription > > > > owner as the owner? > > > > > > Yeah that makes sense. 
> > > > > > > 2) In GetConflictLogDestAndTable(): > > > > + conflictlogrel = table_open(conflictlogrelid, RowExclusiveLock); > > > > + if (conflictlogrel == NULL) > > > > + elog(ERROR, "could not open conflict log table (OID %u)", > > > > + conflictlogrelid); > > > > + > > > > + return conflictlogrel; > > > > > > > > I think the "if (conflictlogrel == NULL)" check is unreachable. The > > > > table_open()->relation_open() will error-out if it fails to open the > > > > relation. > > > > > > Yeah, that's a valid point. > > > > > > > 3) Minor typo in create_conflict_log_table() comments: > > > > + /* > > > > + * Check for an existing table with the sname name in the pg_conflict > > > > namespace. > > > > + * A collision should not occur under normal operation, but we must > > > > handle cases > > > > + * where a table has been created manually. > > > > + */ > > > > ==> double space in "A collision should not" > > > > > > > > 4) The document patch-0004 is still referring to the old name > > > > "pg_conflict_<subid>", it should be "pg_conflict_log_<subid>". > > > > > > I will fix these in next version. > > > > > > > This fixes all 4 comments Nisha reported. And 0002 is an add-on patch > > to allow ownership transfer. I haven't yet changed the clt display > > with \dRs+ reported by shveta. I have a work-in-progress patch, but > > I couldn't get it to work. I will try to debug that tomorrow or next > > week whenever I get time. > > > > Open Items: > > - Add comments explaining the reasoning for the ownership change > > - change clt display > > - Test cases for ownership change, truncation, deletion, and select > > from a non-superuser owner of subscriber. > > The attached patch addresses the remaining open items and is provided > separately as patch 0005. @Dilip Kumar, if the changes look good to > you, please merge them into the corresponding patch. 
> Thanks Vignesh, Please find a few comments on 0005: 1) listSubscriptions has: + pg_log_error("The server (version %s) does not support publications.", publications --> subscriptions 2) printfPQExpBuffer(&buf, "/* %s */\n", _("Get matching subscriptions")); appendPQExpBuffer(&buf, "SELECT subname AS \"%s\"\n" ", pg_catalog.pg_get_userbyid(subowner) AS \"%s\"\n" ", subenabled AS \"%s\"\n" ", subpublications AS \"%s\"\n", gettext_noop("Name"), gettext_noop("Owner"), gettext_noop("Enabled"), gettext_noop("Publication")); /* Only display subscriptions in current database. */ appendPQExpBufferStr(&buf, "FROM pg_catalog.pg_subscription\n" "WHERE subdbid = (SELECT oid\n" " FROM pg_catalog.pg_database\n" " WHERE datname = pg_catalog.current_database())"); Why have we split the query? Can we have it in one go itself? 3) + appendPQExpBuffer(&buf, + "SELECT oid, subname AS \"%s\"\n" + ", pg_catalog.pg_get_userbyid(subowner) AS \"%s\"\n" + ", subenabled AS \"%s\"\n" + ", subpublications AS \"%s\"\n", + gettext_noop("Name"), + gettext_noop("Owner"), + gettext_noop("Enabled"), + gettext_noop("Publication")); + ncols = 3; The query has 5 columns and we have set ncols as 3. A comment will help here. 4) + snprintf(conflictlogtable, + sizeof(conflictlogtable), + "pg_conflict.pg_conflict_log_%s", + subid); Should we avoid hard-coding the namespace name like above? thanks Shveta -
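Comment 4 above asks whether the `pg_conflict` namespace should be hard-coded inside the format string. One way to avoid that is to keep the namespace and the table-name prefix in named constants and build the qualified name from them. The sketch below assumes, as in the quoted snippet, that the subscription OID arrives as a string; the `SK_` constants and `format_clt_name_sketch` are hypothetical names, not identifiers from psql or the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical constants; the patch may define or share these differently. */
#define SK_CONFLICT_NAMESPACE_NAME "pg_conflict"
#define SK_CONFLICT_LOG_PREFIX     "pg_conflict_log_"

/*
 * Write "<namespace>.<prefix><subid>" into buf.
 * Returns false if the result did not fit, so callers never use a
 * silently truncated relation name.
 */
static bool
format_clt_name_sketch(char *buf, size_t buflen, const char *subid)
{
    int n = snprintf(buf, buflen, "%s.%s%s",
                     SK_CONFLICT_NAMESPACE_NAME,
                     SK_CONFLICT_LOG_PREFIX,
                     subid);

    return n >= 0 && (size_t) n < buflen;
}
```

For a subscription OID of 16408 this yields `pg_conflict.pg_conflict_log_16408`, the same name that appears elsewhere in the thread, while leaving a single place to change if the schema or prefix is renamed.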
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-14T08:53:14Z
On Mon, 11 May 2026 at 14:59, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > I started reviewing the patches. > Here are minor comments for 0001 patch: > > 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables > But the same for pg_toast or other catalog tables are prohibited. Also > for other system tables we are getting following error. > > postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq; > ERROR: ALTER action DROP COLUMN cannot be performed on relation > "pg_toast_16413" > > DETAIL: This operation is not supported for TOAST tables. > postgres=# ALTER TABLE pg_publication DROP COLUMN pubname; > ERROR: cannot drop column pubname of table pg_publication because it > is required by the database system > postgres=# ALTER TABLE pg_description DROP COLUMN description; > ERROR: cannot drop column description of table pg_description because > it is required by the database system > > postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname; > ALTER TABLE > > Should we prohibit it for conflict log tables as well? The reason it fails for regular system catalogs is that IsPinnedObject() returns true for them. Objects with OIDs less than FirstUnpinnedObjectId(12000) are considered pinned, which includes the core catalogs created during initdb. In such cases, PostgreSQL immediately throws the following error: /* * If the target object is pinned, we can just error out immediately; it * won't have any objects recorded as depending on it. */ if (IsPinnedObject(object->classId, object->objectId)) ereport(ERROR, (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST), errmsg("cannot drop %s because it is required by the database system", getObjectDescription(object, false)))); The call chain is: ATExecDropColumn -> performMultipleDeletions -> findDependentObjects -> IsPinnedObject However, the conflict log tables are not created during initdb; they are created later during subscription creation. 
Therefore, they are not considered pinned objects, IsPinnedObject() returns false, and the DROP COLUMN operation is allowed. I also noticed that ADD COLUMN is currently allowed on system tables when allow_system_table_mods is enabled: postgres=# SET allow_system_table_mods = on; SET postgres=# ALTER TABLE pg_description ADD COLUMN fake text; ALTER TABLE There are also cases where such operations lead to assertion failures. For example: postgres=# SET allow_system_table_mods = on; SET postgres=# alter table pg_type add column fake int; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. The connection to the server was lost. Attempting reset: Failed. TRAP: failed Assert("i >= 0 && i < tupdesc->natts"), File: "../../../src/include/access/tupdesc.h", Line: 182, PID: 11443 postgres: vignesh postgres [local] ALTER TABLE(ExceptionalCondition+0xba) [0x616a67fc753c] postgres: vignesh postgres [local] ALTER TABLE(+0x7057fa) [0x616a67d067fa] postgres: vignesh postgres [local] ALTER TABLE(build_column_default+0x34) [0x616a67d08961] postgres: vignesh postgres [local] ALTER TABLE(+0x3e8875) [0x616a679e9875] postgres: vignesh postgres [local] ALTER TABLE(+0x3e34e8) [0x616a679e44e8] postgres: vignesh postgres [local] ALTER TABLE(+0x3e2e24) [0x616a679e3e24] The documentation also explicitly warns about this behavior at [1]: Allows modification of the structure of system tables as well as certain other risky actions on system tables. This is otherwise not allowed even for superusers. Ill-advised use of this setting can cause irretrievable data loss or seriously corrupt the database system. Given this, I am not sure whether we should specifically prevent dropping columns from conflict log tables when allow_system_table_mods is enabled. Rest of the comments are addressed in the v34 version patch posted at [2]. 
[1] - https://www.postgresql.org/docs/current/runtime-config-developer.html [2] - https://www.postgresql.org/message-id/CALDaNm1ZOWAbv5WCsORPBqo7tjHn4f7E%2BB5LZhExfnPMs-zo9A%40mail.gmail.com Regards, Vignesh -
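The pinned-object argument in the message above comes down to an OID threshold: objects created during initdb have OIDs below `FirstUnpinnedObjectId`, while a conflict log table created at subscription time does not. A deliberately simplified sketch follows, assuming the value 12000 quoted in the message. The names are hypothetical, and the server's `IsPinnedObject()` also takes the catalog class into account and has exceptions that this sketch omits.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t OidSketch;     /* stand-in for the server's Oid type */

/* Threshold as quoted in the message above. */
#define SK_FIRST_UNPINNED_OBJECT_ID 12000u

/*
 * Simplified pinned test: only the OID threshold is modelled here.
 * Objects below the threshold are treated as required by the system.
 */
static bool
is_pinned_sketch(OidSketch oid)
{
    return oid < SK_FIRST_UNPINNED_OBJECT_ID;
}
```

Under this model a core catalog with a low OID is pinned, so DROP COLUMN fails immediately, whereas a table such as `pg_conflict_log_16408`, whose OID is assigned well above the threshold, is not pinned and the ALTER proceeds.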
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-14T09:25:27Z
On Thu, May 14, 2026 at 2:23 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 11 May 2026 at 14:59, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > I started reviewing the patches. > > Here are minor comments for 0001 patch: > > > > 1. If allow_system_table_mods=on we can add/drop columns of conflict log tables > > But the same for pg_toast or other catalog tables are prohibited. Also > > for other system tables we are getting following error. > > > > postgres=# ALTER TABLE pg_toast.pg_toast_16413 DROP COLUMN chunk_seq; > > ERROR: ALTER action DROP COLUMN cannot be performed on relation > > "pg_toast_16413" > > > > DETAIL: This operation is not supported for TOAST tables. > > postgres=# ALTER TABLE pg_publication DROP COLUMN pubname; > > ERROR: cannot drop column pubname of table pg_publication because it > > is required by the database system > > postgres=# ALTER TABLE pg_description DROP COLUMN description; > > ERROR: cannot drop column description of table pg_description because > > it is required by the database system > > > > postgres=# ALTER TABLE pg_conflict.pg_conflict_log_16408 DROP COLUMN relname; > > ALTER TABLE > > > > Should we prohibit it for conflict log tables as well? > > The reason it fails for regular system catalogs is that > IsPinnedObject() returns true for them. Objects with OIDs less than > FirstUnpinnedObjectId(12000) are considered pinned, which includes the > core catalogs created during initdb. In such cases, PostgreSQL > immediately throws the following error: > /* > * If the target object is pinned, we can just error out immediately; it > * won't have any objects recorded as depending on it. 
> */ > if (IsPinnedObject(object->classId, object->objectId)) > ereport(ERROR, > (errcode(ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST), > errmsg("cannot drop %s because it is required by the > database system", > getObjectDescription(object, false)))); > The call chain is: > ATExecDropColumn -> performMultipleDeletions -> findDependentObjects > -> IsPinnedObject > > However, the conflict log tables are not created during initdb; they > are created later during subscription creation. Therefore, they are > not considered pinned objects, IsPinnedObject() returns false, and the > DROP COLUMN operation is allowed. > > I also noticed that ADD COLUMN is currently allowed on system tables > when allow_system_table_mods is enabled: > postgres=# SET allow_system_table_mods = on; > SET > postgres=# ALTER TABLE pg_description ADD COLUMN fake text; > ALTER TABLE > > There are also cases where such operations lead to assertion failures. > For example: > postgres=# SET allow_system_table_mods = on; > SET > postgres=# alter table pg_type add column fake int; > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > The connection to the server was lost. Attempting reset: Failed. 
> > TRAP: failed Assert("i >= 0 && i < tupdesc->natts"), File: > "../../../src/include/access/tupdesc.h", Line: 182, PID: 11443 > postgres: vignesh postgres [local] ALTER > TABLE(ExceptionalCondition+0xba) [0x616a67fc753c] > postgres: vignesh postgres [local] ALTER TABLE(+0x7057fa) [0x616a67d067fa] > postgres: vignesh postgres [local] ALTER > TABLE(build_column_default+0x34) [0x616a67d08961] > postgres: vignesh postgres [local] ALTER TABLE(+0x3e8875) [0x616a679e9875] > postgres: vignesh postgres [local] ALTER TABLE(+0x3e34e8) [0x616a679e44e8] > postgres: vignesh postgres [local] ALTER TABLE(+0x3e2e24) [0x616a679e3e24] > > The documentation also explicitly warns about this behavior at [1]: > Allows modification of the structure of system tables as well as > certain other risky actions on system tables. This is otherwise not > allowed even for superusers. Ill-advised use of this setting can cause > irretrievable data loss or seriously corrupt the database system. > > Given this, I am not sure whether we should specifically prevent > dropping columns from conflict log tables when allow_system_table_mods > is enabled. > +1. We can keep the current behavior as-is since it only applies when allow_system_table_mods is enabled. The documentation already clearly warns about the associated risks, so this should be fine. thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-14T10:48:33Z
On Wed, May 13, 2026 at 11:43 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Some review comments for v33-0001. > > ====== > src/backend/catalog/aclchk.c > > pg_class_aclmask_ext: > > 1. > if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | > ACL_USAGE)) && > - IsSystemClass(table_oid, classForm) && > - classForm->relkind != RELKIND_VIEW && > + IsConflictClass(classForm) && > !superuser_arg(roleid)) > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE); > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | > ACL_TRUNCATE | ACL_USAGE)) && > + IsSystemClass(table_oid, classForm) && > + classForm->relkind != RELKIND_VIEW && > + !superuser_arg(roleid)) > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > > The new patched code seems a bit repetitive. > > How about refactoring like below and putting the comments where they belong. > > if (!superuser_arg(roleid)) > { > if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) > { > if (IsSystemClass(table_oid, classForm) && > classForm->relkind != RELKIND_VIEW) > { > /* > * Deny anyone permission to update a system catalog unless > * pg_authid.rolsuper is set. > * > * As of 7.4 we have some updatable system views; those shouldn't be > * protected in this way. Assume the view rules can take care of > * themselves. ACL_USAGE is if we ever have system sequences. > */ > mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | > ACL_USAGE); > } > else if (IsConflictClass(classForm)) > { > /* > * For conflict log tables, we allow non-superusers to perform DELETE > * and TRUNCATE for maintenance, while still restricting INSERT, > * UPDATE, and USAGE. > */ > mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE); > } > } > } > else > { > /* Superusers bypass all permission-checking. 
*/ > > ReleaseSysCache(tuple); > return mask; > } > It is okay to reduce duplication here, but the check for IsConflictClass should be first because IsSystemClass also contains a similar check, though for a different reason. > > 8. > Despite some of these just being static, I am beginning to think that > the "conflict" specific CLT code might be more appropriate to be put > in conflict.c, along with the CLT schema etc. > > e.g. functions like: > - create_conflict_log_table_tupdesc > - create_conflict_log_table > - GetLogDestination > +1. > > ====== > src/backend/replication/logical/conflict.c > > 13. > +const char *const ConflictLogDestNames[] = { > + [CONFLICT_LOG_DEST_LOG] = "log", > + [CONFLICT_LOG_DEST_TABLE] = "table", > + [CONFLICT_LOG_DEST_ALL] = "all" > +}; > + > +const ConflictLogColumnDef v[] = { > + { .attname = "relid", .atttypid = OIDOID }, > + { .attname = "schemaname", .atttypid = TEXTOID }, > + { .attname = "relname", .atttypid = TEXTOID }, > + { .attname = "conflict_type", .atttypid = TEXTOID }, > + { .attname = "remote_xid", .atttypid = XIDOID }, > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > + { .attname = "remote_origin", .atttypid = TEXTOID }, > + { .attname = "replica_identity", .atttypid = JSONOID }, > + { .attname = "remote_tuple", .atttypid = JSONOID }, > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > +}; > ... > > 13c. > TBH, I preferred code how it used to be -- where all the CLT constants > and structs and enums and schemas were kept together. Now they are > split across conflict.h and conflict.c making it harder to read as > well as introducing need for static asserts that were not needed > before. > That would lead to unnecessary inclusions in multiple places where they are not required. See my 4th comment in email [1]. 
[1]: https://www.postgresql.org/message-id/CAA4eK1LhOHa_TEznw%2BgFoq%2Bw0vMvvsDG2g9Xq8Mwa8xZMY73og%40mail.gmail.com -- With Regards, Amit Kapila. -
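The ordering point in the reply above can be illustrated with a minimal, standalone sketch. Everything here is a hypothetical stand-in rather than backend code: the `SK_ACL_*` bit values, the `SketchClass` struct, and the two predicate helpers are invented for illustration, and the assumption that the system-class test also matches conflict log tables is taken only from the remark that IsSystemClass "contains a similar check".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ACL_* bits; the values are illustrative. */
#define SK_ACL_INSERT   (1u << 0)
#define SK_ACL_UPDATE   (1u << 1)
#define SK_ACL_DELETE   (1u << 2)
#define SK_ACL_TRUNCATE (1u << 3)
#define SK_ACL_USAGE    (1u << 4)
#define SK_ACL_SELECT   (1u << 5)
#define SK_ACL_WRITE_BITS \
    (SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_DELETE | SK_ACL_TRUNCATE | SK_ACL_USAGE)

/* Hypothetical, simplified description of a relation. */
typedef struct SketchClass
{
    bool in_conflict_ns;    /* lives in the conflict log schema */
    bool in_catalog_ns;     /* lives in a system catalog schema */
    bool is_view;
} SketchClass;

static bool
sk_is_conflict_class(const SketchClass *rel)
{
    return rel->in_conflict_ns;
}

/*
 * Assumption taken from the review remark: the system-class test also
 * matches conflict log tables, so it must not be evaluated first.
 */
static bool
sk_is_system_class(const SketchClass *rel)
{
    return rel->in_catalog_ns || rel->in_conflict_ns;
}

uint32_t
sk_class_aclmask(uint32_t mask, const SketchClass *rel, bool is_superuser)
{
    assert(rel != NULL);

    /* Superusers bypass the restrictions; so do masks without write bits. */
    if (is_superuser || (mask & SK_ACL_WRITE_BITS) == 0)
        return mask;

    if (sk_is_conflict_class(rel))
    {
        /* Conflict log tables: DELETE and TRUNCATE stay available. */
        mask &= ~(SK_ACL_INSERT | SK_ACL_UPDATE | SK_ACL_USAGE);
    }
    else if (sk_is_system_class(rel) && !rel->is_view)
    {
        /* Other system relations: every write-like bit is removed. */
        mask &= ~SK_ACL_WRITE_BITS;
    }

    return mask;
}
```

If the two branches were swapped, a conflict log table would fall into the system-class branch under this assumption and lose DELETE and TRUNCATE as well, which is the behavior the patch is trying to avoid.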
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-15T10:29:23Z
On Thu, May 14, 2026 at 12:45 PM vignesh C <vignesh21@gmail.com> wrote: > > I have fixed the other comments except this one. I will think more > about this and reply separately. The attached patch has the changes > for the rest of the comments. The patch also addresses comments from > [1]. > Thanks for the patches. Please find below comments for v34 patch-set. 1) Bug report: When disable_on_error = true for a subscription, and an ERROR-level conflict such as insert_exists occurs, the subscription gets disabled without logging the conflict into the CLT. patch-001: 2) execMain.c: + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("cannot modify or insert data into conflict log table \"%s\"", + RelationGetRelationName(resultRel)), Is ERRCODE_INSUFFICIENT_PRIVILEGE the right error code here? It gives the impression that the operation might succeed with higher privileges. Should we instead use ERRCODE_WRONG_OBJECT_TYPE, similar to nearby restrictions? 3) No notice is shown when the conflict log table is removed after changing conflict_log_destination from table/all to log. Example: postgres=# alter subscription sub1 set (conflict_log_destination = table); NOTICE: created conflict log table "pg_conflict.pg_conflict_log_16400" for subscription "sub1" ALTER SUBSCRIPTION postgres=# alter subscription sub1 set (conflict_log_destination = log); ALTER SUBSCRIPTION We already show a notice when changing from log to table/all. Should we add a similar notice as in DROP SUBSCRIPTION for above case? patch-003: 4) conflict.c: ReportApplyConflict() + bool log_dest_clt = false; + bool log_dest_logfile; log_dest_logfile should also be initialized to false, since for dest == CONFLICT_LOG_DEST_TABLE, it is never assigned. 5) worker_internal.h extern PGDLLIMPORT List *table_states_not_ready; +extern XLogRecPtr remote_final_lsn; +extern TimestampTz remote_commit_ts; +extern TransactionId remote_xid; Should these new declarations also use PGDLLIMPORT? 
6) worker.c: apply_handle_stream_start() + remote_xid = stream_xid; + remote_final_lsn = InvalidXLogRecPtr; + remote_commit_ts = 0; + if (!TransactionIdIsValid(stream_xid)) ereport(ERROR, (errcode(ERRCODE_PROTOCOL_VIOLATION), Should the remote_xid assignment be moved after the validity check? We could move all three assignments below the check. patch-005: 7) subscriptioncmds.c: DropSubscription() + if (OidIsValid(form->subconflictlogrelid)) + { + char *conflictrelname = get_rel_name(form->subconflictlogrelid); .... "form" is being used here after the tuple it points to has already been deleted: /* Remove the tuple from catalog. */ CatalogTupleDelete(rel, &tup->t_self); ReleaseSysCache(tup); I think form->subconflictlogrelid should be saved beforehand and then used later, similar to subid. -- Thanks, Nisha -
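Comment 7 above describes a read through a pointer after the tuple behind it has been released. A minimal sketch of the suggested fix follows, assuming nothing about the real backend code: `SketchSubForm`, `sk_fetch_subscription`, and `sk_drop_subscription` are hypothetical names, and `malloc`/`free` merely stand in for fetching and releasing a cached catalog tuple.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pg_subscription row held by the cache. */
typedef struct SketchSubForm
{
    unsigned int subid;
    unsigned int subconflictlogrelid;
} SketchSubForm;

/* Hypothetical stand-in for fetching a cached catalog tuple. */
SketchSubForm *
sk_fetch_subscription(unsigned int subid, unsigned int conflictlogrelid)
{
    SketchSubForm *form = malloc(sizeof(*form));

    if (form == NULL)
        abort();
    form->subid = subid;
    form->subconflictlogrelid = conflictlogrelid;
    return form;
}

/*
 * Copy every field that is needed later into locals, release the tuple,
 * and only then continue with the saved values.
 */
unsigned int
sk_drop_subscription(SketchSubForm *form)
{
    unsigned int subid;
    unsigned int conflictlogrelid;

    assert(form != NULL);

    subid = form->subid;
    conflictlogrelid = form->subconflictlogrelid;

    free(form);     /* stands in for ReleaseSysCache(tup) */
    form = NULL;    /* the struct must not be read past this point */

    /* Later cleanup would use only the saved copies. */
    (void) subid;
    return conflictlogrelid;
}
```

This mirrors how the comment says subid is already handled: the value is captured while the tuple is still valid, so the later conflict log table cleanup never touches released memory.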
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-18T09:12:14Z
While testing with all patches (v34) applied, I noticed an unexpected behavior change in \dRs+ output. I see that we changed the \dRs+ output format to display "Conflict log table:" separately instead of as a column, but the output ordering also seems to have changed. Without the patch, both \dRs and \dRs+ display subscriptions in alphabetical order by name. With this patch, \dRs still shows the expected ordering, but \dRs+ now appears ordered by subscription creation order (likely subid) instead of subscription name. This is not a major issue, but it seems to break consistency. For example, \dRp+ has a similar display pattern, but its output is ordered by pub-name. -- Thanks, Nisha
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-18T12:35:40Z
On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Dilip/Vignesh. > > Some review comments for v33-0001. > > ====== > src/backend/executor/execMain.c > > 11. > + > + /* > + * Conflict log tables are managed by the system to record logical > + * replication conflicts. We allow DELETE and TRUNCATE to permit users to > + * manually prune these logs, but manual data insertion or modification > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the > + * system-generated logs. > + * > + * Since TRUNCATE is handled as a separate utility command, we only need > + * to explicitly permit CMD_DELETE here. > + */ > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && > + operation != CMD_DELETE) > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("cannot modify or insert data into conflict log table \"%s\"", > + RelationGetRelationName(resultRel)), > + errdetail("Conflict log tables are system-managed and only support > cleanup via DELETE or TRUNCATE."))); > > It somehow feels backwards to check "operation != CMD_DELETE", with > the obscure comment that TRUNCATE is handled elsewhere. > > How about just check if "(operation == CMD_INSERT || operation == > CMD_UPDATE || operation == CMD_MERGE)". I felt the existing is ok here, as it is mentioned "we only need to explicitly permit CMD_DELETE" . Are you seeing any commands other than INSERT, UPDATE & MERGE possible here? > ~~~ > > 12. > + > + /* > + * Conflict log tables are managed by the system to record logical > + * replication conflicts. We do not allow locking rows in CONFLICT > + * relations. > + */ > + if (IsConflictNamespace(RelationGetNamespace(rel))) > + ereport(ERROR, > + (errcode(ERRCODE_WRONG_OBJECT_TYPE), > + errmsg("cannot lock rows in conflict log table \"%s\"", > + RelationGetRelationName(rel)))); > > I was not sure what was meant by "CONFLICT relations.". > > Does it mean "... relations in the pg_conflict schema.". 
Anyway, is > there any value to that 2nd sentence because it is much the same text > as the errmsg. Yes, it means the relations in pg_conflict schema. Removed the second sentence. > ====== > src/backend/replication/logical/conflict.c > > 13. > +const char *const ConflictLogDestNames[] = { > + [CONFLICT_LOG_DEST_LOG] = "log", > + [CONFLICT_LOG_DEST_TABLE] = "table", > + [CONFLICT_LOG_DEST_ALL] = "all" > +}; > + > +const ConflictLogColumnDef v[] = { > + { .attname = "relid", .atttypid = OIDOID }, > + { .attname = "schemaname", .atttypid = TEXTOID }, > + { .attname = "relname", .atttypid = TEXTOID }, > + { .attname = "conflict_type", .atttypid = TEXTOID }, > + { .attname = "remote_xid", .atttypid = XIDOID }, > + { .attname = "remote_commit_lsn",.atttypid = LSNOID }, > + { .attname = "remote_commit_ts", .atttypid = TIMESTAMPTZOID }, > + { .attname = "remote_origin", .atttypid = TEXTOID }, > + { .attname = "replica_identity", .atttypid = JSONOID }, > + { .attname = "remote_tuple", .atttypid = JSONOID }, > + { .attname = "local_conflicts", .atttypid = JSONARRAYOID } > +}; > > 13a. > Both these arrays could benefit with some comments. Added comments > ~ > > 13b. > In the ConflictLogSchema, would it be better to keep all those > "remote_" columns grouped together, instead of being broken by > "replica_identity". Modified > ~ > > 13c. > TBH, I preferred code how it used to be -- where all the CLT constants > and structs and enums and schemas were kept together. Now they are > split across conflict.h and conflict.c making it harder to read as > well as introducing need for static asserts that were not needed > before. No change done, as this change is required. Amit has given the explanation at [1]. Rest of the comments were addressed. The attached v35 version patch has the changes for the same. I have kept the review comment fixes as separate patches so that Dilip can merge them when convenient. 
Due to the additional review-fix patches, Dilip's original patches 0001, 0002, 0003, and 0004 are now renumbered as 0001, 0003, 0005, and 0007 respectively. The intermediate patches contain the review comment fixes: a) 0002 contains fixes for 0001 b) 0004 contains fixes for 0003 c) 0006 contains fixes for 0005 d) 0008 contains fixes for 0007 Also comments from [2] and [3] are addressed in this. [1] - https://www.postgresql.org/message-id/CAA4eK1Ki5mBgAubBkUPcBjN%3DO1jeT3AUh7vLQBm8w%3DgQiHO5Jw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAHut%2BPv%2BBK7iM3KZNcrXzPMYagrL2O4%3D46Hi3stT2XT-RmsjRQ%40mail.gmail.com [3] - https://www.postgresql.org/message-id/CAJpy0uARoVZkTA_PV4PB1MtUXZMyxkun1Cg5o1YOxaKsCbWxCA%40mail.gmail.com Regards, Vignesh -
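The disagreement over comment 11 comes down to two ways of writing the same predicate. The sketch below puts them side by side; the `SketchCmd` enum and both function names are hypothetical and do not reproduce the real CmdType values. The SELECT tag is included only to show where the two forms would diverge, and nobody in the thread expects such a command to reach this check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command tags; not the real CmdType enum values. */
typedef enum SketchCmd
{
    SK_CMD_SELECT,
    SK_CMD_INSERT,
    SK_CMD_UPDATE,
    SK_CMD_DELETE,
    SK_CMD_MERGE
} SketchCmd;

/* Form currently in the patch: reject everything except DELETE. */
bool
sk_rejected_negative(SketchCmd op)
{
    return op != SK_CMD_DELETE;
}

/* Form suggested in review: name the rejected commands explicitly. */
bool
sk_rejected_positive(SketchCmd op)
{
    return op == SK_CMD_INSERT || op == SK_CMD_UPDATE || op == SK_CMD_MERGE;
}

/* The two forms give the same answer for the four data-modifying commands. */
void
sk_check_forms_agree(void)
{
    static const SketchCmd dml[] = {
        SK_CMD_INSERT, SK_CMD_UPDATE, SK_CMD_DELETE, SK_CMD_MERGE
    };

    for (size_t i = 0; i < sizeof(dml) / sizeof(dml[0]); i++)
        assert(sk_rejected_negative(dml[i]) == sk_rejected_positive(dml[i]));
}
```

Since the results match for INSERT, UPDATE, DELETE, and MERGE, the choice is about readability and about what should happen if a new command type is ever routed through the check: the negative form rejects it by default, while the positive form lets it through.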
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-19T06:31:51Z
On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote: Hi Vignesh. Thanks for addressing lots of my previous v33-0001 review comments. Here are some more review comments for the combined v35-0001/0002 patches. ====== Commit message. 1. If the user chooses to enable logging to a table (by selecting 'table' or 'all'), an internal logging table named pg_conflict_log_<subid> is automatically created within a dedicated, system-managed 'pg_conflict' namespace to prevent users from manually dropping or altering it. This also prevents accidental name collisions with user-created tables. This table is linked to the subscription via an internal dependency, ensuring it is automatically dropped when the subscription is removed ~ The internal name of the CLT table has changed slightly, so the commit message needs updating. ====== src/backend/catalog/heap.c 2. + * Don't allow creating relations in pg_catalog/pg_conflict directly, even + * though it is allowed to move user defined relations there. Semantics + * with search paths including pg_catalog are too confusing for now. I think "pg_catalog/pg_conflict" could be misinterpreted. Better to say "pg_catalog or pg_conflict". ~~~ 3. 
+ if (!allow_system_table_mods && IsNormalProcessingMode()) + { + if ((IsCatalogNamespace(relnamespace) && relkind != RELKIND_INDEX) || + IsToastNamespace(relnamespace)) + { + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("permission denied to create \"%s.%s\"", + get_namespace_name(relnamespace), relname), + errdetail("System catalog modifications are currently disallowed."))); + } + + if (IsConflictNamespace(relnamespace)) + { + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("permission denied to create \"%s.%s\"", + get_namespace_name(relnamespace), relname), + errdetail("Conflict schema modifications are currently disallowed."))); + } + } The curly-braces are unnecessary for those nested if-blocks. ====== src/backend/catalog/namespace.c CheckSetNamespace: 4. + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot move objects into or out of pg_conflict schema"))); Would it be better to say "the pg_conflict schema"? ====== src/backend/commands/subscriptioncmds.c 5. - Looks like this was some unintended whitespace removal just after the static function forward declarations. ~~~ AlterSubscription: 6. + bool want_table = (opts.conflictlogdest == CONFLICT_LOG_DEST_TABLE || + opts.conflictlogdest == CONFLICT_LOG_DEST_ALL); + bool has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE || + old_dest == CONFLICT_LOG_DEST_ALL); These should be simplified using the new macro: CONFLICTS_LOGGED_TO_TABLE. ====== src/backend/commands/tablecmds.c DropSubscription: 7. + ObjectAddress object; This can be declared at the lower scope closer to where it is actually used. ~~~ 8. + if (OidIsValid(form->subconflictlogrelid)) + { + char *conflictrelname = get_rel_name(form->subconflictlogrelid); + /* There should be a blank line before that block comment. > > ====== > > src/backend/executor/execMain.c > > > > 11. > > + > > + /* > > + * Conflict log tables are managed by the system to record logical > > + * replication conflicts. 
We allow DELETE and TRUNCATE to permit users to > > + * manually prune these logs, but manual data insertion or modification > > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the > > + * system-generated logs. > > + * > > + * Since TRUNCATE is handled as a separate utility command, we only need > > + * to explicitly permit CMD_DELETE here. > > + */ > > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && > > + operation != CMD_DELETE) > > + ereport(ERROR, > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > + errmsg("cannot modify or insert data into conflict log table \"%s\"", > > + RelationGetRelationName(resultRel)), > > + errdetail("Conflict log tables are system-managed and only support > > cleanup via DELETE or TRUNCATE."))); > > > > It somehow feels backwards to check "operation != CMD_DELETE", with > > the obscure comment that TRUNCATE is handled elsewhere. > > > > How about just check if "(operation == CMD_INSERT || operation == > > CMD_UPDATE || operation == CMD_MERGE)". > > I felt the existing is ok here, as it is mentioned "we only need to > explicitly permit CMD_DELETE" . Are you seeing any commands other than > INSERT, UPDATE & MERGE possible here? 9. YMMV. No, I'm not seeing other commands. I guess the current code works. My previous review comment was because: 1. IMO, conditions that are positive instead of negative are easier to comprehend 2. It would make the checking code consistent with the comment “(INSERT, UPDATE, MERGE) is prohibited”, and with the error message “cannot modify or insert”. 3. Doing it the suggested way eliminates any need to mention that strange comment “Since TRUNCATE…” CheckValidRowMarkRel: 10. + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("cannot lock rows in conflict log table \"%s\"", Should that say "in the"? ====== src/backend/replication/logical/conflict.c > > 13c. 
> > TBH, I preferred code how it used to be -- where all the CLT constants > > and structs and enums and schemas were kept together. Now they are > > split across conflict.h and conflict.c making it harder to read as > > well as introducing need for static asserts that were not needed > > before. > > No change done, as this change is required. Amit has given the > explanation at [1]. > By refactoring the conflict functions into conflict.c, it means nearly everything is now kept together anyhow, just in the .c file instead of the .h file :-) ~~~ 11. +StaticAssertDecl(lengthof(ConflictLogSchema) == NUM_CONFLICT_ATTRS, + "ConflictLogSchema length mismatch"); + + 11a. In fact, NUM_CONFLICT_ATTRS is not used outside this file, so now it can be defined right here. It means the assertion is unnecessary. Instead, the code here should look like: #define NUM_CONFLICT_ATTRS lengthof(ConflictLogSchema) ~ 11b. Unnecessary extra whitespace here. ~~~ create_conflict_log_table: 12. + Assert(relid != InvalidOid); Favour using the macro OidIsValid(relid). ====== src/include/catalog/pg_subscription.h 13. #include "catalog/objectaddress.h" #include "parser/parse_node.h" +#include "replication/conflict.h" I am guessing that this #include is probably no longer needed, because you removed the extern function that was using ConflictLogDest. ====== src/include/replication/conflict.h 14. +/* Structure to hold metadata for one column of the conflict log table */ +typedef struct ConflictLogColumnDef +{ + const char *attname; /* Column name */ + Oid atttypid; /* Data type OID */ +} ConflictLogColumnDef; + AFAIK, you can move this into conflict.c now because it is only used in that file. ~~~ 15. +/* The single source of truth for the conflict log table schema */ +extern PGDLLIMPORT const ConflictLogColumnDef ConflictLogSchema[]; + AFAIK, you can remove this because all usages are now within conflict.c. ~~~ 16. 
+#define NUM_CONFLICT_ATTRS 11 + Move this into conflict.c -- see an earlier review comment. ====== Kind Regards, Peter Smith. Fujitsu Australia -
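Comments 11a and 16 suggest deriving the column count from the schema array instead of maintaining a separate constant plus a static assertion. A minimal sketch of that idea follows. The column names and their order are copied from the array quoted earlier in the thread (later patch versions may order them differently); the `Sketch`/`sk_` names, the textual type names, and the lookup helper are hypothetical, and `lengthof` is defined locally only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local definition so the sketch compiles on its own. */
#define lengthof(array) (sizeof(array) / sizeof((array)[0]))

/* Hypothetical column descriptor; the patch stores a type OID instead. */
typedef struct SketchColumnDef
{
    const char *attname;    /* column name */
    const char *typname;    /* type name as text, for illustration only */
} SketchColumnDef;

static const SketchColumnDef SketchConflictLogSchema[] = {
    {.attname = "relid", .typname = "oid"},
    {.attname = "schemaname", .typname = "text"},
    {.attname = "relname", .typname = "text"},
    {.attname = "conflict_type", .typname = "text"},
    {.attname = "remote_xid", .typname = "xid"},
    {.attname = "remote_commit_lsn", .typname = "pg_lsn"},
    {.attname = "remote_commit_ts", .typname = "timestamptz"},
    {.attname = "remote_origin", .typname = "text"},
    {.attname = "replica_identity", .typname = "json"},
    {.attname = "remote_tuple", .typname = "json"},
    {.attname = "local_conflicts", .typname = "json[]"}
};

/* The count follows the array, so the two can never drift apart. */
#define SK_NUM_CONFLICT_ATTRS lengthof(SketchConflictLogSchema)

/* Return the zero-based position of a column, or -1 if it is not present. */
int
sk_conflict_attnum(const char *attname)
{
    assert(attname != NULL);

    for (size_t i = 0; i < SK_NUM_CONFLICT_ATTRS; i++)
    {
        if (strcmp(SketchConflictLogSchema[i].attname, attname) == 0)
            return (int) i;
    }
    return -1;
}
```

A sizeof-based count only works where the full array definition is visible, which is why this approach depends on the schema and all of its users living in the same .c file, as the review argues they now do.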
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-19T14:00:43Z
On Fri, 15 May 2026 at 15:59, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Thanks for the patches. Please find below comments for v34 patch-set. > > patch-003: > 4) conflict.c: ReportApplyConflict() > + bool log_dest_clt = false; > + bool log_dest_logfile; > > log_dest_logfile should also be initialized to false, since for dest > == CONFLICT_LOG_DEST_TABLE, it is never assigned. It no longer needs to be initialized, as it is now assigned before it is used in this function. > 5) worker_internal.h > extern PGDLLIMPORT List *table_states_not_ready; > > +extern XLogRecPtr remote_final_lsn; > +extern TimestampTz remote_commit_ts; > +extern TransactionId remote_xid; > > Should these new declarations also use PGDLLIMPORT? I think these don't require PGDLLIMPORT, as they will be used only within the same apply worker backend process. Rest of the comments are handled, the attached v36 version patches have the changes for the same. Also the comment from [1] has been fixed in this version. [1] - https://www.postgresql.org/message-id/CABdArM5XgHE4-HCryi54BxENgNqLDn81cMCUyqBdCeF9d3dbvA%40mail.gmail.com Regards, Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-20T06:32:11Z
On Tue, May 19, 2026 at 7:30 PM vignesh C <vignesh21@gmail.com> wrote: > > Rest of the comments are handled, the attached v36 version patches > have the changes for the same. > Also the comment from [1] has been fixed in this version. > Thanks Vignesh. A few comments for 0001 and 0002 combined (I merged them and reviewed for ease of review) 1) + * IsConflictLogTableClass + * True iff namespace is pg_conflict. + * + * Does not perform any catalog accesses. */ bool -IsConflictClass(Form_pg_class reltuple) +IsConflictLogTableClass(Form_pg_class reltuple) I think this function is trying to find whether the reltuple is a CLT rather than whether the namespace is pg_conflict. We should change this comment. See IsToastRelation, IsToastClass. Suggestion: True iff Form_pg_class tuple represents a subscription-specific Conflict Log Table. 2) Both DropSubscription and AlterSubscription have the code below to drop the CLT: + if (OidIsValid(subconflictlogrelid)) + { + char *conflictrelname = get_rel_name(subconflictlogrelid); + + /* + * Conflict log tables are recorded as internal dependencies of the + * subscription. We must drop the dependent objects before the + * subscription itself is removed. By using + * PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the conflict log + * table is reaped while the subscription remains for the final + * deletion step. + */ + ObjectAddressSet(object, SubscriptionRelationId, subid); + performDeletion(&object, DROP_CASCADE, + PERFORM_DELETION_INTERNAL | + PERFORM_DELETION_SKIP_ORIGINAL); + + ereport(NOTICE, + errmsg("dropped conflict log table \"%s\" for subscription \"%s\"", + get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname), + subname)); + } Why don't we create a function drop_conflict_log_table(subconflictlogrelid) and use it in both places? 3) +++ b/src/backend/commands/subscriptioncmds.c +#include "catalog/heap.h" +#include "catalog/pg_am_d.h" It compiles now without these inclusions. 0002 should remove these as well. 
4) AlterSubscription: + bool want_table = (opts.conflictlogdest == CONFLICT_LOG_DEST_TABLE || + opts.conflictlogdest == CONFLICT_LOG_DEST_ALL); + bool has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE || + old_dest == CONFLICT_LOG_DEST_ALL); Shall we replace the checks at both places with CONFLICTS_LOGGED_TO_TABLE? ~~ 0003, 0004: No comments ~~ Reviewing further. thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-20T09:35:13Z
On Tue, 19 May 2026 at 12:02, Peter Smith <smithpb2250@gmail.com> wrote: > > On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Vignesh. > > Thanks for addressing lots of my previous v33-0001 review comments. > > Here are some more review comments for the combined v35-0001/0002 patches. > > ====== > Commit message. > > 1. > If the user chooses to enable logging to a table (by selecting 'table' > or 'all'), > an internal logging table named pg_conflict_log_<subid> is automatically > created within a dedicated, system-managed 'pg_conflict' namespace to prevent > users from manually dropping or altering it. This also prevents accidental > name collisions with user-created tables. This table is linked to the > subscription via an internal dependency, ensuring it is automatically dropped > when the subscription is removed > > ~ > > The internal name of the CLT table has changed slightly, so the commit > message needs updating. This change is done as part of 0002 review comment fixes patch. I will let Dilip do this change when he merges the review comment fixes patch to 0001 patch. > > > ====== > > > src/backend/executor/execMain.c > > > > > > 11. > > > + > > > + /* > > > + * Conflict log tables are managed by the system to record logical > > > + * replication conflicts. We allow DELETE and TRUNCATE to permit users to > > > + * manually prune these logs, but manual data insertion or modification > > > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the > > > + * system-generated logs. > > > + * > > > + * Since TRUNCATE is handled as a separate utility command, we only need > > > + * to explicitly permit CMD_DELETE here. 
> > > + */ > > > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && > > > + operation != CMD_DELETE) > > > + ereport(ERROR, > > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > > + errmsg("cannot modify or insert data into conflict log table \"%s\"", > > > + RelationGetRelationName(resultRel)), > > > + errdetail("Conflict log tables are system-managed and only support > > > cleanup via DELETE or TRUNCATE."))); > > > > > > It somehow feels backwards to check "operation != CMD_DELETE", with > > > the obscure comment that TRUNCATE is handled elsewhere. > > > > > > How about just check if "(operation == CMD_INSERT || operation == > > > CMD_UPDATE || operation == CMD_MERGE)". > > > > I felt the existing is ok here, as it is mentioned "we only need to > > explicitly permit CMD_DELETE" . Are you seeing any commands other than > > INSERT, UPDATE & MERGE possible here? > > 9. > YMMV. > > No, I'm not seeing other commands. I guess the current code works. I preferred the current way in this case. > ====== > src/backend/replication/logical/conflict.c > > > > 13c. > > > TBH, I preferred code how it used to be -- where all the CLT constants > > > and structs and enums and schemas were kept together. Now they are > > > split across conflict.h and conflict.c making it harder to read as > > > well as introducing need for static asserts that were not needed > > > before. > > > > No change done, as this change is required. Amit has given the > > explanation at [1]. > > > > By refactoring the conflict functions into conflict.c, it means nearly > everything is now kept together anyhow, just in the .c file instead of > the .h file :-) No change done here because of the reason stated in the earlier mail. Rest of the comments were fixed. The attached v37 version patch has the changes for the same. Also Peter's comments on the documentation patch from [1] and Shveta's comments from [2] are addressed in the attached patch. 
[1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-20T10:42:02Z
On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > Rest of the comments were fixed. > The attached v37 version patch has the changes for the same. Also > Peter's comments on the documentation patch from [1] and Shveta's > comments from [2] are addressed in the attached patch. > > [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com > [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com > I have not yet looked at v37. But here are a few comments on v36-005, 006. I have merged them and reviewed together. 1) +#include "utils/fmgroids.h" +#include "utils/json.h" conflict.c compiles without above inclusions. 2) + bool log_dest_clt = false; + bool log_dest_logfile; A better and more clear name would be log_dest_table instead of log_dest_clt here. 3) @@ -6069,6 +6049,8 @@ DisableSubscriptionAndExit(void) */ pgstat_report_subscription_error(MyLogicalRepWorker->subid); + ProcessPendingConflictLogTuple(); It does not look obvious as in why we are trying to process conflict-tuple during disable-subscription? A comment will help here. 4) tuple_table_slot_to_indextup_json(): + indexDesc = index_open(indexid, NoLock); + + build_index_datums_from_slot(estate, localrel, slot, indexDesc, values, + isnull); + tupdesc = RelationGetDescr(indexDesc); + + /* Bless the tupdesc so it can be looked up by row_to_json. */ + BlessTupleDesc(tupdesc); We get the index's relcache pointer and pass it directly to BlessTupleDesc which internally changes it by assigning tdtypmod. Is this intentional i.e. do we want to change the relcache entry of index directly? Shouldn't we copy it (CreateTupleDescCopy) and then Bless it? 5) build_conflict_tupledesc() does 'CreateTemplateTupleDesc' and Bless it each time the conflict is raised. 
Since the tuple descriptor here is not going to change, IMO, it will be better to create and bless it once and reuse it every time. We can have a 'static' TupleDesc here. Thoughts? thanks Shveta
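Comment 5 above proposes building the descriptor once and reusing it. A minimal sketch of that build-once pattern follows, assuming nothing about the real TupleDesc API: `SketchTupleDesc`, the `sk_` functions, and the attribute count are hypothetical stand-ins, and the call counter exists only so the behavior can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary attribute count used only for this sketch. */
#define SK_NATTS 3

/* Hypothetical stand-in for a tuple descriptor. */
typedef struct SketchTupleDesc
{
    int  natts;
    bool blessed;   /* stands in for the effect of BlessTupleDesc() */
} SketchTupleDesc;

static SketchTupleDesc sk_cached_desc;
static bool sk_cached_desc_valid = false;
static int  sk_build_calls = 0;

/* The expensive part: create the descriptor and bless it. */
static void
sk_build_conflict_tupdesc(SketchTupleDesc *desc)
{
    desc->natts = SK_NATTS;
    desc->blessed = true;
    sk_build_calls++;
}

/* Build on first use, then hand back the same descriptor every time. */
const SketchTupleDesc *
sk_get_conflict_tupdesc(void)
{
    if (!sk_cached_desc_valid)
    {
        sk_build_conflict_tupdesc(&sk_cached_desc);
        sk_cached_desc_valid = true;
    }

    assert(sk_cached_desc.blessed);
    return &sk_cached_desc;
}

/* Exposed so callers can see how often the descriptor was built. */
int
sk_build_call_count(void)
{
    return sk_build_calls;
}
```

In the backend, a cached descriptor would presumably have to be allocated in a memory context that outlives individual transactions; the plain static storage here sidesteps that question and is not meant to answer it.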
-
Re: Proposal: Conflict log history table for Logical Replication
Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-20T10:50:34Z
On Wed, 20 May 2026 at 15:05, vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 19 May 2026 at 12:02, Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Vignesh. > > > > Thanks for addressing lots of my previous v33-0001 review comments. > > > > Here are some more review comments for the combined v35-0001/0002 patches. > > > > ====== > > Commit message. > > > > 1. > > If the user chooses to enable logging to a table (by selecting 'table' > > or 'all'), > > an internal logging table named pg_conflict_log_<subid> is automatically > > created within a dedicated, system-managed 'pg_conflict' namespace to prevent > > users from manually dropping or altering it. This also prevents accidental > > name collisions with user-created tables. This table is linked to the > > subscription via an internal dependency, ensuring it is automatically dropped > > when the subscription is removed > > > > ~ > > > > The internal name of the CLT table has changed slightly, so the commit > > message needs updating. > > This change is done as part of 0002 review comment fixes patch. I will > let Dilip do this change when he merges the review comment fixes patch > to 0001 patch. > > > > > ====== > > > > src/backend/executor/execMain.c > > > > > > > > 11. > > > > + > > > > + /* > > > > + * Conflict log tables are managed by the system to record logical > > > > + * replication conflicts. We allow DELETE and TRUNCATE to permit users to > > > > + * manually prune these logs, but manual data insertion or modification > > > > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the > > > > + * system-generated logs. > > > > + * > > > > + * Since TRUNCATE is handled as a separate utility command, we only need > > > > + * to explicitly permit CMD_DELETE here. 
> > > > + */ > > > > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && > > > > + operation != CMD_DELETE) > > > > + ereport(ERROR, > > > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > > > + errmsg("cannot modify or insert data into conflict log table \"%s\"", > > > > + RelationGetRelationName(resultRel)), > > > > + errdetail("Conflict log tables are system-managed and only support > > > > cleanup via DELETE or TRUNCATE."))); > > > > > > > > It somehow feels backwards to check "operation != CMD_DELETE", with > > > > the obscure comment that TRUNCATE is handled elsewhere. > > > > > > > > How about just check if "(operation == CMD_INSERT || operation == > > > > CMD_UPDATE || operation == CMD_MERGE)". > > > > > > I felt the existing is ok here, as it is mentioned "we only need to > > > explicitly permit CMD_DELETE" . Are you seeing any commands other than > > > INSERT, UPDATE & MERGE possible here? > > > > 9. > > YMMV. > > > > No, I'm not seeing other commands. I guess the current code works. > > I preferred the current way in this case. > > > ====== > > src/backend/replication/logical/conflict.c > > > > > > 13c. > > > > TBH, I preferred code how it used to be -- where all the CLT constants > > > > and structs and enums and schemas were kept together. Now they are > > > > split across conflict.h and conflict.c making it harder to read as > > > > well as introducing need for static asserts that were not needed > > > > before. > > > > > > No change done, as this change is required. Amit has given the > > > explanation at [1]. > > > > > > > By refactoring the conflict functions into conflict.c, it means nearly > > everything is now kept together anyhow, just in the .c file instead of > > the .h file :-) > > No change done here because of the reason stated in the earlier mail. > > Rest of the comments were fixed. > The attached v37 version patch has the changes for the same. 
Also > Peter's comments on the documentation patch from [1] and Shveta's > comments from [2] are addressed in the attached patch. > > [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com > [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com > Hi Vignesh, Here are some minor comments: Comment for all patches. 1. At multiple places (code comments and test cases) we are using the word 'internal conflict log table'. Do we need to use the word 'internal'? I think using 'conflict log table' is sufficient? Comments for 0002: 2. We can rename the schema pg_conflict to a different schema name. Is it ok to hardcode the schema name to 'pg_conflict'? - errmsg("cannot move objects into or out of CONFLICT schema"))); + errmsg("cannot move objects into or out of pg_conflict schema"))); Example: postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1; ALTER SCHEMA postgres=# ALTER TABLE t2 SET SCHEMA sc1; ERROR: cannot move objects into or out of pg_conflict schema Comment for 0005/0006: 3. static const char *const ConflictTypeNames[] = { [CT_INSERT_EXISTS] = "insert_exists", [CT_UPDATE_ORIGIN_DIFFERS] = "update_origin_differs", [CT_UPDATE_EXISTS] = "update_exists", [CT_UPDATE_MISSING] = "update_missing", [CT_DELETE_ORIGIN_DIFFERS] = "delete_origin_differs", [CT_UPDATE_DELETED] = "update_deleted", [CT_DELETE_MISSING] = "delete_missing", [CT_MULTIPLE_UNIQUE_CONFLICTS] = "multiple_unique_conflicts" }; There are a few extra blank lines after declaration of ConflictTypeNames. Thanks, Shlok Kyal -
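The execMain.c exchange quoted in the message above compares two ways of writing the same guard. The sketch below is only a standalone model of that comparison, not the patch code: the `CmdType` enum is a stand-in that mirrors the names in PostgreSQL's nodes/nodes.h, and both function names are hypothetical.

```c
#include <stdbool.h>

/*
 * Stand-in for PostgreSQL's CmdType; declared locally so the sketch compiles
 * on its own. Only the names matter here.
 */
typedef enum CmdType
{
    CMD_UNKNOWN,
    CMD_SELECT,
    CMD_UPDATE,
    CMD_INSERT,
    CMD_DELETE,
    CMD_MERGE,
    CMD_UTILITY,
    CMD_NOTHING
} CmdType;

/* Form used in the patch: reject everything except DELETE (hypothetical name). */
static bool
rejected_by_patch_check(CmdType operation)
{
    return operation != CMD_DELETE;
}

/* Form suggested in review: reject only the listed commands (hypothetical name). */
static bool
rejected_by_explicit_check(CmdType operation)
{
    return operation == CMD_INSERT ||
        operation == CMD_UPDATE ||
        operation == CMD_MERGE;
}
```

The two predicates agree for INSERT, UPDATE, MERGE and DELETE, which per the discussion are the only commands expected at this point. They differ for any other value, so the choice is about which form states the intent more clearly.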
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-21T00:02:25Z
On Wed, May 20, 2026 at 8:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
..
> Comments for 0002:
> 2. We can rename the schema pg_conflict to a different schema name.
> Is it ok to hardcode the schema name to 'pg_conflict'?
> - errmsg("cannot move objects into or out of CONFLICT schema")));
> + errmsg("cannot move objects into or out of pg_conflict schema")));
>
> Example:
> postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1;
> ALTER SCHEMA
> postgres=# ALTER TABLE t2 SET SCHEMA sc1;
> ERROR: cannot move objects into or out of pg_conflict schema
>

Yikes!

I am not sure that the error message is the problem here. There are worse things that are similar to this. e.g. I found that you can do the same trick of renaming the 'pg_catalog' schema, and it breaks anything that refers to that schema by name -- all the internal SQL!!

test_pub=# ALTER SCHEMA pg_catalog RENAME TO mycatalog;
ALTER SCHEMA
test_pub=# \dRp+
ERROR: relation "pg_catalog.pg_publication" does not exist
LINE 9: FROM pg_catalog.pg_publication
             ^

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-21T01:19:18Z
Hi Vignesh.

I checked the latest v37-0001/0002 patches combined. My only comment is below.

======

1.
+/*
+ * drop_conflict_log_table
+ *    Drop the conflict log table associated with a subscription.
+ *
+ * The conflict log table is registered as an internal dependency of the
+ * subscription. This function removes the dependency by performing a
+ * cascading deletion on the subscription object, which in turn drops the
+ * associated conflict log table.
+ *
+ * This is used to clean up conflict log tables that are no longer required,
+ * preventing accumulation of stale or orphaned relations.
+ *
+ * NOTE:
+ * Only conflict log tables are currently managed via this internal dependency
+ * mechanism. If additional internal dependencies are introduced in future,
+ * this function may require refinement to avoid unintended deletions.
+ */
+void
+drop_conflict_log_table(Oid subid, char *subname, Oid subconflictlogrelid)
+{
+    ObjectAddress object;
+    char *conflictrelname;
+
+    conflictrelname = get_rel_name(subconflictlogrelid);
+
+    ObjectAddressSet(object, SubscriptionRelationId, subid);
+    performDeletion(&object, DROP_CASCADE,
+                    PERFORM_DELETION_INTERNAL |
+                    PERFORM_DELETION_SKIP_ORIGINAL);
+
+    ereport(NOTICE,
+            errmsg("dropped conflict log table \"%s\" for subscription \"%s\"",
+                   get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname),
+                   subname));
+}
+

IIUC, this is a function that drops the subscription dependencies via cascade. Since the CLT happens to be the only such dependency, it gets dropped.

The current implementation feels backwards to me. IMO, this is really a subscription function, so it should be refactored to be called something like 'drop_subscription_dependencies', and not be in the conflict.c file.

Refactoring/renaming to what it *really* does means you won't need to give the other warnings like "may require refinement to avoid unintended deletions".

Maybe the callers do not need to be guarded anymore -- this code can check internally so that it only does anything when there is a known CLT associated with the subscription.

Also, the function comment should make it clearer that PERFORM_DELETION_SKIP_ORIGINAL means the parent subscription object is not deleted.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-21T03:59:52Z
Hi Vignesh.

Thanks for addressing my review comments for the documentation.

Here is one more comment for the v37-0008/0009 (combined) docs patches.

======
doc/src/sgml/logical-replication.sgml

1.
+ <row>
+  <entry><literal>replica_identity</literal></entry>
+  <entry><type>json</type></entry>
+  <entry>The JSON representation of the replica identity.</entry>
+ </row>
+ <row>

I think patch 0002 modified the CLT column order. This doc's table row order should match the order of the CLT columns, so please compare again with the schema defined by the latest conflict.c.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-21T04:57:10Z
Hi Vignesh

Some trivial review comments for the combined v37-0003/0004 (transfer ownership) patches.

======
src/test/regress/sql/subscription.sql

1.
+ALTER SUBSCRIPTION regress_conflict_test1 owner to regress_subscription_user2;

/owner to/OWNER TO/

~~~

2.
+-- Restore the original subscription owner.
+ALTER SUBSCRIPTION regress_conflict_test1 owner to regress_subscription_user;

/owner to/OWNER TO/

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-21T05:51:02Z
Hi Vignesh,

Some minor review comments for patches v37-0005/0006 combined.

======
src/backend/replication/logical/conflict.c

1.
+/* Schema for the elements within the 'local_conflicts' JSON array */
+static const ConflictLogColumnDef LocalConflictSchema[] =
+{
+    { .attname = "xid", .atttypid = XIDOID },
+    { .attname = "commit_ts", .atttypid = TIMESTAMPTZOID },
+    { .attname = "origin", .atttypid = TEXTOID },
+    { .attname = "key", .atttypid = JSONOID },
+    { .attname = "tuple", .atttypid = JSONOID }
+};
+
+#define NUM_LOCAL_CONFLICT_ATTRS lengthof(LocalConflictSchema)
+

IMO this belongs *below* the ConflictLogSchema[], which is where 'local_conflicts' attribute was introduced, instead of above it.

~~~

2.
+
+
 static int errcode_apply_conflict(ConflictType type);

~

There are some spurious blank lines here that should not be in the patch.

~~~

ProcessPendingConflictLogTuple:

3.
+ /* Open conflict log table and insert the tuple */
+ conflictlogrel = GetConflictLogDestAndTable(&dest);
+ Assert(CONFLICTS_LOGGED_TO_TABLE(dest));

Maybe here it's better to say Assert(conflictlogrel);

======
Kind Regards,
Peter Smith.
Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-21T06:01:29Z
Amit, Vignesh,

A part of 007 patch is about preserving subscription-oid. Another thread (origin migration) also needs the same logic as per discussion at [1]. And there was an old thread which already attempted preserving subscription-oid at [2], but the idea was rejected at that time. Why don't we attempt to resume the same thread ([2]) and implement preserving subscription-oid as a separate thread as we now have multiple dependencies on it?

Thoughts?

[1]: https://www.postgresql.org/message-id/CALDaNm2-uwpbJ8fnrssp%2BhORvOutsqRoZAsa05xVVzXe5Bt3bw%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/flat/CALDaNm2Wj63VcbB0SY2NECHr1mKM1YSaV1ZydrdQVxyox2O2hg%40mail.gmail.com

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-21T07:08:28Z
A few comments on v36-007:

1)
AlterSubscriptionConflictLogDestination

+ want_table = (logdest == CONFLICT_LOG_DEST_TABLE ||
+               logdest == CONFLICT_LOG_DEST_ALL);
+ has_oldtable = (old_dest == CONFLICT_LOG_DEST_TABLE ||
+                 old_dest == CONFLICT_LOG_DEST_ALL);

Shall we replace checks at both places with CONFLICTS_LOGGED_TO_TABLE?

2)
I think we can move 'AlterSubscriptionConflictLogDestination' into the configuration patch itself (if needed). It is not directly used anywhere in upgrade flow as such. IIUC, even if upgrade flow uses it, it will only be used through AlterSubscription.

3)
AlterSubscriptionConflictLogDestination:

+ if (want_table && !has_oldtable)
+ {
+     char relname[NAMEDATALEN];
+
+     snprintf(relname, NAMEDATALEN, "pg_conflict_log_for_subid_%u", sub->oid);
+
+     /*
+      * In upgrade scenarios, the conflict log table already exists. Update
+      * the catalog to record the association.
+      */
+     relid = get_relname_relid(relname, PG_CONFLICT_NAMESPACE);
+     if (!OidIsValid(relid))
+         relid = create_conflict_log_table(sub->oid, sub->name, sub->owner);

So this function will now be used during upgrade where destination is TABLE/ALL as well as regular Alter-Subscription to change destination from LOG to TABLE/ALL. In upgrade case, we expect the relid (CLT) to be present already while in regular case, we don't expect any CLT to be present. The above code does not take care of maintaining the sanity checks. It should be able to distinguish the 2 cases and Assert/Error if the condition is opposed to what we expect.

4)
Also, I do not understand how upgrade can ever pass this check:

+ if (want_table && !has_oldtable)

It is not obvious how the upgrade flow will pass this check because theoretically both the old and new setup should have the exact same configuration; i.e. if 'want_table' is true, 'has_oldtable' will be true. We can add a comment to clarify the situation here.

thanks
Shveta
-
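The destination transitions discussed in comments 1), 3) and 4) above can be modelled in isolation. This is a minimal sketch under stated assumptions, not the patch's implementation: `CONFLICT_LOG_DEST_LOG`, the definition given here for `CONFLICTS_LOGGED_TO_TABLE`, the `CltAction` enum and `clt_action_for_change` are all hypothetical; only `CONFLICT_LOG_DEST_TABLE`, `CONFLICT_LOG_DEST_ALL` and the macro's name appear in the thread.

```c
#include <stdbool.h>

/* Stand-in for the patch's destination enum; CONFLICT_LOG_DEST_LOG is assumed. */
typedef enum ConflictLogDest
{
    CONFLICT_LOG_DEST_LOG,
    CONFLICT_LOG_DEST_TABLE,
    CONFLICT_LOG_DEST_ALL
} ConflictLogDest;

/* Assumed definition; the thread only shows the macro being used. */
#define CONFLICTS_LOGGED_TO_TABLE(dest) \
    ((dest) == CONFLICT_LOG_DEST_TABLE || (dest) == CONFLICT_LOG_DEST_ALL)

/* Hypothetical action names, for illustration only. */
typedef enum CltAction
{
    CLT_NO_CHANGE,
    CLT_CREATE,
    CLT_DROP
} CltAction;

/* Decide what should happen to the conflict log table on a destination change. */
static CltAction
clt_action_for_change(ConflictLogDest old_dest, ConflictLogDest new_dest)
{
    bool has_oldtable = CONFLICTS_LOGGED_TO_TABLE(old_dest);
    bool want_table = CONFLICTS_LOGGED_TO_TABLE(new_dest);

    if (has_oldtable && !want_table)
        return CLT_DROP;        /* a table exists but is no longer wanted */
    if (!has_oldtable && want_table)
        return CLT_CREATE;      /* a table is wanted and none was associated */
    return CLT_NO_CHANGE;       /* table <-> all, or log -> log */
}
```

In this model a subscription whose old and new destinations both log to a table never reaches the create branch, which is the situation comment 4) asks to have explained for the upgrade path.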
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-21T07:11:03Z
On Thu, 21 May 2026 at 05:32, Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 8:50 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> ..
> > Comments for 0002:
> > 2. We can rename the schema pg_conflict to a different schema name.
> > Is it ok to hardcode the schema name to 'pg_conflict'?
> > - errmsg("cannot move objects into or out of CONFLICT schema")));
> > + errmsg("cannot move objects into or out of pg_conflict schema")));
> >
> > Example:
> > postgres=# ALTER SCHEMA pg_conflict RENAME TO sc1;
> > ALTER SCHEMA
> > postgres=# ALTER TABLE t2 SET SCHEMA sc1;
> > ERROR: cannot move objects into or out of pg_conflict schema
> >
>
> Yikes!
>
> I am not sure that the error message is the problem here. There are
> worse things that are similar to this. e.g. I found that you can do
> the same trick of renaming the 'pg_catalog' schema, and it breaks
> anything that refers to that schema by name -- all the internal SQL!!
>
> test_pub=# ALTER SCHEMA pg_catalog RENAME TO mycatalog;
> ALTER SCHEMA
> test_pub=# \dRp+
> ERROR: relation "pg_catalog.pg_publication" does not exist
> LINE 9: FROM pg_catalog.pg_publication
>              ^

I noticed this behavior with several other commands as well. For example:

postgres=# ALTER SCHEMA pg_catalog RENAME TO test;
ALTER SCHEMA
postgres=# \d
ERROR: relation "pg_catalog.pg_class" does not exist
LINE 6: FROM pg_catalog.pg_class c
             ^
postgres=# \dn
ERROR: relation "pg_catalog.pg_namespace" does not exist
LINE 4: FROM pg_catalog.pg_namespace n
             ^

I observed similar behavior when creating a table in the renamed schema:

postgres=# CREATE TABLE test.t1(c1 int);
ERROR: schema "pg_catalog" does not exist
LINE 1: CREATE TABLE test.t1(c1 int);
                     ^

Given that this appears to be a broader issue related to renaming pg_catalog, I think we can skip handling this case for now. If we decide to address it, it would be better to handle it together with the general pg_catalog rename behavior.

Regards,
Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
Shlok Kyal <shlok.kyal.oss@gmail.com> — 2026-05-21T07:13:15Z
On Wed, 20 May 2026 at 15:05, vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 19 May 2026 at 12:02, Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Mon, May 18, 2026 at 10:35 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Wed, 13 May 2026 at 11:43, Peter Smith <smithpb2250@gmail.com> wrote: > > > > Hi Vignesh. > > > > Thanks for addressing lots of my previous v33-0001 review comments. > > > > Here are some more review comments for the combined v35-0001/0002 patches. > > > > ====== > > Commit message. > > > > 1. > > If the user chooses to enable logging to a table (by selecting 'table' > > or 'all'), > > an internal logging table named pg_conflict_log_<subid> is automatically > > created within a dedicated, system-managed 'pg_conflict' namespace to prevent > > users from manually dropping or altering it. This also prevents accidental > > name collisions with user-created tables. This table is linked to the > > subscription via an internal dependency, ensuring it is automatically dropped > > when the subscription is removed > > > > ~ > > > > The internal name of the CLT table has changed slightly, so the commit > > message needs updating. > > This change is done as part of 0002 review comment fixes patch. I will > let Dilip do this change when he merges the review comment fixes patch > to 0001 patch. > > > > > ====== > > > > src/backend/executor/execMain.c > > > > > > > > 11. > > > > + > > > > + /* > > > > + * Conflict log tables are managed by the system to record logical > > > > + * replication conflicts. We allow DELETE and TRUNCATE to permit users to > > > > + * manually prune these logs, but manual data insertion or modification > > > > + * (INSERT, UPDATE, MERGE) is prohibited to maintain the integrity of the > > > > + * system-generated logs. > > > > + * > > > > + * Since TRUNCATE is handled as a separate utility command, we only need > > > > + * to explicitly permit CMD_DELETE here. 
> > > > + */ > > > > + if (IsConflictNamespace(RelationGetNamespace(resultRel)) && > > > > + operation != CMD_DELETE) > > > > + ereport(ERROR, > > > > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > > > > + errmsg("cannot modify or insert data into conflict log table \"%s\"", > > > > + RelationGetRelationName(resultRel)), > > > > + errdetail("Conflict log tables are system-managed and only support > > > > cleanup via DELETE or TRUNCATE."))); > > > > > > > > It somehow feels backwards to check "operation != CMD_DELETE", with > > > > the obscure comment that TRUNCATE is handled elsewhere. > > > > > > > > How about just check if "(operation == CMD_INSERT || operation == > > > > CMD_UPDATE || operation == CMD_MERGE)". > > > > > > I felt the existing is ok here, as it is mentioned "we only need to > > > explicitly permit CMD_DELETE" . Are you seeing any commands other than > > > INSERT, UPDATE & MERGE possible here? > > > > 9. > > YMMV. > > > > No, I'm not seeing other commands. I guess the current code works. > > I preferred the current way in this case. > > > ====== > > src/backend/replication/logical/conflict.c > > > > > > 13c. > > > > TBH, I preferred code how it used to be -- where all the CLT constants > > > > and structs and enums and schemas were kept together. Now they are > > > > split across conflict.h and conflict.c making it harder to read as > > > > well as introducing need for static asserts that were not needed > > > > before. > > > > > > No change done, as this change is required. Amit has given the > > > explanation at [1]. > > > > > > > By refactoring the conflict functions into conflict.c, it means nearly > > everything is now kept together anyhow, just in the .c file instead of > > the .h file :-) > > No change done here because of the reason stated in the earlier mail. > > Rest of the comments were fixed. > The attached v37 version patch has the changes for the same. 
Also > Peter's comments on the documentation patch from [1] and Shveta's > comments from [2] are addressed in the attached patch. > > [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com > [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com > Hi Vignesh, I reviewed v37-0007 patch. Here is some review comments: 1. subinfo[i].subconflictlogdest is assigned multiple times: + if (PQgetisnull(res, i, i_sublogdestination)) + subinfo[i].subconflictlogdest = NULL; + else + subinfo[i].subconflictlogdest = + pg_strdup(PQgetvalue(res, i, i_sublogdestination)); + + if (PQgetisnull(res, i, i_sublogdestination)) + subinfo[i].subconflictlogdest = NULL; + else + subinfo[i].subconflictlogdest = + pg_strdup(PQgetvalue(res, i, i_sublogdestination)); 2. I think we should add a version check before: + appendPQExpBuffer(query, + "\n\nALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n", + qsubname, + subinfo->subconflictlogdest); When we run pg_dump on a server with Postgres 18, we get the following output. ALTER SUBSCRIPTION sub2 SET (conflict_log_destination = (null)); Thanks, Shlok Kyal -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-21T10:39:23Z
Few comments on doc patches v36-008 and 009 combined:

1)
+ An array of JSON objects representing the local state for each conflict attempt.

'each conflict attempt' looks misleading. We do not attempt to cause conflicts; we attempt to apply, but it may result in conflicts. Shall we rephrase to:

'An array of JSON objects representing the state of existing local row(s) that caused the conflict.'

There could be multiple rows as well for multiple_unique_conflicts, thus the 'row(s)'.

2)
+ The <link linkend="sql-createsubscription-params-with-conflict-log-destination"><literal>conflict_log_destination</literal></link>
+ parameter automatically creates a dedicated conflict log table.

'conflict_log_destination' parameter does not create the table automatically unless it is set to table. We shall clarify it.

The conflict_log_destination when set to table or all automatically creates a dedicated conflict log table.

3)
+ Conflicts that occur during replication are, by default, logged as plain text

When we say 'Conflicts' here, we shall make it a link to '29.8. Conflicts' chapter. That way it will be more clear.

thanks
Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-22T04:51:21Z
On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Rest of the comments were fixed.
> The attached v37 version patch has the changes for the same. Also
> Peter's comments on the documentation patch from [1] and Shveta's
> comments from [2] are addressed in the attached patch.
>

Here are a few comments based on v37 testing:

1) Should we consider using TOAST tables for tuple-data columns like remote_tuple and local_conflicts (the JSON columns)?

This may be a corner case, but if the tuple data becomes too large to fit into an 8KB heap tuple, then the apply worker keeps failing while inserting into the CLT with errors like:

ERROR: row is too big: size 19496, maximum size 8160
LOG: background worker "logical replication apply worker" (PID 41226) exited with exit code 1

Noticed that even disable_on_error=true does not disable the subscription in this case.

We can think about optimizations such as deciding when TOAST tables should be created, or avoiding the error by trimming/capping the data size before inserting into the CLT if we don't want TOAST.

~~~

2) Currently, parallel apply workers do not seem to insert conflicts into the CLT. The parallel worker logs the conflict to the logfile and then exits with an error without handling CLT insertion.

A small test to reproduce this with a 't1' table subscription using a CLT table:

-- on publisher
ALTER SYSTEM SET logical_decoding_work_mem = '64kB';
SELECT pg_reload_conf();

-- Create a conflict scenario on subscriber: pre-insert a row that will conflict
INSERT INTO t1 VALUES (99999, 11);

-- on publisher: big transaction that hits the conflict
BEGIN;
INSERT INTO t1 SELECT i, i FROM generate_series(1, 10000) i;
INSERT INTO t1 VALUES (99999, 99); -- this conflicts
COMMIT;

logfile:
ERROR: conflict detected on relation "public.t1": conflict=insert_exists
DETAIL: Could not apply remote change: remote row (99999, 99).
Key already exists in unique index "t1_pkey", modified locally in transaction 842 at 2026-05-21 21:10:51.497681+05:30: key (a)=(99999), local row (99999, 42).
...
ERROR: logical replication parallel apply worker exited due to error
CONTEXT: processing remote data for replication origin "pg_16398" during message type "INSERT" for replication target relation "public.t1" in transaction 720
    logical replication parallel apply worker
    processing remote data for replication origin "pg_16398" during message type "STREAM COMMIT" in transaction 720, finished at 0/01AC9758
LOG: subscription "sub1" has been disabled because of an error
ERROR: lost connection to the logical replication parallel apply worker
LOG: background worker "logical replication parallel worker" (PID 66271) exited with exit code 1

~~~

3) I think somewhere in patch-0005, the remote_tuple and replica_identity columns may have been swapped. The replica identity key seems to be written into the remote_tuple column, while the remote slot row is written into replica_identity, for example:

postgres=# select relname, conflict_type, remote_xid, remote_tuple, replica_identity from pg_conflict_log_for_subid_16398;
 relname |     conflict_type     | remote_xid | remote_tuple | replica_identity
---------+-----------------------+------------+--------------+------------------
 t1      | insert_exists         |        699 |              | {"a":3,"b":11}
 t1      | update_origin_differs |        700 | {"a":3}      | {"a":3,"b":111}
(2 rows)

--
Thanks,
Nisha
-
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-22T10:12:20Z
On Fri, May 22, 2026 at 10:21 AM Nisha Moond <nisha.moond412@gmail.com> wrote:
>
> On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Rest of the comments were fixed.
> > The attached v37 version patch has the changes for the same. Also
> > Peter's comments on the documentation patch from [1] and Shveta's
> > comments from [2] are addressed in the attached patch.
> >
>
> Here are few comments based on v37 testing:
>

Here are a few more review comments -

1) Patch-0001 + 0002:
In subscription.sql:

-- Verify the table OID for reap check
SELECT 'pg_conflict_log_for_subid_' || oid AS internal_tablename
FROM pg_subscription WHERE subname = 'regress_conflict_test1' \gset
SET client_min_messages = WARNING;
DROP SUBSCRIPTION regress_conflict_test1;
-- should return NULL, meaning the conflict log table was reaped via dependency
SELECT to_regclass(:'internal_tablename');

Here, internal_tablename becomes "pg_conflict_log_*" without the pg_conflict. schema prefix. So, "SELECT to_regclass(:'internal_tablename');" will always return NULL even if the table still exists in the pg_conflict schema, which skips the actual drop verification scenario.

Should we instead use:
"SELECT 'pg_conflict.pg_conflict_log_' || oid AS internal_tablename..."

~~~

For Patch-0007:

2)
@@ -2067,9 +2095,31 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, Archive *fout)
 static void
 selectDumpableTable(TableInfo *tbinfo, Archive *fout)
....
+ if (strcmp(tbinfo->dobj.namespace->dobj.name, "pg_conflict") == 0)
...
+  * Dump pg_conflict tables only during binary upgrade. The schema
+  * is assumed to already exist.
+  */
+ tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;
....
+ else
+     tbinfo->dobj.dump = DUMP_COMPONENT_NONE;
+ }
+

For conflict log tables during binary upgrade, we set:
tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION;

but then execution falls through to the later logic:
...
else
    tbinfo->dobj.dump = tbinfo->dobj.namespace->dobj.dump_contains;

which seems to overwrite the earlier 'dobj.dump' value. So it looks like DUMP_COMPONENT_DEFINITION may never actually survive here. Should we return from this block instead of continuing further?

3)
@@ -5656,6 +5757,11 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo)
 appendPQExpBufferStr(query, ");\n");

+ appendPQExpBuffer(query,
+                   "\n\nALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n",
+                   qsubname,
+                   subinfo->subconflictlogdest);
+

The above ALTER SUBSCRIPTION command seems to be dumped unconditionally for every subscription. Since the default value during subscription creation is already "subconflictlogdest = 'log' ", should we emit this command only when subconflictlogdest is non-NULL and not 'log'?

4)
+ if (PQgetisnull(res, i, i_sublogdestination))
+     subinfo[i].subconflictlogdest = NULL;
+ else
+     subinfo[i].subconflictlogdest =
+         pg_strdup(PQgetvalue(res, i, i_sublogdestination));
+
+ if (PQgetisnull(res, i, i_sublogdestination))
+     subinfo[i].subconflictlogdest = NULL;
+ else
+     subinfo[i].subconflictlogdest =
+         pg_strdup(PQgetvalue(res, i, i_sublogdestination));
+
 /* Decide whether we want to dump it */

Looks like the same if-else block is repeated twice here.

--
Thanks,
Nisha
-
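The condition proposed in comment 3) above, which also covers the "(null)" output reported earlier in the thread for a Postgres 18 source server, can be written as a small predicate. This is only a sketch: `should_dump_conflict_log_dest` is a hypothetical helper name, and the value strings 'log', 'table' and 'all' are taken from the discussion rather than from the patch source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true only when an explicit ALTER SUBSCRIPTION ... SET
 * (conflict_log_destination = ...) would carry information: the value is
 * known (non-NULL) and differs from the default 'log'.
 */
static bool
should_dump_conflict_log_dest(const char *subconflictlogdest)
{
    /* NULL models a source server that has no such column. */
    if (subconflictlogdest == NULL)
        return false;

    /* The default needs no extra command. */
    return strcmp(subconflictlogdest, "log") != 0;
}
```

A caller would wrap the appendPQExpBuffer() call shown in the diff in this check, so nothing is emitted for older servers or for subscriptions left at the default.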
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-23T06:10:40Z
On Wed, 20 May 2026 at 16:12, shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Rest of the comments were fixed. > > The attached v37 version patch has the changes for the same. Also > > Peter's comments on the documentation patch from [1] and Shveta's > > comments from [2] are addressed in the attached patch. > > > > [1] - https://www.postgresql.org/message-id/CAHut%2BPsrnU2BB1%2BM3c%2BDr5h62BLYfwBzhTg%3DBM7QtBoPwHYrKw%40mail.gmail.com > > [2] - https://www.postgresql.org/message-id/CAJpy0uCX53c40xopqmHtWSWBmh78BqhLVGXa88fU42eOi6w%2BLQ%40mail.gmail.com > > > > I have not yet looked at v37. But here are a few comments on v36-005, > 006. I have merged them and reviewed together. > > 1) > +#include "utils/fmgroids.h" > +#include "utils/json.h" > > conflict.c compiles without above inclusions. > > 2) > + bool log_dest_clt = false; > + bool log_dest_logfile; > > A better and more clear name would be log_dest_table instead of > log_dest_clt here. > > 3) > @@ -6069,6 +6049,8 @@ DisableSubscriptionAndExit(void) > */ > pgstat_report_subscription_error(MyLogicalRepWorker->subid); > > + ProcessPendingConflictLogTuple(); > > It does not look obvious as in why we are trying to process > conflict-tuple during disable-subscription? A comment will help here. > > > 4) > tuple_table_slot_to_indextup_json(): > > + indexDesc = index_open(indexid, NoLock); > + > + build_index_datums_from_slot(estate, localrel, slot, indexDesc, values, > + isnull); > + tupdesc = RelationGetDescr(indexDesc); > + > + /* Bless the tupdesc so it can be looked up by row_to_json. */ > + BlessTupleDesc(tupdesc); > > We get the index's relcache pointer and pass it directly to > BlessTupleDesc which internally changes it by assigning tdtypmod. Is > this intentional i.e. do we want to change the relcache entry of index > directly? Shouldn't we copy it (CreateTupleDescCopy) and then Bless > it? 
> > 5) > build_conflict_tupledesc() does 'CreateTemplateTupleDesc' and Bless it > each time the conflict is raised. Since the tuple-descriptor here is > not going to change, IMO, it will be better to create and bless it > once and reuse it everytime. We can have a 'static' TupleDesc here. > Thoughts? Thanks for the comments, these comments are addressed in the v38 version attached. Apart from this, the comments from [1], [2], [3], [4], [5], [6], [7], and [8] are also addressed. [1] - https://www.postgresql.org/message-id/CAJpy0uC43NTKheuLo%2BMsHG7Sfh-QWQM9QP-EVPL5LChiPfisJw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CANhcyEU8qr9%2BPMU2Kn0qqZakVptVvRsbRu3Ee2Q40YX9aivXww%40mail.gmail.com [3] - https://www.postgresql.org/message-id/CAJpy0uB19XxfF2Yj1w%3DC90iVBLMHb%3DDMBZ1h3rqzJhEbTSwtag%40mail.gmail.com [4] - https://www.postgresql.org/message-id/CAHut%2BPvSaJAYwNUS9GnO6MCTfuPpVLdU1r8cZBf6gjGjvnbWpQ%40mail.gmail.com [5] - https://www.postgresql.org/message-id/CAHut%2BPtUWTnUD8QpfmNpU8iU6Pg%2BE29nDALYAfMUudad8oYezw%40mail.gmail.com [6] - https://www.postgresql.org/message-id/CAHut%2BPvW%3DFd-OSM6oe-9D3ycAG0qLfGEnaT%3DBUB%2BPMeUFeEAyQ%40mail.gmail.com [7] - https://www.postgresql.org/message-id/CAHut%2BPu4ErbjstY86kWbKOepHn623Zp9MNiKW4DoMG3iVdG2fA%40mail.gmail.com [8] - https://www.postgresql.org/message-id/CANhcyEUGoaSpJKDJaQfrQR6%2B-4%2B_PgQ%3D0DmZZztPAEheMkMw7w%40mail.gmail.com Regards, Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-23T15:40:08Z
On Wed, May 20, 2026 at 11:01 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Amit, Vignesh,
>
> A part of 007 patch is about preserving subscription-oid. Another
> thread (origin migration) also needs the same logic as per discussion
> at [1]. And there was an old thread which already attempted preserving
> subscription-oid at [2], but the idea was rejected at that time. Why
> don't we attempt to resume the same thread ([2]) and implement
> preserving subscription-oid as a separate thread as we now have
> multiple dependencies on it? Thoughts?
>

Agreed, but I think we can move the discussion/review to a separate thread. However, at this stage, we can make initial patches ready and then move to it.

--
With Regards,
Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-25T01:36:42Z
Hi Vignesh, Some review comments for v38-0001/0002 combined. ====== src/backend/commands/subscriptioncmds.c AlterSubscriptionConflictLogDestination: 1. * Update the conflict log table associated with a subscription when its * conflict log destination is changed. Somehow, that 'its' sounded awkward to me. SUGGESTION. When the subscription's 'conflict_log_destination' is changed, update the conflict log table if required. ~~~ 2. + * If the new destination requires a conflict log table and none was previously + * required, this function validates an existing conflict log table identified + * by the subscription specific naming convention or creates a new one. What does this mean: "validates an existing conflict log table". How is there an "existing" CLT when you already said "none was previously required". Maybe this needs more explanation. If it is talking about "not already associated with another subscription", then it should just say that. Anyway, it seems validation that the comment claims this function is doing is not done here at all, but is really done by 'create_conflict_log_table'. ~~~ 3. +static bool +AlterSubscriptionConflictLogDestination(Subscription *sub, + ConflictLogDest logdest, + Oid *conflicttablerelid) 3a. There was no forward declaration of this static function, but there was for all the others. ~ 3b. Static functions should use snake-case names. ~~~ 4. Personally, I think it is more natural to read LEFT-TO-RIGHT, OLD-THEN-NEW, etc., so I felt that the has_oldtable check should always come before want_table. Also, the 'ifs' seemed tricky because it's not obvious what has/need_table combinations are missing. e.g. The following seems easier to me. And probably lots of comments could be moved to here in the code as well, instead of in the function comment. SUGGESTION if (has_old_table) { /* There is a CLT already. */ if (!want_table) { /* Remove it. 
*/ drop_subscription_dependencies(sub->oid, sub->name, sub->conflictlogrelid); update_relid = true; } } else { /* There was no previous CLT. */ if (want_table) { /* Create one. */ relid = create_conflict_log_table(sub->oid, sub->name, sub->owner); update_relid = true; } } ~~~ 5. +static void +drop_subscription_dependencies(Oid subid, char *subname, + Oid subconflictlogrelid) +{ + ObjectAddress object; + char *conflictrelname; + + conflictrelname = get_rel_name(subconflictlogrelid); + + /* + * By using PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the + * conflict log table is deleted while the subscription remains. + */ + ObjectAddressSet(object, SubscriptionRelationId, subid); + performDeletion(&object, DROP_CASCADE, + PERFORM_DELETION_INTERNAL | + PERFORM_DELETION_SKIP_ORIGINAL); + + ereport(NOTICE, + errmsg("dropped conflict log table \"%s\" for subscription \"%s\"", + get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname), + subname)); +} + One day, this function might do more than just remove the CLT, so IMO all this function body should be within a check: if (OidIsValid(subconflictlogrelid)) { /* Drop any dependent CLT */ ... } ~~~ DropSubscription 6. + if (OidIsValid(subconflictlogrelid)) + drop_subscription_dependencies(subid, subname, subconflictlogrelid); Make it unconditional. Instead, add the condition inside the 'drop_subscription_dependencies', per the previous review comment #5. ====== src/test/regress/sql/subscription.sql 7. +-- +-- PUBLICATION: Verify conflict log tables are not publishable +-- +-- pg_relation_is_publishable should return false for internal conflict log +-- tables to prevent them from being accidentally included in publications +-- Everywhere else, you had removed the word "internal", but this one (maybe others?) was missed. ====== Kind Regards, Peter Smith. Fujitsu Australia -
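As an illustration of review comment 4 above, the four has_old_table/want_table combinations can be written out as a small standalone decision function. This is only a sketch and not code from the patch: the ClogAction enum and both function names are hypothetical. The real code would call drop_subscription_dependencies() or create_conflict_log_table() rather than return a value.

```c
#include <stdbool.h>

/* Hypothetical action names, used only for this illustration. */
typedef enum ClogAction
{
    CLOG_NOTHING_TO_DO,     /* no table before, none wanted */
    CLOG_CREATE,            /* no table before, one is wanted now */
    CLOG_DROP,              /* a table existed, it is no longer wanted */
    CLOG_KEEP_EXISTING      /* a table existed and is still wanted */
} ClogAction;

/*
 * Check the old state first, then the new one (OLD-THEN-NEW), so that all
 * four combinations are visible in one place.
 */
ClogAction
decide_conflict_log_action(bool has_old_table, bool want_table)
{
    if (has_old_table)
    {
        /* There is a CLT already. */
        return want_table ? CLOG_KEEP_EXISTING : CLOG_DROP;
    }
    else
    {
        /* There was no previous CLT. */
        return want_table ? CLOG_CREATE : CLOG_NOTHING_TO_DO;
    }
}

/* Only create and drop change the relid stored for the subscription. */
bool
action_updates_relid(ClogAction action)
{
    return action == CLOG_CREATE || action == CLOG_DROP;
}
```

Written this way, the two combinations that need no work (table kept, or no table at all) are explicit instead of being implied by a missing branch.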
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-25T04:18:43Z
On Sat, May 23, 2026 at 9:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, May 20, 2026 at 11:01 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Amit, Vignesh, > > > > A part of 007 patch is about preserving subscription-oid. Another > > thread (origin migration) also needs the same logic as per discussion > > at [1]. And there was a old thread which already attempted preserving > > subscription-oid at [2], but the idea was rejected at that time. Why > > don't we attempt to resume the same thread ([2]) and implement > > preserving subscription-oid as a separate thread as we now have > > multiple dependencies on it? Thoughts? > > > > Agreed, but I think we can move the discussion/review to a separate > thread. However, at this stage, we can make initial patches ready and > then move to it. > Okay, works for me. thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-25T04:42:53Z
On Fri, 22 May 2026 at 15:42, Nisha Moond <nisha.moond412@gmail.com> wrote: > > Here are few more review comments - > 1) Patch-0001 + 0002: > In subscription.sql: > -- Verify the table OID for reap check > SELECT 'pg_conflict_log_for_subid_' || oid AS internal_tablename FROM > pg_subscription WHERE subname = 'regress_conflict_test1' \gset > SET client_min_messages = WARNING; > DROP SUBSCRIPTION regress_conflict_test1; > -- should return NULL, meaning the conflict log table was reaped via dependency > SELECT to_regclass(:'internal_tablename'); > > Here, internal_tablename becomes "pg_conflict_log_*" without the > pg_conflict. schema prefix. So, "SELECT > to_regclass(:'internal_tablename');" will always return NULL even if > the table still exists in the pg_conflict schema, which skips the > actual drop verification scenario. > Should we instead use: > "SELECT 'pg_conflict.pg_conflict_log_' || oid AS internal_tablename..." > ~~~ > > For Patch-0007: > 2) > @@ -2067,9 +2095,31 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, > Archive *fout) > static void > selectDumpableTable(TableInfo *tbinfo, Archive *fout) > .... > + if (strcmp(tbinfo->dobj.namespace->dobj.name, "pg_conflict") == 0) > ... > + * Dump pg_conflict tables only during binary upgrade. The schema > + * is assumed to already exist. > + */ > + tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION; > .... > + else > + tbinfo->dobj.dump = DUMP_COMPONENT_NONE; > + } > + > > For conflict log tables during binary upgrade, we set: > tbinfo->dobj.dump = DUMP_COMPONENT_DEFINITION; > > but then execution falls through to the later logic: > ... > else > tbinfo->dobj.dump = tbinfo->dobj.namespace->dobj.dump_contains; > > which seems to overwrite the earlier 'dobj.dump' value. So it looks > like DUMP_COMPONENT_DEFINITION may never actually survive here. > Should we return from this block instead of continuing further? 
> > 3) > @@ -5656,6 +5757,11 @@ dumpSubscription(Archive *fout, const > SubscriptionInfo *subinfo) > > appendPQExpBufferStr(query, ");\n"); > > + appendPQExpBuffer(query, > + "\n\nALTER SUBSCRIPTION %s SET (conflict_log_destination = %s);\n", > + qsubname, > + subinfo->subconflictlogdest); > + > > The above ALTER SUBSCRIPTION command seems to be dumped > unconditionally for every subscription. > Since the default value during subscription creation is already > "subconflictlogdest = 'log' ", should we emit this command only when > subconflictlogdest is non-NULL and not 'log'? > > 4) > + if (PQgetisnull(res, i, i_sublogdestination)) > + subinfo[i].subconflictlogdest = NULL; > + else > + subinfo[i].subconflictlogdest = > + pg_strdup(PQgetvalue(res, i, i_sublogdestination)); > + > + if (PQgetisnull(res, i, i_sublogdestination)) > + subinfo[i].subconflictlogdest = NULL; > + else > + subinfo[i].subconflictlogdest = > + pg_strdup(PQgetvalue(res, i, i_sublogdestination)); > + > /* Decide whether we want to dump it */ > > Looks like the same if-else block is repeated twice here. Thanks for the comments, the attached v39 version patch has the changes for the same. Regards, Vignesh
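The condition proposed in comment 3 above (emit the ALTER SUBSCRIPTION only for a non-NULL, non-default destination) can be sketched as a standalone predicate. This is a minimal sketch and not the pg_dump code: the function name is hypothetical, and it assumes, as stated in the comment, that 'log' is the default value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true only when the dumped subscription needs an explicit
 * "ALTER SUBSCRIPTION ... SET (conflict_log_destination = ...)".
 *
 * Assumption for this sketch: "log" is the default, so it need not be dumped.
 */
bool
should_dump_conflict_log_destination(const char *subconflictlogdest)
{
    /* NULL means the source server reported no value at all. */
    if (subconflictlogdest == NULL)
        return false;

    /* The default destination is restored implicitly by CREATE SUBSCRIPTION. */
    return strcmp(subconflictlogdest, "log") != 0;
}
```

With such a check, a subscription left at the default would produce no extra ALTER SUBSCRIPTION line in the dump, while 'table' or 'all' would.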
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-26T05:53:51Z
On Mon, 25 May 2026 at 07:07, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Vignesh, > > Some review comments for v38-0001/0002 combined. > > ====== > src/backend/commands/subscriptioncmds.c > > AlterSubscriptionConflictLogDestination: > > 1. > * Update the conflict log table associated with a subscription when its > * conflict log destination is changed. > > Somehow, that 'its' sounded awkward to me. > > SUGGESTION. > When the subscription's 'conflict_log_destination' is changed, update > the conflict log table if required. > > ~~~ > > 2. > + * If the new destination requires a conflict log table and none was previously > + * required, this function validates an existing conflict log table identified > + * by the subscription specific naming convention or creates a new one. > > What does this mean: "validates an existing conflict log table". How > is there an "existing" CLT when you already said "none was previously > required". Maybe this needs more explanation. If it is talking about > "not already associated with another subscription", then it should > just say that. > > Anyway, it seems validation that the comment claims this function is > doing is not done here at all, but is really done by > 'create_conflict_log_table'. > > ~~~ > > 3. > +static bool > +AlterSubscriptionConflictLogDestination(Subscription *sub, > + ConflictLogDest logdest, > + Oid *conflicttablerelid) > > 3a. > There was no forward declaration of this static function, but there > was for all the others. > > ~ > > 3b. > Static functions should use snake-case names. > > ~~~ > > 4. > Personally, I think it is more natural to read LEFT-TO-RIGHT, > OLD-THEN-NEW, etc., so I felt that the has_oldtable check should > always come before want_table. > > Also, the 'ifs' seemed tricky because it's not obvious what > has/need_table combinations are missing. e.g. The following seems > easier to me. 
And probably lots of comments could be moved to here in > the code as well, instead of in the function comment. > > SUGGESTION > if (has_old_table) > { > /* There is a CLT already. */ > > if (!want_table) > { > /* Remove it. */ > drop_subscription_dependencies(sub->oid, sub->name, sub->conflictlogrelid); > update_relid = true; > } > } > else > { > /* There was no previous CLT. */ > > if (want_table) > { > /* Create one. */ > relid = create_conflict_log_table(sub->oid, sub->name, sub->owner); > update_relid = true; > } > } > > ~~~ > > 5. > +static void > +drop_subscription_dependencies(Oid subid, char *subname, > + Oid subconflictlogrelid) > +{ > + ObjectAddress object; > + char *conflictrelname; > + > + conflictrelname = get_rel_name(subconflictlogrelid); > + > + /* > + * By using PERFORM_DELETION_SKIP_ORIGINAL, we ensure that only the > + * conflict log table is deleted while the subscription remains. > + */ > + ObjectAddressSet(object, SubscriptionRelationId, subid); > + performDeletion(&object, DROP_CASCADE, > + PERFORM_DELETION_INTERNAL | > + PERFORM_DELETION_SKIP_ORIGINAL); > + > + ereport(NOTICE, > + errmsg("dropped conflict log table \"%s\" for subscription \"%s\"", > + get_qualified_objname(PG_CONFLICT_NAMESPACE, conflictrelname), > + subname)); > +} > + > > One day, this function might do more than just remove the CLT, so IMO > all this function body should be within a check: > > if (OidIsValid(subconflictlogrelid)) > { > /* Drop any dependent CLT */ > ... > } > > ~~~ > > DropSubscription > > 6. > + if (OidIsValid(subconflictlogrelid)) > + drop_subscription_dependencies(subid, subname, subconflictlogrelid); > > Make it unconditional. Instead, add the condition inside the > 'drop_subscription_dependencies', per the previous review comment #5. > > ====== > src/test/regress/sql/subscription.sql > > 7. 
> +-- > +-- PUBLICATION: Verify conflict log tables are not publishable > +-- > +-- pg_relation_is_publishable should return false for internal conflict log > +-- tables to prevent them from being accidentally included in publications > +-- > > Everywhere else, you had removed the word "internal", but this one > (maybe others?) was missed. Thanks for the comments, these are addressed in the v40 version patch attached. Regards, Vignesh -
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-26T06:28:43Z
Hi Vignesh. I had only one trivial review comment for v40-0001/0002 combined. ====== src/backend/commands/subscriptioncmds.c 1. + if (OidIsValid(subconflictlogrelid)) + { + ObjectAddress object; + char *conflictrelname; + + /* Drop any dependent conflict log table */ + conflictrelname = get_rel_name(subconflictlogrelid); That "Drop any..." comment doesn't have anything to do with the statement that follows it. I think this comment belongs outside the if. e.g. /* Drop any dependent conflict log table */ if (OidIsValid(subconflictlogrelid)) { ... } ====== Kind Regards, Peter Smith. Fujitsu Australia -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-26T09:38:19Z
On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > Thanks for the comments, the attached v39 version patch has the > changes for the same. > I have not yet looked at v40, but please find a few comments on v39-0001 and 0002 merged together. 1) heap_create: + errdetail("Conflict schema modifications are currently disallowed."))); LookupCreationNamespace: + errmsg("cannot move objects into or out of the pg_conflict schema"))); Can we make it the same throughout: either we use 'Conflict schema' in both places or 'pg_conflict schema'? Since in these 2 functions the preceding messages use names like 'System catalog', 'TOAST schema' etc., I think we can use 'Conflict schema' in both places. What do others think about this? 2) drop_subscription_dependencies(): + conflictrelname = get_rel_name(subconflictlogrelid); We can actually have a sanity check that we got the CLT using the relid. Assert(conflictrelname != NULL); 3) + /* + * Special handling for the JSON array type for proper + * TupleDescInitEntry call. + */ + if (type_oid == JSONARRAYOID) + type_oid = get_array_type(JSONOID); Why do we have this special handling? Do we expect that 'type_oid' can be different from JSONARRAYOID if we use get_array_type? On debugging, I found it to be the same before and after get_array_type(). 4) Do we need to have CommandCounterIncrement() after heap_create_with_catalog() in create_conflict_log_table()? I think even if we are not doing any table_open etc. for the CLT in the same transaction, we should call CommandCounterIncrement() (to be consistent with other such calls of heap_create_with_catalog and to make it future proof). Thoughts? thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-26T22:50:23Z
On Tue, May 26, 2026 at 7:38 PM shveta malik <shveta.malik@gmail.com> wrote: ... > 1) > heap_create: > + errdetail("Conflict schema modifications are currently disallowed."))); > LookupCreationNamespace: > + errmsg("cannot move objects into or out of the pg_conflict schema"))); > > Can we make it same through-out, either we use 'Conflict schema' at > both the places or pg_conflict schema. Since in these 2 functions, in > previous messages, we are using names like 'System catalog', 'TOAST > schema' etc, I think we can use Conflict schema at both the places. > What do others think on this? > The suggested name of "Conflict schema" LGTM. My only concern was that a user may not know where that is referring to. OTOH, things like "System catalog" have 100s of mentions and whole documentation chapters dedicated to them. If we go with "Conflict schema", then the documentation needs to also consistently use that term, describe what it is for, and make it very easy to look up and discover that "Conflict schema" is 'pg_conflict'. Currently (in patches 0008/9) there is very little explanation even about what pg_conflict is, apart from just observing in passing that the CLT gets written to that "dedicated namespace". It seems a bit backwards describing the parent schema by the contents: Instead of saying when there is a CLT it gets written there, IMO it should be the other way around, and say there is a "Conflict schema" which is where the CLTs (if any) reside. ====== Kind Regards, Peter Smith. Fujitsu Australia -
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-27T04:09:01Z
On Wed, May 27, 2026 at 4:20 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Tue, May 26, 2026 at 7:38 PM shveta malik <shveta.malik@gmail.com> wrote: > ... > > 1) > > heap_create: > > + errdetail("Conflict schema modifications are currently disallowed."))); > > LookupCreationNamespace: > > + errmsg("cannot move objects into or out of the pg_conflict schema"))); > > > > Can we make it same through-out, either we use 'Conflict schema' at > > both the places or pg_conflict schema. Since in these 2 functions, in > > previous messages, we are using names like 'System catalog', 'TOAST > > schema' etc, I think we can use Conflict schema at both the places. > > What do others think on this? > > > > The suggested name of "Conflict schema" LGTM. My only concern was that > a user may not know where that is referring to. OTOH, things like > "System catalog" have 100s of mentions and whole documentation > chapters dedicated to them. If we go with "Conflict schema", then the > documentation needs to also consistently use that term, describe what > it is for, and make it very easy to look up and discover that > "Conflict schema" is 'pg_conflict'. I agree that if we use 'Conflict schema' in the error messages, we need to refer to it the same way in the docs. Let's wait for others' opinions on this too. > > Currently (in patches 0008/9) there is very little explanation even > about what pg_conflict is, apart from just observing in passing that > the CLT gets written to that "dedicated namespace". It seems a bit > backwards describing the parent schema by the contents: Instead of > saying when there is a CLT it gets written there, IMO it should be the > other way around, and say there is a "Conflict schema" which is where > the CLTs (if any) reside. Yes, the suggestion makes sense. I will look at the doc patch again for this. thanks Shveta -
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-27T08:34:37Z
On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Thanks for the comments, the attached v39 version patch has the > > changes for the same. > > > > I have not yet looked at v40, but please find a few ocmments on > v39-0001 and 0002 merged together. > 4) > Do we need to have CommandCounterIncrement() after > heap_create_with_catalog() in create_conflict_log_table()? I think > even if we are not doing any table_open etc for CLT in same > transaction, we should call CommandCounterIncrement() (to be > consistent with other such calls of heap_create_with_catalog and to > make it future proof). Thoughts? I felt this is not required as we are not doing a table open on the newly created table. I have fixed the rest of the comments. The attached v41 version patch has the changes for the same. Additionally the comments from [1] have also been fixed. [1] - https://www.postgresql.org/message-id/CAHut%2BPvB3rUs2ccUxJ1q1YEmvtHN3HJGSEjT4Cbc%3D5pjoGO9Yg%40mail.gmail.com Regards, Vignesh
-
Re: Proposal: Conflict log history table for Logical Replication
shveta malik <shveta.malik@gmail.com> — 2026-05-27T10:38:40Z
I have not yet looked at v41. Here are the comments for v40 0003 and 0004: No comments. 0004 and 0005: 1) In build_local_conflicts_json_array(), we have these: + json_datum = heap_copy_tuple_as_datum(tuple, tupdesc); + + /* + * Build the higher level JSON datum in format described in function + * header. + */ + json_datum = DirectFunctionCall1(row_to_json, json_datum); We make the first allocation for 'json_datum' via heap_copy_tuple_as_datum() and then a second one via the row_to_json() call, so the pointer to the first allocation is overwritten. Which memory context are we using here for these allocations? IIUC, if the conflict is a non-error one, we may accumulate these memory chunks in the long-running worker loop, which may gradually bloat the memory. Let me know if my understanding is wrong. The same situation exists in tuple_table_slot_to_indextup_json and tuple_table_slot_to_json_datum as well. 2) Same in ReportApplyConflict(): if elevel is not ERROR, should we worry about freeing 'err_detail' after error-reporting, or does some short-lived context handle it? thanks Shveta
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T21:26:47Z
On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > changes for the same. > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > v39-0001 and 0002 merged together. > > 4) > > Do we need to have CommandCounterIncrement() after > > heap_create_with_catalog() in create_conflict_log_table()? I think > > even if we are not doing any table_open etc for CLT in same > > transaction, we should call CommandCounterIncrement() (to be > > consistent with other such calls of heap_create_with_catalog and to > > make it future proof). Thoughts? > > I felt this is not required as we are not doing a table open on the > newly created table. > Okay, command counter increment would be required here if we further access that relation in the same command. I think I am facing a related problem w.r.t. a newly created subscription. After applying the first six patches, CREATE SUBSCRIPTION fails as follows: postgres=# create subscription sub1 connection 'dbname=postgres' publication pub1 with (conflict_log_destination='all'); ERROR: dependent subscription was concurrently dropped I debugged and found that we get the above ERROR when we are trying to find the subscription which is not yet created. In this case, it seems to be happening because we are using a subscription that is not yet created for dependency recording. This raises the question of why we are creating the conflict_log_table before the subscription; at the least, this needs some comments. * + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) + { + if (IsConflictLogTableClass(classForm)) + { + /* + * For conflict log tables, allow non-superusers to perform + * DELETE and TRUNCATE for cleanup and maintenance.
Also allow + * INSERT and UPDATE to pass ACL checks so that later checks + * can raise the dedicated "cannot modify or insert data into + * conflict log table" error instead of a generic permission + * denied error. Still restrict USAGE for non-superusers. + */ + mask &= ~(ACL_USAGE); I see the point of giving a specific error instead of a generic error but this functionality is used by pg_class_aclmask() which is an exposed function. If we go with your proposed change, isn't there a risk that some extension or outside core-code using pg_class_aclmask() won't invoke that later functionality (CheckValidResultRel())? If we decide to go this way then we can change this comment as proposed in the attached? -- With Regards, Amit Kapila. -
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T22:09:45Z
On Wed, May 27, 2026 at 3:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > 2) > Same in ReportApplyConflict(), if elevel is not ERROR, should we worry > about freeing 'err_detail' after error-reporting or does some > short-lived context handle it? > Isn't this the case even without this patch? If so, this can be investigated separately. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-27T23:08:11Z
On Tue, May 26, 2026 at 2:38 AM shveta malik <shveta.malik@gmail.com> wrote: > > 2) > drop_subscription_dependencies(): > > + conflictrelname = get_rel_name(subconflictlogrelid); > > We can actually have a sanity check that we got the CLT using the relid. > Assert(conflictrelname != NULL); > elog will suit this place better as this can't be a direct coding mistake. I see that at other places we used elog. See if (result == NULL) elog(ERROR, "cache lookup failed for index %u", indexId); -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2026-05-28T01:20:03Z
Hi Vignesh. Here are some review comments for the v41-0008/9 combined (docs) patch. ====== doc/src/sgml/ddl.sgml (5.11.6. The Conflict Schema) 1. + <para> + Similarly, the <literal>pg_conflict</literal> schema (sometimes referred to + as the <emphasis>conflict schema</emphasis>) contains system managed + conflict log tables used for logical replication conflict tracking. These + tables are created and maintained by the system and are not intended for + direct user manipulation. Unlike <literal>pg_catalog</literal>, the + <literal>pg_catalog</literal> schema is not implicitly included in the + search path, so objects within it must be referenced explicitly or by + adjusting the search path. + </para> 1a. /Similarly, the/The/ ~ 1b. IMO don't say "sometimes". Also, case. /conflict schema/Conflict schema/ ~ 1c. "conflict log tables" -- I think it will be helpful if this includes a link to "29.8.2. Table-based logging #". ~ 1d. "Unlike <literal>pg_catalog</literal>, the <literal>pg_catalog</literal> schema..." typo. That 2nd pg_catalog should say pg_conflict. ====== doc/src/sgml/glossary.sgml 2. + <glossentry id="glossary-conflict-schema"> + <glossterm>conflict schema</glossterm> + <glossdef> + <para> + The <literal>pg_conflict</literal> schema that contains system-managed + conflict log tables for logical replication. These tables are created + and maintained automatically by the system and are not intended for + direct user manipulation. See <xref linkend="ddl-schemas-conflict"/>. + </para> + </glossdef> + </glossentry> + case. /conflict schema/Conflict schema/ ====== doc/src/sgml/logical-replication.sgml (29.2. Subscription) 3. + automatically manages a dedicated <firstterm>conflict log table</firstterm>, + which is created an dropped along with the subscription. This significantly + improves post-mortem analysis and operational visibility of the replication + setup. typo. /created an dropped/created and dropped/ ~~~ (29.8.2. Table-based logging) 4. 
+ a dedicated conflict log table will be automatically created. This table is + created in the <literal>pg_conflict</literal> namespace. The name of the Instead of "<literal>pg_conflict</literal> namespace", this should now say "Conflict schema" and have a link to that new docs section. ====== doc/src/sgml/ref/create_subscription.sgml (Parameters - conflict_log_destination) 5. + named <literal>pg_conflict_log_for_subid_<subid></literal> + in the <literal>pg_conflict</literal> schema. This allows for easy Same as review comment #4. Instead of "<literal>pg_conflict</literal> schema", this should now say "Conflict schema" and have a link to that new docs section. ====== Kind Regards, Peter Smith. Fujitsu Australia
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-28T14:49:14Z
On Wed, 27 May 2026 at 14:04, vignesh C <vignesh21@gmail.com> wrote: > > > I have fixed the rest of the comments. The attached v41 version patch > has the changes for the same. Additionally the comments from [1] have > also been fixed. I was evaluating whether the existing pg_upgrade changes for conflict log tables can handle the addition of new columns in a future release. To validate this, I performed the following: Added two new columns to the conflict log table: v20_new_col1 TEXT v20_new_col2 TEXT These changes are present in patch '0001'. For adding new columns during binary upgrade, the following version-specific logic is required in 'pg_dump': ALTER TABLE pg_conflict.pg_conflict_log_for_subid_oid ADD COLUMN v20_new_col1 TEXT; ALTER TABLE pg_conflict.pg_conflict_log_for_subid_oid ADD COLUMN v20_new_col2 TEXT; These changes are included in patch '0001'. One important point here is that when 'ALTER TABLE ... ADD COLUMN' is run, the server does not rewrite existing rows on disk. Instead, it only updates the system catalog with the new column metadata. While selecting data from the table, the server handles this as follows: 1. Deform what is physically present - 'slot_deform_heap_tuple()' reads the raw tuple bytes from disk, but only up to 't_natts', which is the number of columns recorded in the tuple header at the time that row was inserted. It stops there because the tuple has no physical data for columns added later. 2. Fill in what is missing - After deforming the tuple, if the number of populated columns is still less than the number of columns requested by the query, it calls 'slot_getmissingattrs()' to cover the gap. Since the new columns were added with no default value, 'slot_getmissingattrs()' sets: tts_isnull[attnum] = true; This is how NULL is returned for the newly added columns in existing rows. These changes were tested on a new server with the v40 version patch + '0001' patch. 1. 
Pre-upgrade state using v40 version patches Simulated conflicts using a setup where the schema does not include the new columns: postgres=# select * from pg_conflict.pg_conflict_log_for_subid_16396 ; .... (4 rows) 2. Upgrade using 'pg_upgrade' The upgrade was performed on a cluster initialized with patches v40 + '0001', and it completed successfully. Post-upgrade verification: postgres=# select conflict_type, v20_new_col1, v20_new_col2 from pg_conflict.pg_conflict_log_for_subid_16396 ; conflict_type | v20_new_col1 | v20_new_col2 ---------------+--------------+-------------- insert_exists | | insert_exists | | insert_exists | | insert_exists | | (4 rows) Existing rows were preserved, and the newly added columns are visible and populated with NULLs, as expected. 3. Post-upgrade conflict insertion After starting the old publisher again to continue generating conflicts: postgres=# select conflict_type, v20_new_col1, v20_new_col2 from pg_conflict.pg_conflict_log_for_subid_16396 ; conflict_type | v20_new_col1 | v20_new_col2 ---------------+--------------+-------------- insert_exists | | insert_exists | | insert_exists | | insert_exists | | insert_exists | v20_new_col1 | v20_new_col2 insert_exists | v20_new_col1 | v20_new_col2 insert_exists | v20_new_col1 | v20_new_col2 (7 rows) New conflicts are inserted successfully, and the newly added columns are correctly populated for new entries. Based on this testing, the current 'pg_upgrade' framework, along with the additional dump-time adjustments, appears sufficient to support schema evolution of conflict log tables, specifically for adding new columns in future releases. Thoughts? Regards, Vignesh
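The two-step read path described above (deform what is physically stored, then fill in the missing attributes) can be modelled with a small standalone sketch. It is an illustration under simplifying assumptions, not the server code: StoredTuple, Slot, MAX_ATTS and read_tuple() are hypothetical names, values are plain strings, and only the "column added with no default" case is modelled, so every missing column becomes NULL.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTS 8

/* A row as written to disk: it remembers how many columns existed then. */
typedef struct StoredTuple
{
    int         natts;              /* column count at insert time */
    const char *values[MAX_ATTS];   /* NULL pointer means SQL NULL */
} StoredTuple;

/* The in-memory result handed to the query. */
typedef struct Slot
{
    const char *values[MAX_ATTS];
    bool        isnull[MAX_ATTS];
} Slot;

/*
 * Read 'tup' as a row of a relation that now has 'rel_natts' columns.
 */
void
read_tuple(const StoredTuple *tup, int rel_natts, Slot *slot)
{
    int         attnum;

    if (rel_natts > MAX_ATTS)
        rel_natts = MAX_ATTS;

    /* Step 1: deform only what is physically present in the tuple. */
    for (attnum = 0; attnum < tup->natts && attnum < rel_natts; attnum++)
    {
        slot->values[attnum] = tup->values[attnum];
        slot->isnull[attnum] = (tup->values[attnum] == NULL);
    }

    /* Step 2: columns added later have no stored data, so report NULL. */
    for (; attnum < rel_natts; attnum++)
    {
        slot->values[attnum] = NULL;
        slot->isnull[attnum] = true;
    }
}
```

Reading a two-column row through a four-column definition leaves the first two values untouched and marks the last two as NULL, which matches the empty v20_new_col1 and v20_new_col2 cells shown for the pre-upgrade rows.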
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-28T23:41:34Z
On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Rest of the comments were fixed. > > The attached v37 version patch has the changes for the same. Also > > Peter's comments on the documentation patch from [1] and Shveta's > > comments from [2] are addressed in the attached patch. > > > > Here are few comments based on v37 testing: > > 1) Should we consider using TOAST tables for tuple-data columns like > remote_tuple and local_conflicts (the JSON columns)? > This may be a corner case, but if the tuple data becomes too large to > fit into an 8KB heap tuple, then the apply worker keeps failing while > inserting into the CLT with errors like: > > ERROR: row is too big: size 19496, maximum size 8160 > LOG: background worker "logical replication apply worker" (PID > 41226) exited with exit code 1 > In the docs, it is mentioned: "column_value is the column value. The large column values are truncated to 64 bytes." [1], so I wonder: if we follow this, why do we need TOAST entries? Did you try a case where you actually get the above ERROR? > Noticed that even disable_on_error=true does not disable the > subscription in this case. > Hmm, I think we need a documented reason if such a case doesn't disable the subscription with disable_on_error set to true. [1]: https://www.postgresql.org/docs/devel/logical-replication-conflicts.html -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T09:22:58Z
On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > > changes for the same. > > > > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > > v39-0001 and 0002 merged together. > > > 4) > > > Do we need to have CommandCounterIncrement() after > > > heap_create_with_catalog() in create_conflict_log_table()? I think > > > even if we are not doing any table_open etc for CLT in same > > > transaction, we should call CommandCounterIncrement() (to be > > > consistent with other such calls of heap_create_with_catalog and to > > > make it future proof). Thoughts? > > > > I felt this is not required as we are not doing a table open on the > > newly created table. > > > > Okay, command counter increment would be required here if we further > access that relation in the same command. I think CommandCounterIncrement() is called wherever we need to open the relation in the same command. In this particular case we do not need to open the conflict log table so we do not need to call CCI I think I am facing a > related problem w.r.t newly created subscription. After applying first > six patches, the create subscription fails as follows: > postgres=# create subscription sub1 connection 'dbname=postgres' > publication pub1 with (conflict_log_destination='all'); > ERROR: dependent subscription was concurrently dropped > > I debugged and found that we get the above ERROR when we are trying to > find the subscription which is not yet created. In this case, it seems > to be happening because we are using a subscription that is yet not > created for dependency recording. 
This raises a question as to why are
> we creating the conflict_log_table before subscription, at least this
> needs some comments.

This error occurs because the commit below [1] disallowed recording a dependency on an object that does not exist. Therefore, we now need to record the dependency after the subscription is created. We create the CLT first so that we can store the conflict log relid in pg_subscription without an additional update. I will add a comment explaining this.

[1]
commit 2fbb21170e9053720c2c374b21eb650a22b8aaea
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed May 27 18:35:58 2026 +0300

    Avoid orphaned objects dependencies

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-29T18:47:49Z
On Fri, May 29, 2026 at 2:23 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I think I am facing a > > related problem w.r.t newly created subscription. After applying first > > six patches, the create subscription fails as follows: > > postgres=# create subscription sub1 connection 'dbname=postgres' > > publication pub1 with (conflict_log_destination='all'); > > ERROR: dependent subscription was concurrently dropped > > > > I debugged and found that we get the above ERROR when we are trying to > > find the subscription which is not yet created. In this case, it seems > > to be happening because we are using a subscription that is yet not > > created for dependency recording. This raises a question as to why are > > we creating the conflict_log_table before subscription, at least this > > needs some comments. > > This error occurs because in the commit below [1], we disallowed > recording a dependency on an object that does not exist. Therefore, we > now need to record the dependency after the subscription is created. > But don't we normally create dependency immediately after creating the object? Do you see such examples at other places in the code? > And we create CLT before so that we can add the conflict log relid in > pg_subscription without an additional update, > But will this additional update matter to an extent in DDL execution that we don't follow our usual way to record dependency? I feel unless we follow similar coding pattern at other places, it is better to create the CLT after subscription. -- With Regards, Amit Kapila.
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T21:54:39Z
On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Rest of the comments were fixed. > > > The attached v37 version patch has the changes for the same. Also > > > Peter's comments on the documentation patch from [1] and Shveta's > > > comments from [2] are addressed in the attached patch. > > > > > > > Here are few comments based on v37 testing: > > > > 1) Should we consider using TOAST tables for tuple-data columns like > > remote_tuple and local_conflicts (the JSON columns)? > > This may be a corner case, but if the tuple data becomes too large to > > fit into an 8KB heap tuple, then the apply worker keeps failing while > > inserting into the CLT with errors like: > > > > ERROR: row is too big: size 19496, maximum size 8160 > > LOG: background worker "logical replication apply worker" (PID > > 41226) exited with exit code 1 > > > > In the docs, it is mentioned: "column_value is the column value. The > large column values are truncated to 64 bytes." [1], so I wonder, if > we follow this why we need toast entries? Did you tried any case where > you are getting above ERROR? But in this case we are talking about the JSON column of the CLT which might contain a full local tuple or even multiple local tuples if a remote tuple conflicts with multiple local rows. So, IMHO, we need a toast table. Nisha, have you already tested the scenario? If yes, can you share your test case? -- Regards, Dilip Kumar Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-29T22:06:59Z
On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Rest of the comments were fixed. > > > > The attached v37 version patch has the changes for the same. Also > > > > Peter's comments on the documentation patch from [1] and Shveta's > > > > comments from [2] are addressed in the attached patch. > > > > > > > > > > Here are few comments based on v37 testing: > > > > > > 1) Should we consider using TOAST tables for tuple-data columns like > > > remote_tuple and local_conflicts (the JSON columns)? > > > This may be a corner case, but if the tuple data becomes too large to > > > fit into an 8KB heap tuple, then the apply worker keeps failing while > > > inserting into the CLT with errors like: > > > > > > ERROR: row is too big: size 19496, maximum size 8160 > > > LOG: background worker "logical replication apply worker" (PID > > > 41226) exited with exit code 1 > > > > > > > In the docs, it is mentioned: "column_value is the column value. The > > large column values are truncated to 64 bytes." [1], so I wonder, if > > we follow this why we need toast entries? Did you tried any case where > > you are getting above ERROR? > > But in this case we are talking about the JSON column of the CLT which > might contain a full local tuple or even multiple local tuples if a > remote tuple conflicts with multiple local rows. So, IMHO, we need a > toast table. Nisha, have you already tested the scenario? If yes, can > you share your test case? After putting more thought, I think instead of executing a three-step process i.e. 
inserting the pg_subscription tuple, creating the table with its dependency, and then going back to update the tuple with the new relation ID, it is much cleaner to do it linearly: create the conflict log table first to get its OID, insert the subscription tuple pre-populated with that ID, and then record the dependency. This achieves exactly the same state in a single direct sequence, without the redundant catalog update within the same command. I agree that with this approach we would have to keep the dependency-recording code in the CreateSubscription and AlterSubscription functions, but on further thought, those functions already record subscription dependencies on other objects, so wouldn't it be more natural to add this dependency in the same place as well?

Anyway, I am ready to change that if there is a strong opinion against this approach.

Here is the updated patch. The changes are:
1. 0003 and 0004 are merged into 0001
2. Merged Amit's v41_amit_1.patch.txt into 0002
3. Fixed the dependency order issue (i.e. create the dependency after inserting the subscription tuple) and merged it into 0002

Open Items:
1. Need to create a toast table for the CLT after testing with a larger JSON row
2. Fixed review comments of Shveta on 0004 and 0005
3. Rebase Vignesh's patch "v41-0007-Preserve-conflict-log-destination-and-subscripti". I think we can do that once we have consensus on whether to create the conflict log table first or insert the subscription row first, since we would have to rebase this patch again based on that change.
4. Once "v41-0007-Preserve-conflict-log-destination-and-subscripti" is rebased after the dependency order consensus, I will rebase Vignesh's doc patch and \dRs+ change patch.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-30T00:31:05Z
On Sat, May 30, 2026 at 3:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > Rest of the comments were fixed. > > > > > The attached v37 version patch has the changes for the same. Also > > > > > Peter's comments on the documentation patch from [1] and Shveta's > > > > > comments from [2] are addressed in the attached patch. > > > > > > > > > > > > > Here are few comments based on v37 testing: > > > > > > > > 1) Should we consider using TOAST tables for tuple-data columns like > > > > remote_tuple and local_conflicts (the JSON columns)? > > > > This may be a corner case, but if the tuple data becomes too large to > > > > fit into an 8KB heap tuple, then the apply worker keeps failing while > > > > inserting into the CLT with errors like: > > > > > > > > ERROR: row is too big: size 19496, maximum size 8160 > > > > LOG: background worker "logical replication apply worker" (PID > > > > 41226) exited with exit code 1 > > > > > > > > > > In the docs, it is mentioned: "column_value is the column value. The > > > large column values are truncated to 64 bytes." [1], so I wonder, if > > > we follow this why we need toast entries? Did you tried any case where > > > you are getting above ERROR? > > > > But in this case we are talking about the JSON column of the CLT which > > might contain a full local tuple or even multiple local tuples if a > > remote tuple conflicts with multiple local rows. So, IMHO, we need a > > toast table. Nisha, have you already tested the scenario? If yes, can > > you share your test case? > > After putting more thought, I think instead of executing a three-step > process i.e. 
inserting the pg_subscription tuple, creating the table > with its dependency, and then going back to update the tuple with the > new relation ID, it is much cleaner to do it linearly, i.e. we should > create the conflict log table first to get its OID, insert the > subscription tuple pre-populated with that ID, and then record the > dependency. This achieves the exact same state in a single direct > sequence without the redundant catalog update within the same command. > I agree with that code we would have to keep the record dependency > code in CreateSubscription and AlterSubscription functions, but after > putting more thought I think in thoese function we are already > recording subscription dependencies with other object so wouldn't it > be more natural to add this depednecy as well at the same place? > > Anyway I am ready to change that if we have strong opinion against > this approach. > > Here is the updated patch and changes are > 1. 0003 and 0004 are merged on 0001 > 2. Merged Amit's v41_amit_1.patch.txt to 0002 > 3. Fix the dependency order issue (i.e. create dependency after > inserting subscription tuple) and merged in 0002 > > Open Items: > 1. Need to create toast table for CLT after testing with larger JSON row > 2. Fixed review comments of Shveta on 0004 and 0005 > 3. Rebase Vignesh's patch of > "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we > can do that once we have concensus on whether to create conflict log > table first or insert the subscription row first as based on this > change we would have to rebase this patch again. > 4. Once we rebase > "v41-0007-Preserve-conflict-log-destination-and-subscripti" after > dependency order consensus I would rebase doc patch and \dRs+ change > patch of Vignesh. Here is a topup patch so create conflict log table after inserting subscription tuple and then update the tuple with clt relid.. 
Main changes will look like this [1]:

[1]
/*
 * If logging to a table is required, physically create it now. We create
 * the conflict log table here. Also update the pg_subscription row
 * after creating the conflict log table with its reloid.
 */
if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
{
    bool replaces[Natts_pg_subscription];
    Oid logrelid =
        create_conflict_log_table(subid, stmt->subname, owner);

    /* Form a new tuple. */
    memset(values, 0, sizeof(values));
    memset(nulls, false, sizeof(nulls));
    memset(replaces, false, sizeof(replaces));

    values[Anum_pg_subscription_subconflictlogrelid - 1] =
        ObjectIdGetDatum(logrelid);
    replaces[Anum_pg_subscription_subconflictlogrelid - 1] =
        true;

    /* Make subscription tuple visible before updating it. */
    CommandCounterIncrement();

    tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
                            replaces);

    CatalogTupleUpdate(rel, &tup->t_self, tup);
}

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Nisha Moond <nisha.moond412@gmail.com> — 2026-05-30T02:49:15Z
On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Rest of the comments were fixed. > > > > The attached v37 version patch has the changes for the same. Also > > > > Peter's comments on the documentation patch from [1] and Shveta's > > > > comments from [2] are addressed in the attached patch. > > > > > > > > > > Here are few comments based on v37 testing: > > > > > > 1) Should we consider using TOAST tables for tuple-data columns like > > > remote_tuple and local_conflicts (the JSON columns)? > > > This may be a corner case, but if the tuple data becomes too large to > > > fit into an 8KB heap tuple, then the apply worker keeps failing while > > > inserting into the CLT with errors like: > > > > > > ERROR: row is too big: size 19496, maximum size 8160 > > > LOG: background worker "logical replication apply worker" (PID > > > 41226) exited with exit code 1 > > > > > > > In the docs, it is mentioned: "column_value is the column value. The > > large column values are truncated to 64 bytes." [1], so I wonder, if > > we follow this why we need toast entries? Did you tried any case where > > you are getting above ERROR? > > But in this case we are talking about the JSON column of the CLT which > might contain a full local tuple or even multiple local tuples if a > remote tuple conflicts with multiple local rows. So, IMHO, we need a > toast table. Nisha, have you already tested the scenario? If yes, can > you share your test case? > Hi Dilip, Amit, Yes, I tested the scenario. 
I used the steps below to reproduce the error:

#Publisher:
CREATE TABLE fat2 (id int PRIMARY KEY, col1 text, col2 text);
INSERT INTO fat2 VALUES (
    1,
    (SELECT string_agg(md5(i::text), '') FROM generate_series(1, 200) i),
    (SELECT string_agg(md5(i::text), '') FROM generate_series(201, 400) i)
);
ALTER TABLE fat2 REPLICA IDENTITY FULL;
CREATE PUBLICATION p3 FOR TABLE fat2;

#Subscriber
-- create subscription s3 for publication p3 with conflict log table (after table syncs):
-- modifying the row locally
UPDATE fat2 SET col1 = (SELECT string_agg(md5(i::text), '') FROM generate_series(601, 800) i) WHERE id = 1;

#Publisher (triggers the conflict):
UPDATE fat2 SET col1 = (SELECT string_agg(md5(i::text), '') FROM generate_series(801, 1000) i) WHERE id = 1;

The above should cause the reported failure.

--
Thanks,
Nisha
-
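A rough back-of-the-envelope check of why this reproducer cannot fit in a single heap tuple: md5() returns 32 hexadecimal characters, so each string_agg over 200 values yields a 6400-byte text value. The sketch below only does that arithmetic. It assumes (my assumption, not something stated in the thread) that the conflict row carries full, uncompressed copies of both columns for one remote and one local tuple, and it ignores JSON keys, quoting, and tuple headers. The 8160 limit is the one quoted in the error message; the function names are made up for illustration, and the result is a lower bound, not an attempt to reproduce the 19496 bytes reported earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in characters of one md5() result in hexadecimal. */
#define MD5_HEX_LEN 32

/* Limit taken from the "row is too big" error quoted in this thread. */
#define MAX_ROW_SIZE 8160

/* Size of one text column built by string_agg(md5(...), '') over n values. */
size_t agg_md5_len(size_t n_values)
{
    assert(n_values <= SIZE_MAX / MD5_HEX_LEN);   /* guard against overflow */
    return n_values * MD5_HEX_LEN;
}

/*
 * Lower bound for the tuple data stored in one conflict row, assuming
 * (hypothetically) that it holds n_tuples full copies of a row with
 * n_cols such columns. JSON keys, quoting and headers are ignored.
 */
size_t conflict_payload_lower_bound(size_t n_tuples, size_t n_cols,
                                    size_t n_values)
{
    return n_tuples * n_cols * agg_md5_len(n_values);
}

/* Returns 1 if the payload alone exceeds the single-row limit. */
int needs_out_of_line_storage(size_t payload)
{
    return payload > MAX_ROW_SIZE;
}
```

With 200 values per column, one column is 6400 bytes and still fits, but one remote plus one local copy of both columns is at least 25600 bytes, well past the limit, which is consistent with the need for a toast table on the CLT.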
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-30T08:12:27Z
On Sat, May 30, 2026 at 6:01 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, May 30, 2026 at 3:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Fri, May 29, 2026 at 5:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Thu, May 21, 2026 at 9:51 PM Nisha Moond <nisha.moond412@gmail.com> wrote: > > > > > > > > > > On Wed, May 20, 2026 at 3:05 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > Rest of the comments were fixed. > > > > > > The attached v37 version patch has the changes for the same. Also > > > > > > Peter's comments on the documentation patch from [1] and Shveta's > > > > > > comments from [2] are addressed in the attached patch. > > > > > > > > > > > > > > > > Here are few comments based on v37 testing: > > > > > > > > > > 1) Should we consider using TOAST tables for tuple-data columns like > > > > > remote_tuple and local_conflicts (the JSON columns)? > > > > > This may be a corner case, but if the tuple data becomes too large to > > > > > fit into an 8KB heap tuple, then the apply worker keeps failing while > > > > > inserting into the CLT with errors like: > > > > > > > > > > ERROR: row is too big: size 19496, maximum size 8160 > > > > > LOG: background worker "logical replication apply worker" (PID > > > > > 41226) exited with exit code 1 > > > > > > > > > > > > > In the docs, it is mentioned: "column_value is the column value. The > > > > large column values are truncated to 64 bytes." [1], so I wonder, if > > > > we follow this why we need toast entries? Did you tried any case where > > > > you are getting above ERROR? > > > > > > But in this case we are talking about the JSON column of the CLT which > > > might contain a full local tuple or even multiple local tuples if a > > > remote tuple conflicts with multiple local rows. So, IMHO, we need a > > > toast table. 
Nisha, have you already tested the scenario? If yes, can > > > you share your test case? > > > > After putting more thought, I think instead of executing a three-step > > process i.e. inserting the pg_subscription tuple, creating the table > > with its dependency, and then going back to update the tuple with the > > new relation ID, it is much cleaner to do it linearly, i.e. we should > > create the conflict log table first to get its OID, insert the > > subscription tuple pre-populated with that ID, and then record the > > dependency. This achieves the exact same state in a single direct > > sequence without the redundant catalog update within the same command. > > I agree with that code we would have to keep the record dependency > > code in CreateSubscription and AlterSubscription functions, but after > > putting more thought I think in thoese function we are already > > recording subscription dependencies with other object so wouldn't it > > be more natural to add this depednecy as well at the same place? > > > > Anyway I am ready to change that if we have strong opinion against > > this approach. > > > > Here is the updated patch and changes are > > 1. 0003 and 0004 are merged on 0001 > > 2. Merged Amit's v41_amit_1.patch.txt to 0002 > > 3. Fix the dependency order issue (i.e. create dependency after > > inserting subscription tuple) and merged in 0002 > > > > Open Items: > > 1. Need to create toast table for CLT after testing with larger JSON row > > 2. Fixed review comments of Shveta on 0004 and 0005 > > 3. Rebase Vignesh's patch of > > "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we > > can do that once we have concensus on whether to create conflict log > > table first or insert the subscription row first as based on this > > change we would have to rebase this patch again. > > 4. 
Once we rebase
> > "v41-0007-Preserve-conflict-log-destination-and-subscripti" after
> > dependency order consensus I would rebase doc patch and \dRs+ change
> > patch of Vignesh.
>
> Here is a topup patch so create conflict log table after inserting
> subscription tuple and then update the tuple with clt relid..
>
> Main changes will look like this[1]
>
> [1]
> /*
> * If logging to a table is required, physically create it now. We create
> * the conflict log table here. Also update the pg_subscription row
> * after creating the conflict log table with its reloid.
> */
> if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
> {
> bool replaces[Natts_pg_subscription];
> Oid logrelid =
> create_conflict_log_table(subid, stmt->subname, owner);
>
> /* Form a new tuple. */
> memset(values, 0, sizeof(values));
> memset(nulls, false, sizeof(nulls));
> memset(replaces, false, sizeof(replaces));
>
> values[Anum_pg_subscription_subconflictlogrelid - 1] =
> ObjectIdGetDatum(logrelid);
> replaces[Anum_pg_subscription_subconflictlogrelid - 1] =
> true;
>
> /* Make subscription tuple visible before updating it. */
> CommandCounterIncrement();
>
> tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
> replaces);
>
> CatalogTupleUpdate(rel, &tup->t_self, tup);
> }
>

In the latest patch set I have fixed Nisha's comments by creating a toast table. A separate patch (v43-0005-Create-conflict-log-table-after-inserting-subscr.patch) is attached for creating the conflict log table after inserting the subscription row.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-30T18:21:10Z
On Fri, May 29, 2026 at 3:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, May 30, 2026 at 3:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> After putting more thought, I think instead of executing a three-step
> process i.e. inserting the pg_subscription tuple, creating the table
> with its dependency, and then going back to update the tuple with the
> new relation ID, it is much cleaner to do it linearly, i.e. we should
> create the conflict log table first to get its OID, insert the
> subscription tuple pre-populated with that ID, and then record the
> dependency. This achieves the exact same state in a single direct
> sequence without the redundant catalog update within the same command.
> I agree with that code we would have to keep the record dependency
> code in CreateSubscription and AlterSubscription functions, but after
> putting more thought I think in those function we are already
> recording subscription dependencies with other object so wouldn't it
> be more natural to add this dependency as well at the same place?
>

It makes sense to me, and in any case we also create the dependency for serverid after creating the subscription, so your solution looks good to me. One minor suggestion related to these changes:

+ if (CONFLICTS_LOGGED_TO_TABLE(opts.conflictlogdest))
+ {
+ ObjectAddress clt;
+
+ ObjectAddressSet(clt, RelationRelationId, logrelid);
+ recordDependencyOn(&clt, &myself, DEPENDENCY_INTERNAL);
+ }

Let's rename clt to cltaddr or cltobj to make it consistent with the naming used in other similar places in the code. Change this in both places where we use this code.

> Anyway I am ready to change that if we have strong opinion against
> this approach.
>
> Here is the updated patch and changes are
> 1. 0003 and 0004 are merged on 0001
> 2. Merged Amit's v41_amit_1.patch.txt to 0002
> 3. Fix the dependency order issue (i.e. create dependency after
> inserting subscription tuple) and merged in 0002
>
> Open Items:
> 1. Need to create toast table for CLT after testing with larger JSON row
> 2. Fixed review comments of Shveta on 0004 and 0005
> 3. Rebase Vignesh's patch of
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" I think we
> can do that once we have consensus on whether to create conflict log
> table first or insert the subscription row first as based on this
> change we would have to rebase this patch again.
> 4. Once we rebase
> "v41-0007-Preserve-conflict-log-destination-and-subscripti" after
> dependency order consensus I would rebase doc patch and \dRs+ change
> patch of Vignesh.
>

I see that my second comment in email [1] and another comment in email [2] are still unanswered, and they are not listed in the open items either.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BzdaLF7%3DAVKd8xNGTuvPvn8BYSxHfnLZd7whWZ%2Bv3B-Q%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1K6tVUmKY-yqKgTX00yrSVAdSZN4Ao761JEXdtQkAYT4g%40mail.gmail.com

--
With Regards,
Amit Kapila.
-
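The ordering question in this exchange can be illustrated with a toy catalog in which recording a dependency on an object that has not been inserted yet is refused, which is the behavior described for commit 2fbb21170e90. This is a standalone sketch and not PostgreSQL code: ToyCatalog, toy_insert_object, toy_record_dependency, and the OID values used with them are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OBJECTS 16

typedef unsigned int ToyOid;

/* A toy catalog: a flat list of existing object OIDs plus a dependency count. */
typedef struct ToyCatalog
{
    ToyOid objects[TOY_MAX_OBJECTS];
    size_t nobjects;
    size_t ndeps;
} ToyCatalog;

bool toy_object_exists(const ToyCatalog *cat, ToyOid oid)
{
    for (size_t i = 0; i < cat->nobjects; i++)
        if (cat->objects[i] == oid)
            return true;
    return false;
}

void toy_insert_object(ToyCatalog *cat, ToyOid oid)
{
    assert(cat->nobjects < TOY_MAX_OBJECTS);
    cat->objects[cat->nobjects++] = oid;
}

/* Refuses to record a dependency unless both objects already exist. */
bool toy_record_dependency(ToyCatalog *cat, ToyOid dependent, ToyOid referenced)
{
    if (!toy_object_exists(cat, dependent) || !toy_object_exists(cat, referenced))
        return false;
    cat->ndeps++;
    return true;
}

/* Linear order: create the table, insert the subscription, record the dependency. */
bool toy_create_subscription(ToyCatalog *cat, ToyOid subid, ToyOid logrelid)
{
    toy_insert_object(cat, logrelid);
    toy_insert_object(cat, subid);
    return toy_record_dependency(cat, logrelid, subid);
}

/* Failing order: the dependency is recorded before the subscription row exists. */
bool toy_create_subscription_early_dep(ToyCatalog *cat, ToyOid subid, ToyOid logrelid)
{
    bool ok;

    toy_insert_object(cat, logrelid);
    ok = toy_record_dependency(cat, logrelid, subid);
    toy_insert_object(cat, subid);
    return ok;
}
```

In this model the linear order succeeds with a single pass over the catalog, while the early-dependency order fails in the same way as the "dependent subscription was concurrently dropped" error reported above.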
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:02:51Z
On Thu, May 28, 2026 at 4:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 26, 2026 at 2:38 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > 2)
> > drop_subscription_dependencies():
> >
> > + conflictrelname = get_rel_name(subconflictlogrelid);
> >
> > We can actually have a sanity check that we got the CLT using the relid.
> > Assert(conflictrelname != NULL);
> >
>
> elog will suit this place better as this can't be a direct coding
> mistake. I see that at other places we used elog. See
> if (result == NULL)
> elog(ERROR, "cache lookup failed for index %u", indexId);

Yes, it makes sense to report this with elog. I will change this.

--
Regards,
Dilip Kumar
Google
-
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:12:54Z
On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > > changes for the same. > > > > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > > v39-0001 and 0002 merged together. > > > 4) > > > Do we need to have CommandCounterIncrement() after > > > heap_create_with_catalog() in create_conflict_log_table()? I think > > > even if we are not doing any table_open etc for CLT in same > > > transaction, we should call CommandCounterIncrement() (to be > > > consistent with other such calls of heap_create_with_catalog and to > > > make it future proof). Thoughts? > > > > I felt this is not required as we are not doing a table open on the > > newly created table. > > > > Okay, command counter increment would be required here if we further > access that relation in the same command. I think I am facing a > related problem w.r.t newly created subscription. After applying first > six patches, the create subscription fails as follows: > postgres=# create subscription sub1 connection 'dbname=postgres' > publication pub1 with (conflict_log_destination='all'); > ERROR: dependent subscription was concurrently dropped > > I debugged and found that we get the above ERROR when we are trying to > find the subscription which is not yet created. In this case, it seems > to be happening because we are using a subscription that is yet not > created for dependency recording. This raises a question as to why are > we creating the conflict_log_table before subscription, at least this > needs some comments. 
>
> *
> + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE))
> + {
> + if (IsConflictLogTableClass(classForm))
> + {
> + /*
> + * For conflict log tables, allow non-superusers to perform
> + * DELETE and TRUNCATE for cleanup and maintenance. Also allow
> + * INSERT and UPDATE to pass ACL checks so that later checks
> + * can raise the dedicated "cannot modify or insert data into
> + * conflict log table" error instead of a generic permission
> + * denied error. Still restrict USAGE for non-superusers.
> + */
> + mask &= ~(ACL_USAGE);
>
> I see the point of giving a specific error instead of a generic error
> but this functionality is used by pg_class_aclmask() which is an
> exposed function. If we go with your proposed change, isn't there a
> risk that some extension or outside core-code using pg_class_aclmask()
> won't invoke that later functionality (CheckValidResultRel())? If we
> decide to go this way then we can change this comment as proposed in
> the attached?

I do not understand this change. My original patch 0001 looks like the code below, which means we only allow ACL_TRUNCATE and ACL_DELETE for the conflict log table. What is the reason for changing this in 0002?

 if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) &&
- IsSystemClass(table_oid, classForm) &&
- classForm->relkind != RELKIND_VIEW &&
+ IsConflictClass(classForm) &&
 !superuser_arg(roleid))
- mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE);
+ else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) &&
+ IsSystemClass(table_oid, classForm) &&
+ classForm->relkind != RELKIND_VIEW &&
+ !superuser_arg(roleid))
+ mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE);

--
Regards,
Dilip Kumar
Google
-
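The difference between the two mask computations in this message can be seen in a small standalone model. This is only a sketch: the TOY_ACL_* bit values are hypothetical and do not match PostgreSQL's ACL_* constants, the system-catalog branch is left out, and toy_mask_strict and toy_mask_relaxed are names invented here for the 0001-style and 0002-style behavior as described in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real PostgreSQL ACL_* constants differ. */
#define TOY_ACL_INSERT   (1u << 0)
#define TOY_ACL_UPDATE   (1u << 1)
#define TOY_ACL_DELETE   (1u << 2)
#define TOY_ACL_TRUNCATE (1u << 3)
#define TOY_ACL_USAGE    (1u << 4)
#define TOY_ACL_ALL      (TOY_ACL_INSERT | TOY_ACL_UPDATE | TOY_ACL_DELETE | \
                          TOY_ACL_TRUNCATE | TOY_ACL_USAGE)

/*
 * 0001-style: for a conflict log table and a non-superuser, strip INSERT,
 * UPDATE and USAGE, leaving only DELETE and TRUNCATE.
 */
uint32_t toy_mask_strict(uint32_t mask, bool is_clt, bool is_superuser)
{
    assert((mask & ~TOY_ACL_ALL) == 0);   /* only modelled bits are allowed */
    if (is_clt && !is_superuser)
        mask &= ~(TOY_ACL_INSERT | TOY_ACL_UPDATE | TOY_ACL_USAGE);
    return mask;
}

/*
 * 0002-style: strip only USAGE and rely on a later check to reject
 * INSERT and UPDATE with a dedicated error message.
 */
uint32_t toy_mask_relaxed(uint32_t mask, bool is_clt, bool is_superuser)
{
    assert((mask & ~TOY_ACL_ALL) == 0);
    if (is_clt && !is_superuser)
        mask &= ~TOY_ACL_USAGE;
    return mask;
}
```

In the relaxed variant a caller that consults only the mask still sees INSERT and UPDATE as permitted, which is the risk raised in the quoted review for code that never reaches the later check.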
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T06:57:48Z
On Wed, May 27, 2026 at 4:08 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> I have not yet looked at v41. Here are the comments for v40
>
> 0003 and 0004: No comments.
>
> 0004 and 0005:
>
>
> 1)
> In build_local_conflicts_json_array(), we have these:
>
> + json_datum = heap_copy_tuple_as_datum(tuple, tupdesc);
> +
> + /*
> + * Build the higher level JSON datum in format described in function
> + * header.
> + */
> + json_datum = DirectFunctionCall1(row_to_json, json_datum);
>
> We have first allocation to 'json_datum' via
> heap_copy_tuple_as_datum() and then second via row_to_json() call. So
> we are overwriting first allocation. Which memory context are we using
> here for this allocation? IIUC, if the conflict is non-error one, we
> may accumulate these memory chunks in long running worker loop which
> may gradually bloat the memory. Let me know if my understanding is
> wrong.
>
> Same situation in tuple_table_slot_to_indextup_json and
> tuple_table_slot_to_json_datum as well.

IIUC, all of this memory is allocated under ApplyMessageContext, which is a temporary context that is reset on every logical replication message. That context exists precisely for temporary allocations made while processing a message, so these chunks are released once the message has been processed.

> 2)
> Same in ReportApplyConflict(), if elevel is not ERROR, should we worry
> about freeing 'err_detail' after error-reporting or does some
> short-lived context handle it?

The same is true for this as well.

--
Regards,
Dilip Kumar
Google
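The argument about a per-message context can be sketched with a toy bump allocator that is reset after each message. This is not PostgreSQL's MemoryContext API: ToyContext, toy_alloc, toy_reset, and the allocation sizes are hypothetical, and the returned pointers are never dereferenced, so alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 4096

/* A toy per-message context: a fixed buffer with a bump pointer. */
typedef struct ToyContext
{
    unsigned char buf[TOY_ARENA_SIZE];
    size_t used;
} ToyContext;

void *toy_alloc(ToyContext *cxt, size_t size)
{
    void *p;

    assert(size <= TOY_ARENA_SIZE - cxt->used);   /* arena must not overflow */
    p = cxt->buf + cxt->used;
    cxt->used += size;
    return p;
}

/* Releases everything allocated in the context at once. */
void toy_reset(ToyContext *cxt)
{
    cxt->used = 0;
}

/*
 * Handles one message. The first chunk becomes unreachable when the
 * pointer is overwritten, mirroring the json_datum pattern in the review.
 */
size_t toy_process_message(ToyContext *cxt)
{
    void *datum = toy_alloc(cxt, 128);   /* stands in for the tuple copy */

    datum = toy_alloc(cxt, 256);         /* stands in for the JSON form */
    (void) datum;
    return cxt->used;
}

/* Processes nmessages messages and returns the peak usage seen. */
size_t toy_apply_loop(ToyContext *cxt, int nmessages)
{
    size_t peak = 0;

    for (int i = 0; i < nmessages; i++)
    {
        size_t used = toy_process_message(cxt);

        if (used > peak)
            peak = used;
        toy_reset(cxt);                  /* reset after every message */
    }
    return peak;
}
```

Because the context is reset after every message, peak usage stays at the cost of a single message however long the loop runs; without the reset, the same loop would exhaust the arena after a handful of messages.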
-
Re: Proposal: Conflict log history table for Logical Replication
Amit Kapila <amit.kapila16@gmail.com> — 2026-05-31T11:54:08Z
On Sat, May 30, 2026 at 1:12 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>

A few comments on 0001 and 0002
===========================
1.
+ Oid subconflictlogrelid; /* Relid of the conflict log table. */

#ifdef CATALOG_VARLEN /* variable-length fields start here */

+ /*
+ * Strategy for logging replication conflicts:
+ * 'log' - server log only,
+ * 'table' - conflict log table only,
+ * 'all' - both log and table.
+ */
+ text subconflictlogdest BKI_FORCE_NOT_NULL;

'log' sounds redundant in the above two field names. I feel that naming them subconflictrelid and subconflictdest should be sufficient.

2. If you agree with the above, then let's make similar changes in other places in the patch. We can change alter_sub_conflictlogdestination to alter_sub_conflict_destination. Also, similar to AlterSubscription_refresh and AlterSubscription_refresh_seq, we can name this new function AlterSubscription_conflict_dest.

3. Now, let's consider whether we should change the option name to conflict_data_destination instead of conflict_log_destination. The reason I am asking to consider this change is that one of the option values is 'log', so it sounds a bit odd to name the option conflict_log_destination. If we change this, then we can consider changing the name of the enum ConflictLogDest as well.

Apart from the above, I have made some changes in the attached. Kindly review and see which of them can be incorporated in the next version.

--
With Regards,
Amit Kapila.
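For readers following the naming discussion, the three option values described in the quoted catalog comment ('log', 'table', 'all') could be modelled roughly as below. This is only a sketch: `DemoConflictDest` and the `demo_*` functions are hypothetical names, the definition of the patch's own `ConflictLogDest` enum is not shown in this thread, and the exact (case-sensitive) string comparison is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical destination flags; 'all' is simply both bits set. */
typedef enum DemoConflictDest
{
    DEMO_DEST_INVALID = 0,
    DEMO_DEST_LOG = 1 << 0,     /* server log only */
    DEMO_DEST_TABLE = 1 << 1,   /* conflict log table only */
    DEMO_DEST_ALL = DEMO_DEST_LOG | DEMO_DEST_TABLE
} DemoConflictDest;

/* Map an option string to a destination; unknown values are invalid. */
static DemoConflictDest
demo_parse_conflict_dest(const char *value)
{
    assert(value != NULL);

    if (strcmp(value, "log") == 0)
        return DEMO_DEST_LOG;
    if (strcmp(value, "table") == 0)
        return DEMO_DEST_TABLE;
    if (strcmp(value, "all") == 0)
        return DEMO_DEST_ALL;
    return DEMO_DEST_INVALID;
}

/* True when conflicts should be written to the server log. */
static bool
demo_logs_to_server_log(DemoConflictDest dest)
{
    return (dest & DEMO_DEST_LOG) != 0;
}

/* True when conflicts should be inserted into the conflict log table. */
static bool
demo_logs_to_table(DemoConflictDest dest)
{
    return (dest & DEMO_DEST_TABLE) != 0;
}
```

With this shape, 'log' is just one value among three, which is the background for the suggestion to drop 'log' from the option and field names.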
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2026-05-31T12:08:05Z
On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > > > changes for the same. > > > > > > > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > > > v39-0001 and 0002 merged together. > > > > 4) > > > > Do we need to have CommandCounterIncrement() after > > > > heap_create_with_catalog() in create_conflict_log_table()? I think > > > > even if we are not doing any table_open etc for CLT in same > > > > transaction, we should call CommandCounterIncrement() (to be > > > > consistent with other such calls of heap_create_with_catalog and to > > > > make it future proof). Thoughts? > > > > > > I felt this is not required as we are not doing a table open on the > > > newly created table. > > > > > > > Okay, command counter increment would be required here if we further > > access that relation in the same command. I think I am facing a > > related problem w.r.t newly created subscription. After applying first > > six patches, the create subscription fails as follows: > > postgres=# create subscription sub1 connection 'dbname=postgres' > > publication pub1 with (conflict_log_destination='all'); > > ERROR: dependent subscription was concurrently dropped > > > > I debugged and found that we get the above ERROR when we are trying to > > find the subscription which is not yet created. In this case, it seems > > to be happening because we are using a subscription that is yet not > > created for dependency recording. 
This raises a question as to why are > > we creating the conflict_log_table before subscription, at least this > > needs some comments. > > > > * > > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) > > + { > > + if (IsConflictLogTableClass(classForm)) > > + { > > + /* > > + * For conflict log tables, allow non-superusers to perform > > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow > > + * INSERT and UPDATE to pass ACL checks so that later checks > > + * can raise the dedicated "cannot modify or insert data into > > + * conflict log table" error instead of a generic permission > > + * denied error. Still restrict USAGE for non-superusers. > > + */ > > + mask &= ~(ACL_USAGE); > > > > I see the point of giving a specific error instead of a generic error > > but this functionality is used by pg_class_aclmask() which is an > > exposed function. If we go with your proposed change, isn't there a > > risk that some extension or outside core-code using pg_class_aclmask() > > won't invoke that later functionality (CheckValidResultRel())? If we > > decide to go this way then we can change this comment as proposed in > > the attached? > > I do not understand this change; my original patch 0001 has like this, > that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for > conflict log table, whats the reason for changing the same in 0002? 
> > if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | > ACL_USAGE)) && > - IsSystemClass(table_oid, classForm) && > - classForm->relkind != RELKIND_VIEW && > + IsConflictClass(classForm) && > !superuser_arg(roleid)) > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE); > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | > ACL_TRUNCATE | ACL_USAGE)) && > + IsSystemClass(table_oid, classForm) && > + classForm->relkind != RELKIND_VIEW && > + !superuser_arg(roleid)) > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); This was done to fix Shveta's comments from [1] to throw "cannot modify or insert data into conflict log table" instead of a generic permission denied error for the owner of the conflict log table. [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com Regards, Vignesh -
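The difference between the two masking variants discussed above can be sketched in standalone C. The bit values, the `DemoRel` struct, and both function names are hypothetical and do not match PostgreSQL's real ACL definitions. The first function follows the 0001 diff quoted above. The second models the 0002 hunk under the assumption that it applies to non-superusers only and otherwise falls through to the existing system-class rule, which the quoted fragment does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoAclBits;

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_ACL_INSERT    (1u << 0)
#define DEMO_ACL_UPDATE    (1u << 1)
#define DEMO_ACL_DELETE    (1u << 2)
#define DEMO_ACL_TRUNCATE  (1u << 3)
#define DEMO_ACL_USAGE     (1u << 4)
#define DEMO_MODIFY_BITS   (DEMO_ACL_INSERT | DEMO_ACL_UPDATE | \
                            DEMO_ACL_DELETE | DEMO_ACL_TRUNCATE | \
                            DEMO_ACL_USAGE)

/* Hypothetical stand-in for the properties read from the pg_class row. */
typedef struct DemoRel
{
    bool is_conflict_log_table;
    bool is_system_class;
    bool is_view;
} DemoRel;

/* 0001 variant: conflict log tables keep only DELETE and TRUNCATE. */
static DemoAclBits
demo_restrict_mask_0001(DemoAclBits mask, const DemoRel *rel, bool is_superuser)
{
    assert(rel != NULL);

    if (is_superuser || (mask & DEMO_MODIFY_BITS) == 0)
        return mask;
    if (rel->is_conflict_log_table)
        return mask & ~(DEMO_ACL_INSERT | DEMO_ACL_UPDATE | DEMO_ACL_USAGE);
    if (rel->is_system_class && !rel->is_view)
        return mask & ~DEMO_MODIFY_BITS;
    return mask;
}

/*
 * 0002 variant (assumed shape): only USAGE is stripped for conflict log
 * tables, so INSERT and UPDATE survive the ACL check and must be rejected
 * by a later, separate check.
 */
static DemoAclBits
demo_restrict_mask_0002(DemoAclBits mask, const DemoRel *rel, bool is_superuser)
{
    assert(rel != NULL);

    if (is_superuser || (mask & DEMO_MODIFY_BITS) == 0)
        return mask;
    if (rel->is_conflict_log_table)
        return mask & ~DEMO_ACL_USAGE;
    if (rel->is_system_class && !rel->is_view)
        return mask & ~DEMO_MODIFY_BITS;
    return mask;
}
```

Under the second variant, a caller that trusts only the returned mask sees INSERT and UPDATE as permitted on a conflict log table. That is the risk raised earlier in the thread for extensions calling pg_class_aclmask() without reaching the later check, and it is the price of the more specific error message.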
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T23:06:36Z
On Sun, May 31, 2026 at 5:38 PM vignesh C <vignesh21@gmail.com> wrote: > > On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > > > > changes for the same. > > > > > > > > > > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > > > > v39-0001 and 0002 merged together. > > > > > 4) > > > > > Do we need to have CommandCounterIncrement() after > > > > > heap_create_with_catalog() in create_conflict_log_table()? I think > > > > > even if we are not doing any table_open etc for CLT in same > > > > > transaction, we should call CommandCounterIncrement() (to be > > > > > consistent with other such calls of heap_create_with_catalog and to > > > > > make it future proof). Thoughts? > > > > > > > > I felt this is not required as we are not doing a table open on the > > > > newly created table. > > > > > > > > > > Okay, command counter increment would be required here if we further > > > access that relation in the same command. I think I am facing a > > > related problem w.r.t newly created subscription. After applying first > > > six patches, the create subscription fails as follows: > > > postgres=# create subscription sub1 connection 'dbname=postgres' > > > publication pub1 with (conflict_log_destination='all'); > > > ERROR: dependent subscription was concurrently dropped > > > > > > I debugged and found that we get the above ERROR when we are trying to > > > find the subscription which is not yet created. 
In this case, it seems > > > to be happening because we are using a subscription that is yet not > > > created for dependency recording. This raises a question as to why are > > > we creating the conflict_log_table before subscription, at least this > > > needs some comments. > > > > > > * > > > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) > > > + { > > > + if (IsConflictLogTableClass(classForm)) > > > + { > > > + /* > > > + * For conflict log tables, allow non-superusers to perform > > > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow > > > + * INSERT and UPDATE to pass ACL checks so that later checks > > > + * can raise the dedicated "cannot modify or insert data into > > > + * conflict log table" error instead of a generic permission > > > + * denied error. Still restrict USAGE for non-superusers. > > > + */ > > > + mask &= ~(ACL_USAGE); > > > > > > I see the point of giving a specific error instead of a generic error > > > but this functionality is used by pg_class_aclmask() which is an > > > exposed function. If we go with your proposed change, isn't there a > > > risk that some extension or outside core-code using pg_class_aclmask() > > > won't invoke that later functionality (CheckValidResultRel())? If we > > > decide to go this way then we can change this comment as proposed in > > > the attached? > > > > I do not understand this change; my original patch 0001 has like this, > > that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for > > conflict log table, whats the reason for changing the same in 0002? 
> > > > if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | > > ACL_USAGE)) && > > - IsSystemClass(table_oid, classForm) && > > - classForm->relkind != RELKIND_VIEW && > > + IsConflictClass(classForm) && > > !superuser_arg(roleid)) > > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE); > > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | > > ACL_TRUNCATE | ACL_USAGE)) && > > + IsSystemClass(table_oid, classForm) && > > + classForm->relkind != RELKIND_VIEW && > > + !superuser_arg(roleid)) > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > > This was done to fix Shveta's comments from [1] to throw "cannot > modify or insert data into conflict log table" instead of a generic > permission denied error for the owner of the conflict log table. > [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com Thanks for pointing that out. I will analyze this behavior and give my opinion. -- Regards, Dilip Kumar Google -
Re: Proposal: Conflict log history table for Logical Replication
Dilip Kumar <dilipbalaut@gmail.com> — 2026-05-31T23:23:47Z
On Mon, Jun 1, 2026 at 4:36 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, May 31, 2026 at 5:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Sun, 31 May 2026 at 11:43, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, May 28, 2026 at 2:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, May 27, 2026 at 1:34 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > On Tue, 26 May 2026 at 15:08, shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > On Mon, May 25, 2026 at 10:13 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > Thanks for the comments, the attached v39 version patch has the > > > > > > > changes for the same. > > > > > > > > > > > > > > > > > > > I have not yet looked at v40, but please find a few ocmments on > > > > > > v39-0001 and 0002 merged together. > > > > > > 4) > > > > > > Do we need to have CommandCounterIncrement() after > > > > > > heap_create_with_catalog() in create_conflict_log_table()? I think > > > > > > even if we are not doing any table_open etc for CLT in same > > > > > > transaction, we should call CommandCounterIncrement() (to be > > > > > > consistent with other such calls of heap_create_with_catalog and to > > > > > > make it future proof). Thoughts? > > > > > > > > > > I felt this is not required as we are not doing a table open on the > > > > > newly created table. > > > > > > > > > > > > > Okay, command counter increment would be required here if we further > > > > access that relation in the same command. I think I am facing a > > > > related problem w.r.t newly created subscription. 
After applying first > > > > six patches, the create subscription fails as follows: > > > > postgres=# create subscription sub1 connection 'dbname=postgres' > > > > publication pub1 with (conflict_log_destination='all'); > > > > ERROR: dependent subscription was concurrently dropped > > > > > > > > I debugged and found that we get the above ERROR when we are trying to > > > > find the subscription which is not yet created. In this case, it seems > > > > to be happening because we are using a subscription that is yet not > > > > created for dependency recording. This raises a question as to why are > > > > we creating the conflict_log_table before subscription, at least this > > > > needs some comments. > > > > > > > > * > > > > + if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE)) > > > > + { > > > > + if (IsConflictLogTableClass(classForm)) > > > > + { > > > > + /* > > > > + * For conflict log tables, allow non-superusers to perform > > > > + * DELETE and TRUNCATE for cleanup and maintenance. Also allow > > > > + * INSERT and UPDATE to pass ACL checks so that later checks > > > > + * can raise the dedicated "cannot modify or insert data into > > > > + * conflict log table" error instead of a generic permission > > > > + * denied error. Still restrict USAGE for non-superusers. > > > > + */ > > > > + mask &= ~(ACL_USAGE); > > > > > > > > I see the point of giving a specific error instead of a generic error > > > > but this functionality is used by pg_class_aclmask() which is an > > > > exposed function. If we go with your proposed change, isn't there a > > > > risk that some extension or outside core-code using pg_class_aclmask() > > > > won't invoke that later functionality (CheckValidResultRel())? If we > > > > decide to go this way then we can change this comment as proposed in > > > > the attached? 
> > > > > > I do not understand this change; my original patch 0001 has like this, > > > that mean we are only allowing ACL_TRUNCATE and ACL_DELETE for > > > conflict log table, whats the reason for changing the same in 0002? > > > > > > if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | > > > ACL_USAGE)) && > > > - IsSystemClass(table_oid, classForm) && > > > - classForm->relkind != RELKIND_VIEW && > > > + IsConflictClass(classForm) && > > > !superuser_arg(roleid)) > > > - mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_USAGE); > > > + else if ((mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE | > > > ACL_TRUNCATE | ACL_USAGE)) && > > > + IsSystemClass(table_oid, classForm) && > > > + classForm->relkind != RELKIND_VIEW && > > > + !superuser_arg(roleid)) > > > + mask &= ~(ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | ACL_USAGE); > > > > This was done to fix Shveta's comments from [1] to throw "cannot > > modify or insert data into conflict log table" instead of a generic > > permission denied error for the owner of the conflict log table. > > [1] - https://www.postgresql.org/message-id/CAJpy0uANkzTyUjO2W0=RtaJCGg=VYcwLGGCpqax=zKJgNbB0Hw@mail.gmail.com > > Thanks for pointing it, I will analyze this behavior and give my opinion. While thinking more about this: wouldn't the behaviour be the same as for a pg_toast table? I mean, the superuser will get "cannot change TOAST relation "pg_toast_16404"", whereas the owner of the TOAST table will get a permission denied error. -- Regards, Dilip Kumar Google