Thread
-
Re: Proposal: Conflict log history table for Logical Replication
vignesh C <vignesh21@gmail.com> — 2025-12-26T15:27:57Z
On Thu, 25 Dec 2025 at 13:10, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Dec 24, 2025 at 4:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Dec 23, 2025 at 5:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Dec 23, 2025 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Tue, Dec 23, 2025 at 10:55 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > On Mon, Dec 22, 2025 at 9:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > On Mon, Dec 22, 2025 at 3:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > > > > > I think this needs more thought, others can be fixed. > > > > > > > > > > > > > 2) > > > > > > > postgres=# drop schema shveta cascade; > > > > > > > NOTICE: drop cascades to subscription sub1 > > > > > > > ERROR: global objects cannot be deleted by doDeletion > > > > > > > > > > > > > > Is this expected? Is the user supposed to see this error? > > > > > > > > > > > > > See below code, so this says if the object being dropped is the > > > > > > outermost object (i.e. if we are dropping the table directly) then it > > > > > > will disallow dropping the object on which it has INTERNAL DEPENDENCY, > > > > > > OTOH if the object is being dropped via recursive drop (i.e. the table > > > > > > is being dropped while dropping the schema) then object on which it > > > > > > has INTERNAL dependency will also be added to the deletion list and > > > > > > later will be dropped via doDeletion and later we are getting error as > > > > > > subscription is a global object. I thought maybe we can handle an > > > > > > additional case that the INTERNAL DEPENDENCY, is on subscription the > > > > > > disallow dropping it irrespective of whether it is being called > > > > > > directly or via recursive drop but then it will give an issue even > > > > > > when we are trying to drop table during subscription drop, we can make > > > > > > handle this case as well via 'flags' passed in findDependentObjects() > > > > > > but need more investigation. > > > > > > > > > > > > Seeing this complexity makes me think more on is it really worth it to > > > > > > maintain this dependency? Because during subscription drop we anyway > > > > > > have to call performDeletion externally because this dependency is > > > > > > local so we are just disallowing the conflict table drop, however the > > > > > > ALTER table is allowed so what we are really protecting by protecting > > > > > > the table drop, I think it can be just documented that if user try to > > > > > > drop the table then conflict will not be inserted anymore? > > > > > > > > > > > > findDependentObjects() > > > > > > { > > > > > > ... > > > > > > switch (foundDep->deptype) > > > > > > { > > > > > > .... > > > > > > case DEPENDENCY_INTERNAL: > > > > > > * 1. At the outermost recursion level, we must disallow the > > > > > > * DROP. However, if the owning object is listed in > > > > > > * pendingObjects, just release the caller's lock and return; > > > > > > * we'll eventually complete the DROP when we reach that entry > > > > > > * in the pending list. > > > > > > } > > > > > > } > > > > > > > > > > > > [1] > > > > > > postgres[1333899]=# select * from pg_depend where objid > 16410; > > > > > > classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype > > > > > > ---------+-------+----------+------------+----------+-------------+--------- > > > > > > 1259 | 16420 | 0 | 2615 | 16410 | 0 | n > > > > > > 1259 | 16420 | 0 | 6100 | 16419 | 0 | i > > > > > > (4 rows) > > > > > > > > > > > > 16420 -> conflict_log_table_16419 > > > > > > 16419 -> subscription > > > > > > 16410 -> schema s1 > > > > > > > > > > > > > > > > One approach could be to use something similar to > > > > > PERFORM_DELETION_SKIP_EXTENSIONS in our case, but only for recursive > > > > > drops. The effect would be that 'DROP SCHEMA ... CASCADE' would > > > > > proceed without error, i.e., it would drop the tables as well without > > > > > including the subscription in the dependency list. But if we try to > > > > > drop a table directly (e.g., DROP TABLE CLT), it will still result in: > > > > > ERROR: cannot drop table because subscription sub1 requires it > > > > > > > > > > > > > I think this way of allowing dropping the conflict table without > > > > caring for the parent object (subscription) is not a good idea. How > > > > about creating a dedicated schema, say pg_conflict for the purpose of > > > > storing conflict tables? This will be similar to the pg_toast schema > > > > for toast tables. So, similar to that each database will have a > > > > pg_conflict schema. It prevents the "orphan" problem where a user > > > > accidentally drops the logging schema but the Subscription is still > > > > trying to write to it. pg_dump needs to ignore all system schemas > > > > EXCEPT pg_conflict. This ensures the history is preserved during > > > > migrations while still protecting the tables from accidental user > > > > deletion. About permissions, I think we need to set the schema > > > > permissions so that USAGE is public (so users can SELECT from their > > > > logs) but CREATE is restricted to the superuser/subscription owner. We > > > > may need to think some more about permissions. > > > > > > > > I also tried to reason out if we can allow storing the conflict table > > > > in pg_catalog but here are a few reasons why it won't be a good idea. > > > > I think by default, pg_dump completely ignores the pg_catalog schema. > > > > It assumes pg_catalog contains static system definitions (like > > > > pg_class, pg_proc, etc.) that are re-generated by the initdb process, > > > > not user data. If we place a conflict table in pg_catalog, it will not > > > > be backed up. If a user runs pg_dump/all to migrate to a new server, > > > > their subscription definition will survive, but their entire history > > > > of conflict logs will vanish. Also from the permissions angle, If a > > > > user wants to write a custom PL/pgSQL function to "retry" conflicts, > > > > they might need to DELETE rows from the conflict table after fixing > > > > them. Granting DELETE permissions on a table inside pg_catalog is > > > > non-standard and often frowned upon by security auditors. It blurs the > > > > line between "System Internals" (immutable) and "User Data" (mutable). > > > > So, in short a separate pg_conflict schema appears to be a better solution. > > > > > > Yeah that makes sense. Although I haven't thought about all cases > > > whether it can be a problem anywhere, but meanwhile I tried > > > prototyping with this and it behaves what we want. > > > > > > postgres[1651968]=# select * from pg_conflict.conflict_log_table_16406 ; > > > relid | schemaname | relname | conflict_type | remote_xid | > > > remote_commit_lsn | remote_commit_ts | remote_origin | > > > replica_identity | remote_tuple > > > | > > > local_conflicts > > > -------+------------+---------+-----------------------+------------+-------------------+-------------------------------+---------------+------------------+---------------- > > > +------------------------------------------------------------------------------------------------------------------------------------ > > > 16385 | public | test | update_origin_differs | 761 | > > > 0/01760BD8 | 2025-12-23 11:08:30.583816+00 | pg_16406 | > > > {"a":1} | {"a":1,"b":20} > > > | {"{\"xid\":\"772\",\"commit_ts\":\"2025-12-23T11:08:25.568561+00:00\",\"origin\":null,\"key\":null,\"tuple\":{\"a\":1,\"b\":10}}"} > > > (1 row) > > > > > > -- Case1: Alter is not allowed > > > postgres[1651968]=# ALTER TABLE pg_conflict.conflict_log_table_16406 > > > ADD COLUMN a int; > > > ERROR: 42501: permission denied: "conflict_log_table_16406" is a system catalog > > > LOCATION: RangeVarCallbackForAlterRelation, tablecmds.c:19634 > > > > > > > How was this achieved? Did you modify IsSystemClass to behave > > similarly to IsToastClass? > > Right > > > I tried to analyze whether there are alternative approaches. The > > possible options I see are: > > > > 1) > > heap_create_with_catalog() provides the boolean argument use_user_acl, > > which is meant to apply user-defined default privileges. In theory, we > > could predefine default ACLs for our schema and then invoke > > heap_create_with_catalog() with use_user_acl = true. But it’s not > > clear how to do this purely from internal code. We would need to mimic > > or reuse the logic behind SetDefaultACLsInSchemas. > > 2) > > Another option is to create the table using heap_create_with_catalog() > > with use_user_acl = false, and then explicitly update pg_class.relacl > > for that table, similar to what ExecGrant_Relation does when > > processing GRANT/REVOKE. But I couldn’t find any existing internal > > code paths (outside of the GRANT/REVOKE implementation itself) that do > > this kind of post-creation ACL manipulation. > > I haven't analyzed this options, I will do that but not before Jan 3rd > as I will be away from my laptop for a week. > > > So overall, I feel changing IsSystemClass is the simpler way right > > now. To set ACL before/after/during heap_create_with_catalog is a > > tricky thing, at-least I could not find an easier way to do this, > > unless I have missed something. > > Thoughts on possible approaches? > > Here is the patches I have changed by using IsSystemClass(), based on > this many other things changed like we don't need to check for the > temp schema and also the caller of create_conflict_log_table() now > don't need to find the creation schema so it don't need to generate > the relname so that part is also moved within > create_conflict_log_table(). Fixed most of the comments given by > Peter and Shveta, although some of them are still open e.g. the name > of the conflict log table as of now I have kept as > conflict_log_table_<subid> other options are > > 1. pg_conflict_<subid> > 2. conflict_log_<subid> > 3. sub_conflict_log_<subid> > > I prefer 3, considering it says this table holds subscription conflict > logs. Thoughts? > > Vignesh, your patches have to be rebased on the new version. Here is a rebased version of the remaining patches. Regards, Vignesh