Re: logical decoding and replication of sequences, take 2
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: John Naylor <john.naylor@enterprisedb.com>,
vignesh C <vignesh21@gmail.com>, Andres Freund <andres@anarazel.de>,
Robert Haas <robertmhaas@gmail.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: 2023-03-20T08:19:41Z
Lists: pgsql-hackers
Commits
- Migrate logical slots to the new node during an upgrade.
  - 29d0a77fa660 17.0 cited
- Make test_decoding ddl.out shorter
  - d6677b93c79b 17.0 landed
  - c5c5832600e9 14.9 landed
  - b1dc946eee3d 16.0 landed
  - 3bb8b9342f8a 15.4 landed
- Fix snapshot handling in logicalmsg_decode
  - 949ac32e1267 15.3 landed
  - 8b9cbd42b61f 14.8 landed
  - 4df581fa0f4b 13.11 landed
  - 497f863f0598 12.15 landed
  - 8de91ebf2ac1 11.20 landed
  - 7fe1aa991b62 16.0 landed
- doc: Adjust a few more references to "postmaster"
  - 17e72ec45d31 16.0 cited
- Revert "Logical decoding of sequences"
  - 2c7ea57e56ca 15.0 cited

On 3/20/23 04:42, Amit Kapila wrote:
> On Sat, Mar 18, 2023 at 8:49 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 3/18/23 06:35, Amit Kapila wrote:
>>> On Sat, Mar 18, 2023 at 3:13 AM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>> ...
>>>>
>>>> Clearly, for sequences we can't quite rely on snapshots/slots, we need
>>>> to get the LSN to decide what changes to apply/skip from somewhere else.
>>>> I wonder if we can just ignore the queued changes in tablesync, but I
>>>> guess not - there can be queued increments after reading the sequence
>>>> state, and we need to apply those. But maybe we could use the page LSN
>>>> from the relfilenode - that should be the LSN of the last WAL record.
>>>>
>>>> Or maybe we could simply add pg_current_wal_insert_lsn() into the SQL we
>>>> use to read the sequence state ...
>>>>
>>>
>>> What if some Alter Sequence is performed before the copy starts and
>>> after the copy is finished, the containing transaction rolled back?
>>> Won't it copy something which shouldn't have been copied?
>>>
>>
>> That shouldn't be possible - the alter creates a new relfilenode and
>> it's invisible until commit. So either it gets committed (and then
>> replicated), or it remains invisible to the SELECT during sync.
>>
>
> Okay, however, we need to ensure that such a change will later be
> replicated and also need to ensure that the required WAL doesn't get
> removed.
>
> Say, if we use your first idea of page LSN from the relfilenode, then
> how do we ensure that the corresponding WAL doesn't get removed when
> later the sync worker tries to start replication from that LSN? I am
> imagining here the sync_sequence_slot will be created before
> copy_sequence but even then it is possible that the sequence has not
> been updated for a long time and the LSN location will be in the past
> (as compared to the slot's LSN) which means the corresponding WAL
> could be removed. Now, here we can't directly start using the slot's
> LSN to stream changes because there is no correlation of it with the
> LSN (page LSN of sequence's relfilnode) where we want to start
> streaming.
>

I don't understand why we'd need WAL from before the slot is created.
The slot is created before copy_sequence, so the sync will see a more
recent state (reflecting all changes up to the slot LSN).

I think the only "issue" is the WAL records after the slot LSN, or more
precisely deciding which of the decoded changes to apply.

> Now, for the second idea which is to directly use
> pg_current_wal_insert_lsn(), I think we won't be able to ensure that
> the changes covered by in-progress transactions like the one with
> Alter Sequence I have given example would be streamed later after the
> initial copy. Because the LSN returned by pg_current_wal_insert_lsn()
> could be an LSN after the LSN associated with Alter Sequence but
> before the corresponding xact's commit.

Yeah, I think you're right - the locking itself is not sufficient to
prevent this ordering of operations. copy_sequence would have to lock
the sequence exclusively, which seems a bit disruptive.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
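
To make the ordering problem with pg_current_wal_insert_lsn() concrete,
here is a toy Python model of the apply/skip decision. It is only a
sketch under stated assumptions, not PostgreSQL code: the Change class,
both helper functions and all LSN values are hypothetical, and it
assumes the subscriber would filter decoded changes by the LSN of each
change's own WAL record.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Change:
    """A decoded sequence change (hypothetical model, not a PostgreSQL struct)."""
    record_lsn: int   # LSN of the WAL record for the change itself
    commit_lsn: int   # LSN at which the containing transaction committed
    description: str


def changes_to_apply(decoded: List[Change], sync_lsn: int) -> List[Change]:
    """Keep changes whose own WAL record lies after the sync LSN.

    Assumption: the subscriber filters on record_lsn (the naive rule).
    """
    return [c for c in decoded if c.record_lsn > sync_lsn]


def missed_changes(decoded: List[Change], sync_lsn: int) -> List[Change]:
    """Changes that were invisible to the copy (committed after sync_lsn)
    yet are skipped by the naive filter (record written at or before it)."""
    return [c for c in decoded if c.record_lsn <= sync_lsn < c.commit_lsn]


# Made-up LSNs: ALTER SEQUENCE is logged at 100, the copy reads the sequence
# state together with the insert LSN at 150, and the transaction commits at 200.
# The increment is modeled as non-transactional, so both of its LSNs coincide.
decoded = [
    Change(record_lsn=100, commit_lsn=200, description="ALTER SEQUENCE"),
    Change(record_lsn=180, commit_lsn=180, description="nextval increment"),
]
sync_lsn = 150

applied = changes_to_apply(decoded, sync_lsn)
missed = missed_changes(decoded, sync_lsn)
print([c.description for c in applied])
print([c.description for c in missed])
```

With these made-up values the filter keeps the increment but drops the
ALTER SEQUENCE: the copy could not see it (uncommitted at LSN 150), and
the filter skips it (logged at LSN 100). That is the ordering Amit
describes.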