Re: logical decoding and replication of sequences, take 2
Amit Kapila <amit.kapila16@gmail.com>
From: Amit Kapila <amit.kapila16@gmail.com>
To: Tomas Vondra <tomas.vondra@enterprisedb.com>
Cc: "Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Masahiko Sawada <sawada.mshk@gmail.com>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>,
Dilip Kumar <dilipbalaut@gmail.com>
Date: 2023-11-27T06:04:46Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
On Mon, Nov 27, 2023 at 6:41 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > I've been cleaning up the first two patches to get them committed soon > (adding the decoding infrastructure + test_decoding), cleaning up stale > comments, updating commit messages etc. And I think it's ready to go, > but it's too late over, so I plan going over once more tomorrow and then > likely push. But if someone wants to take a look, I'd welcome that. > > The one issue I found during this cleanup is that the patch was missing > the changes introduced by 29d0a77fa660 for decoding of other stuff. > > commit 29d0a77fa6606f9c01ba17311fc452dabd3f793d > Author: Amit Kapila <akapila@postgresql.org> > Date: Thu Oct 26 06:54:16 2023 +0530 > > Migrate logical slots to the new node during an upgrade. > ... > > I fixed that, but perhaps someone might want to double check ... > > > 0003 is here just for completeness - that's the part adding sequences to > built-in replication. I haven't done much with it, it needs some cleanup > too to get it committable. I don't intend to push that right after > 0001+0002, though. > > > While going over 0001, I realized there might be an optimization for > ReorderBufferSequenceIsTransactional. As coded in 0001, it always > searches through all top-level transactions, and if there's many of them > that might be expensive, even if very few of them have any relfilenodes > in the hash table. It's still linear search, and it needs to happen for > each sequence change. > > But can the relfilenode even be in some other top-level transaction? How > could it be - our transaction would not see it, and wouldn't be able to > generate the sequence change. So we should be able to simply check *our* > transaction (or if it's a subxact, the top-level transaction). Either > it's there (and it's transactional change), or not (and then it's > non-transactional change). > I also think the relfilenode should be part of either the current top-level xact or one of its subxact, so looking at all the top-level transactions for each change doesn't seem advisable. > The 0004 does this. > > This of course hinges on when exactly the transactions get created, and > assignments processed. For example if this would fire before the txn > gets assigned to the top-level one, this would break. I don't think this > can happen thanks to the immediate logging of assignments, but I'm too > tired to think about it now. > This needs some thought because I think we can't guarantee the association till we reach the point where we can actually decode the xact. See comments in AssertTXNLsnOrder() [1]. I noticed few minor comments while reading the patch: 1. + * turned on here because the non-transactional logical message is + * decoded without waiting for these records. Instead of '.. logical message', shouldn't we say sequence change message? 2. + /* + * If we found an entry with matchine relfilenode, typo (matchine) 3. + Note that this may not the value obtained by the process updating the + process, but the future sequence value written to WAL (typically about + 32 values ahead). /may not the value/may not be the value [1] - /* * Skip the verification if we don't reach the LSN at which we start * decoding the contents of transactions yet because until we reach the * LSN, we could have transactions that don't have the association between * the top-level transaction and subtransaction yet and consequently have * the same LSN. We don't guarantee this association until we try to * decode the actual contents of transaction. -- With Regards, Amit Kapila.