Re: logical decoding and replication of sequences, take 2
Dilip Kumar <dilipbalaut@gmail.com>
From: Dilip Kumar <dilipbalaut@gmail.com>
To: Robert Haas <robertmhaas@gmail.com>
Cc: Tomas Vondra <tomas.vondra@enterprisedb.com>,
Amit Kapila <amit.kapila16@gmail.com>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>,
"Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Masahiko Sawada <sawada.mshk@gmail.com>, Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Date: 2024-02-21T09:13:01Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
On Wed, Feb 21, 2024 at 1:24 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Feb 21, 2024 at 1:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > But I am wondering why this flag is always set to true in
> > > DecodeTXNNeedSkip() irrespective of the commit or abort. Because the
> > > aborted transactions are not supposed to be replayed? So if my
> > > observation is correct that for the aborted transaction, this
> > > shouldn't be set to true then we have a problem with sequence where we
> > > are identifying the transactional changes as non-transaction changes
> > > because now for transactional changes this should depend upon commit
> > > status.
> >
> > I have checked this case with Amit Kapila. So it seems in the cases
> > where we have sent the prepared transaction or streamed in-progress
> > transaction we would need to send the abort also, and for that reason,
> > we are setting 'ctx->processing_required' as true so that if these
> > WALs are not streamed we do not allow upgrade of such slots.
>
> I don't find this explanation clear enough for me to understand.
Explanation about why we set 'ctx->processing_required' to true from
DecodeCommit as well as DecodeAbort:
--------------------------------------------------------------------------------------------------------------------------------------------------
For upgrading logical replication slots, it's essential to ensure
these slots are completely synchronized with the subscriber. To
identify that we process all the pending WAL in 'fast_forward' mode to
find whether there is any decodable WAL or not. So in short any WAL
type that we stream to standby in normal mode (no fast_forward mode)
is considered decodable and so is the abort WAL. That's the reason
why at the end of the transaction commit/abort we need to set this
'ctx->processing_required' to true i.e. there are some decodable WAL
exists so we can not upgrade this slot.
Why the below check is safe?
> + if (ctx->fast_forward)
> + {
> + /*
> + * We need to set processing_required flag to notify the sequence
> + * change existence to the caller. Usually, the flag is set when
> + * either the COMMIT or ABORT records are decoded, but this must be
> + * turned on here because the non-transactional logical message is
> + * decoded without waiting for these records.
> + */
> + if (!transactional)
> + ctx->processing_required = true;
> +
> + return;
> + }
So the problem is that we might consider the transaction change as
non-transaction and mark this flag as true. But what would have
happened if we would have identified it correctly as transactional?
In such cases, we wouldn't have set this flag here but then we would
have set this while processing the DecodeAbort/DecodeCommit, so the
net effect would be the same no? You may question what if the
Abort/Commit WAL never appears in the WAL, but this flag is
specifically for the upgrade case, and in that case we have to do a
clean shutdown so may not be an issue. But in the future, if we try
to use 'ctx->processing_required' for something else where the clean
shutdown is not guaranteed then this flag can be set incorrectly.
I am not arguing that this is a perfect design but I am just making a
point about why it would work.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com