Re: logical decoding and replication of sequences, take 2
Tomas Vondra <tomas.vondra@enterprisedb.com>
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
To: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Cc: Amit Kapila <amit.kapila16@gmail.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Masahiko Sawada <sawada.mshk@gmail.com>,
Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Date: 2023-08-16T14:26:46Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
Attachments
- 0001-Logical-decoding-of-sequences-20230816.patch (text/x-patch) patch 0001
- 0002-Add-decoding-of-sequences-to-test_decoding-20230816.patch (text/x-patch) patch 0002
- 0003-Add-decoding-of-sequences-to-built-in-repli-20230816.patch (text/x-patch) patch 0003
- 0004-update-replorigin_session_origin_lsn-20230816.patch (text/x-patch) patch 0004
On 8/11/23 08:32, Ashutosh Bapat wrote:
> On Tue, Aug 1, 2023 at 8:46 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> Anyway, I think this is "just" a matter of efficiency, not correctness.
>> IMHO there are bigger questions regarding the "going back" behavior
>> after apply restart.
>
>
> sequence_decode() has the following code
> /* Skip the change if already processed (per the snapshot). */
> if (transactional &&
> !SnapBuildProcessChange(builder, xid, buf->origptr))
> return;
> else if (!transactional &&
> (SnapBuildCurrentState(builder) != SNAPBUILD_CONSISTENT ||
> SnapBuildXactNeedsSkip(builder, buf->origptr)))
> return;
>
> This means that if the subscription restarts, the upstream will *not*
> send any non-transactional sequence changes with LSN prior to the LSN
> specified by START_REPLICATION command. That should avoid replicating
> all the non-transactional sequence changes since
> ReplicationSlot::restart_lsn if the subscription restarts.
>
Ah, right, I got confused and mixed restart_lsn and the LSN passed in
the START_REPLICATION COMMAND. Thanks for the details, I think this
works fine.
> But in apply_handle_sequence(), we do not update the
> replorigin_session_origin_lsn with LSN of the non-transactional
> sequence change when it's applied. This means that if a subscription
> restarts while it is half way through applying a transaction, those
> changes will be replicated again. This will move the sequence
> backward. If the subscription keeps restarting again and again while
> applying that transaction, we will see the sequence "rubber banding"
> [1] on subscription. So untill the transaction is completely applied,
> the other users of the sequence may see duplicate values during this
> time. I think this is undesirable.
>
Well, but as I said earlier, this is not expected to support using the
sequence on the subscriber until after the failover, so there's not real
risk of "duplicate values". Yes, you might select the data from the
sequence directly, but that would have all sorts of issues even without
replication - users are required to use nextval/currval and so on.
> But I am not able to find a case where this can lead to conflicting
> values after failover. If there's only one transaction which is
> repeatedly being applied, the rows which use sequence values were
> never committed so there's no conflicting value present on the
> subscription. The same reasoning can be extended to multiple in-flight
> transactions. If another transaction (T2) uses the sequence values
> changed by in-flight transaction T1 and if T2 commits before T1, the
> sequence changes used by T2 must have LSNs before commit of T2 and
> thus they will never be replicated. (See example below).
>
> T1
> insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q1
> T2
> insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q2
> COMMIT;
> T1
> insert into t1 (nextval('seq'), ...) from generate_series(1, 100); - Q13
> COMMIT;
>
> So I am not able to imagine a case when a sequence going backward can
> cause conflicting values.
Right, I agree this "rubber banding" can happen. But as long as we don't
go back too far (before the last applied commit) I think that'd fine. We
only need to make guarantees about committed transactions, and I don't
think we need to worry about this too much ...
>
> But whether or not that's the case, downstream should not request (and
> hence receive) any changes that have been already applied (and
> committed) downstream as a principle. I think a way to achieve this is
> to update the replorigin_session_origin_lsn so that a sequence change
> applied once is not requested (and hence sent) again.
>
I guess we could update the origin, per attached 0004. We don't have
timestamp to set replorigin_session_origin_timestamp, but it seems we
don't need that.
The attached patch merges the earlier improvements, except for the part
that experimented with adding a "fake" transaction (which turned out to
have a number of difficult issues).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company