Re: logical decoding and replication of sequences, take 2
Tomas Vondra <tomas.vondra@enterprisedb.com>
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
To: Amit Kapila <amit.kapila16@gmail.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Masahiko Sawada <sawada.mshk@gmail.com>,
Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Date: 2023-07-24T10:52:08Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
On 7/24/23 12:40, Amit Kapila wrote:
> On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> 0005, 0006 and 0007 are all related to the initial sequence sync. [3]
>> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to
>> be reviewed in this response.
>>
>> I followed the discussion starting [1] till [2]. The second one
>> mentions the interlock mechanism which has been implemented in 0005
>> and 0006. While I don't have an objection to allowing LOCKing a
>> sequence using the LOCK command, I am not sure whether it will
>> actually work or is even needed.
>>
>> The problem described in [1] seems to be the same as the problem
>> described in [2]. In both cases we see the sequence moving backwards
>> during CATCHUP. At the end of catchup the sequence is in the right
>> state in both the cases.
>>
>
> I think we could see backward sequence value even after the catchup
> phase (after the sync worker is exited and or the state of rel is
> marked as 'ready' in pg_subscription_rel). The point is that there is
> no guarantee that we will process all the pending WAL before
> considering the sequence state is 'SYNCDONE' and or 'READY'. For
> example, after copy_sequence, I see values like:
>
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 165 | 0 | t
> (1 row)
> postgres=# select nextval('s');
> nextval
> ---------
> 166
> (1 row)
> postgres=# select nextval('s');
> nextval
> ---------
> 167
> (1 row)
> postgres=# select currval('s');
> currval
> ---------
> 167
> (1 row)
>
> Then during the catchup phase:
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 33 | 0 | t
> (1 row)
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 66 | 0 | t
> (1 row)
>
> postgres=# select * from pg_subscription_rel;
> srsubid | srrelid | srsubstate | srsublsn
> ---------+---------+------------+-----------
> 16394 | 16390 | r | 0/16374E8
> 16394 | 16393 | s | 0/1637700
> (2 rows)
>
> postgres=# select * from pg_subscription_rel;
> srsubid | srrelid | srsubstate | srsublsn
> ---------+---------+------------+-----------
> 16394 | 16390 | r | 0/16374E8
> 16394 | 16393 | r | 0/1637700
> (2 rows)
>
> Here Sequence relid id 16393. You can see sequence state is marked as ready.
>
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 66 | 0 | t
> (1 row)
>
> Even after that, see below the value of the sequence is still not
> caught up. Later, when the apply worker processes all the WAL, the
> sequence state will be caught up.
>
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 165 | 0 | t
> (1 row)
>
> So, there will be a window where the sequence won't be caught up for a
> certain period of time and any usage of it (even after the sync is
> finished) during that time could result in inconsistent behaviour.
>
I'm rather confused about which node these queries are executed on.
Presumably some of it is on publisher, some on subscriber?
Can you create a reproducer (TAP test demonstrating this?) I guess it
might require adding some sleeps to hit the right timing ...
> The other question is whether it is okay to allow the sequence to go
> backwards even during the initial sync phase? The reason I am asking
> this question is that for the time sequence value moves backwards, one
> is allowed to use it on the subscriber which will result in using
> out-of-sequence values. For example, immediately, after copy_sequence
> the values look like this:
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 133 | 32 | t
> (1 row)
> postgres=# select nextval('s');
> nextval
> ---------
> 134
> (1 row)
> postgres=# select currval('s');
> currval
> ---------
> 134
> (1 row)
>
> But then during the sync phase, it can go backwards and one is allowed
> to use it on the subscriber:
> postgres=# select * from s;
> last_value | log_cnt | is_called
> ------------+---------+-----------
> 66 | 0 | t
> (1 row)
> postgres=# select nextval('s');
> nextval
> ---------
> 67
> (1 row)
>
Well, as for going back during the sync phase, I think the agreement was
that's acceptable, as we don't make guarantees about that. The question
is what's the state at the end of the sync (which I think leads to the
first part of your message).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company