Re: logical decoding and replication of sequences, take 2
Tomas Vondra <tomas.vondra@enterprisedb.com>
From: Tomas Vondra <tomas.vondra@enterprisedb.com>
To: Masahiko Sawada <sawada.mshk@gmail.com>
Cc: John Naylor <john.naylor@enterprisedb.com>,
vignesh C <vignesh21@gmail.com>, Andres Freund <andres@anarazel.de>,
Robert Haas <robertmhaas@gmail.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: 2023-03-16T16:25:45Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
Attachments
- 0001-Logical-decoding-of-sequences-20230316.patch (text/x-patch) patch 0001
- 0002-Add-decoding-of-sequences-to-test_decoding-20230316.patch (text/x-patch) patch 0002
- 0003-Add-decoding-of-sequences-to-built-in-repli-20230316.patch (text/x-patch) patch 0003
- 0004-puballtables-fixup-20230316.patch (text/x-patch) patch 0004
- 0005-fixup-syncing-refresh-sequences-20230316.patch (text/x-patch) patch 0005
- 0006-john-s-review-20230316.patch (text/x-patch) patch 0006
Hi!
On 3/16/23 08:38, Masahiko Sawada wrote:
> Hi,
>
> On Wed, Mar 15, 2023 at 9:52 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 3/14/23 08:30, John Naylor wrote:
>>> I tried a couple toy examples with various combinations of use styles.
>>>
>>> Three with "automatic" reading from sequences:
>>>
>>> create table test(i serial);
>>> create table test(i int GENERATED BY DEFAULT AS IDENTITY);
>>> create table test(i int default nextval('s1'));
>>>
>>> ...where s1 has some non-default parameters:
>>>
>>> CREATE SEQUENCE s1 START 100 MAXVALUE 100 INCREMENT BY -1;
>>>
>>> ...and then two with explicit use of s1, one inserting the 'nextval'
>>> into a table with no default, and one with no table at all, just
>>> selecting from the sequence.
>>>
>>> The last two seem to work similarly to the first three, so it seems like
>>> FOR ALL TABLES adds all sequences as well. Is that expected?
>>
>> Yeah, that's a bug - we shouldn't replicate the sequence changes, unless
>> the sequence is actually added to the publication. I tracked this down
>> to a thinko in get_rel_sync_entry() which failed to check the object
>> type when puballtables or puballsequences was set.
>>
>> Attached is a patch fixing this.
>>
>>> The documentation for CREATE PUBLICATION mentions sequence options,
>>> but doesn't really say how these options should be used.
>> Good point. The idea is that we handle tables and sequences the same
>> way, i.e. if you specify 'sequence' then we'll replicate increments for
>> sequences explicitly added to the publication.
>>
>> If this is not clear, the docs may need some improvements.
>>
>
> I'm late to this thread, but I have some questions and review comments.
>
> Regarding sequence logical replication, it seems that changes of
> sequence created after CREATE SUBSCRIPTION are applied on the
> subscriber even without REFRESH PUBLICATION command on the subscriber.
> Which is a different behavior than tables. For example, I set both
> publisher and subscriber as follows:
>
> 1. On publisher
> create publication test_pub for all sequences;
>
> 2. On subscriber
> create subscription test_sub connection 'dbname=postgres port=5551'
> publication test_pub; -- port=5551 is the publisher
>
> 3. On publisher
> create sequence s1;
> select nextval('s1');
>
> I got the error "ERROR: relation "public.s1" does not exist on the
> subscriber". Probably we need to do should_apply_changes_for_rel()
> check in apply_handle_sequence().
>
Yes, you're right - the sequence handling should have been calling the
should_apply_changes_for_rel() etc.
The attached 0005 patch should fix that - I still need to test it a bit
more and maybe clean it up a bit, but hopefully it'll allow you to
continue the review.
I had to tweak the protocol a bit, so that this uses the same cache as
tables. I wonder if maybe we should make it even more similar, by
essentially treating sequences as tables with (last_value, log_cnt,
called) columns.
> If my understanding is correct, is there any case where the subscriber
> needs to apply transactional sequence changes? The commit message of
> 0001 patch says:
>
> * Changes for sequences created in the same top-level transaction are
> treated as transactional, i.e. just like any other change from that
> transaction, and discarded in case of a rollback.
>
> IIUC such sequences are not visible to the subscriber, so it cannot
> subscribe to them until the commit.
>
The comment is slightly misleading, as it talks about creation of
sequences, but it should be talking about relfilenodes. For example, if
you create a sequence, add it to publication, and then in a later
transaction you do
ALTER SEQUENCE x RESTART
or something else that creates a new relfilenode, then the subsequent
increments are visible only in that transaction. But we still need to
apply those on the subscriber, but only as part of the transaction,
because it might roll back.
> ---
> I got an assertion failure. The reproducible steps are:
>
I do believe this was due to a thinko in apply_handle_sequence, which
sometimes started transaction and didn't terminate it correctly. I've
changed it to use the begin_replication_step() etc. and it seems to be
working fine now.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company