Re: logical decoding and replication of sequences, take 2
Jonathan S. Katz <jkatz@postgresql.org>
From: "Jonathan S. Katz" <jkatz@postgresql.org>
To: Tomas Vondra <tomas.vondra@enterprisedb.com>,
vignesh C <vignesh21@gmail.com>
Cc: Andres Freund <andres@anarazel.de>, Robert Haas <robertmhaas@gmail.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>,
Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: 2023-02-22T17:04:29Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Migrate logical slots to the new node during an upgrade.
- 29d0a77fa660 17.0 cited
-
Make test_decoding ddl.out shorter
- d6677b93c79b 17.0 landed
- c5c5832600e9 14.9 landed
- b1dc946eee3d 16.0 landed
- 3bb8b9342f8a 15.4 landed
-
Fix snapshot handling in logicalmsg_decode
- 949ac32e1267 15.3 landed
- 8b9cbd42b61f 14.8 landed
- 4df581fa0f4b 13.11 landed
- 497f863f0598 12.15 landed
- 8de91ebf2ac1 11.20 landed
- 7fe1aa991b62 16.0 landed
-
doc: Adjust a few more references to "postmaster"
- 17e72ec45d31 16.0 cited
-
Revert "Logical decoding of sequences"
- 2c7ea57e56ca 15.0 cited
On 2/22/23 5:02 AM, Tomas Vondra wrote: > > On 2/22/23 03:28, Jonathan S. Katz wrote: >> Thanks for continuing to work on this patch! I tested the latest version >> and have some feedback/clarifications. >> > > Thanks! Also I should mention I've been testing with both async/sync logical replication. I didn't have any specific comments on either as it seemed to just work and behaviors aligned with existing expectations. Generally it's been a good experience and it seems to be working. :) At this point I'm trying to understand the limitations and tripwires so we can guide users appropriately. > Yes, this is due to how we WAL-log sequences. We don't log individual > increments, but every 32nd increment and we log the "future" sequence > state so that after a crash/recovery we don't generate duplicates. > > So you do nextval() and it returns 1. But into WAL we record 32. And > there will be no WAL records until nextval reaches 32 and needs to > generate another batch. > > And because logical replication relies on these WAL records, it inherits > this batching behavior with a "jump" on recovery/failover. IMHO it's OK, > it works for the "logical failover" use case and if you need gapless > sequences then regular sequences are not an issue anyway. > > It's possible to reduce the jump a bit by reducing the batch size (from > 32 to 0) so that every increment is logged. But it doesn't eliminate it > because of rollbacks. I generally agree. I think it's mainly something we should capture in the user docs that they can be a jump on the subscriber side, so people are not surprised. Interestingly, in systems that tend to have higher rates of failover (I'm thinking of a few distributed systems), this may cause int4 sequences to exhaust numbers slightly (marginally?) more quickly. Likely not too big of an issue, but something to keep in mind. >> 2. Using with origin=none with nonconflicting sequences. >> >> I modified the example in [1] to set up two schemas with non-conflicting >> sequences[2], e.g. on instance 1: >> >> CREATE TABLE public.room ( >> id int GENERATED BY DEFAULT AS IDENTITY (INCREMENT 2 START WITH 1) >> PRIMARY KEY, >> name text NOT NULL >> ); >> >> and instance 2: >> >> CREATE TABLE public.room ( >> id int GENERATED BY DEFAULT AS IDENTITY (INCREMENT 2 START WITH 2) >> PRIMARY KEY, >> name text NOT NULL >> ); >> > > Well, yeah. We don't support active-active logical replication (at least > not with the built-in). You can easily get into similar issues without > sequences. The "origin=none" feature lets you replicate tables bidirectionally. While it's not full "active-active", this is a starting point and a feature for v16. We'll definitely have users replicating data bidirectionally with this. > Replicating a sequence overwrites the state of the sequence on the other > side, which may result in it generating duplicate values with the other > node, etc. I understand that we don't currently support global sequences, but I am concerned there may be a tripwire here in the origin=none case given it's fairly common to use serial/GENERATED BY to set primary keys. And it's fairly trivial to set them to be nonconflicting, or at least give the user the appearance that they are nonconflicting. From my high level understand of how sequences work, this sounds like it would be a lift to support the example in [1]. Or maybe the answer is that you can bidirectionally replicate the changes in the tables, but not sequences? In any case, we should update the restrictions in [2] to state: while sequences can be replicated, there is additional work required if you are bidirectionally replicating tables that use sequences, esp. if used in a PK or a constraint. We can provide alternatives to how a user could set that up, i.e. not replicates the sequences or do something like in [3]. Thanks, Jonathan [1] https://gist.github.com/jkatz/5c34bf1e401b3376dfe8e627fcd30af3 [2] https://www.postgresql.org/docs/devel/logical-replication-restrictions.html [3] https://gist.github.com/jkatz/1599e467d55abec88ab487d8ac9dc7c3