Re: WAL segments removed from primary despite the fact that logical replication slot needs it.
From: Masahiko Sawada <sawada.mshk@gmail.com>
To: Andres Freund <andres@anarazel.de>
Cc: depesz@depesz.com, Amit Kapila <amit.kapila16@gmail.com>, pgsql-bugs mailing list <pgsql-bugs@postgresql.org>
Date: 2022-11-17T14:22:12Z
Lists: pgsql-bugs
Commits
Fix a possibility of logical replication slot's restart_lsn going backwards.
- e5ed873b1b4a 18.0 landed
- 568e78a653ee 17.2 landed
- f353911337cf 16.6 landed
- 91771b3fbbc3 15.10 landed
- 26c4e8968690 14.15 landed
- 15dc1abb17dd 13.18 landed
On Thu, Nov 17, 2022 at 5:03 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2022-11-15 23:59:37 +0900, Masahiko Sawada wrote:
> > > Is something like the following scenario possible to happen?
> > >
> > > 1. wal sender updates slot's restart_lsn and releases the spin lock
> > > (not saved in the disk yet)
> > > 2. someone updates slots' minimum restart_lsn (note that slot's
> > > restart_lsn in memory is already updated).
>
> You mean ReplicationSlotsComputeRequiredLSN(), or update that specific slot's
> restart_lsn? The latter shouldn't happen.

I meant the former.

>
> > > 3. checkpointer removes WAL files older than the minimum restart_lsn
> > > calculated at step 2.
>
> For xmin we have protection against that via the split between
> catalog_xmin/effective_catalog_xmin. We should probably mirror that for
> restart_lsn as well.
>
> We should also call ReplicationSlotsComputeRequiredLSN if only update_restart
> is true...

Agree.

>
> > > 4. wal sender restarts for some reason (or server crashed).
>
> I don't think walsender alone restarting should change anything, but
> crash-restart obviously would.

Right. I've confirmed this scenario is possible to happen with crash-restart.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
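
For readers following the thread, the four-step scenario can be sketched as a toy simulation. This is only an illustrative model, not PostgreSQL code: the classes, the method names, and the "one integer LSN per WAL segment" simplification are all assumptions made for the example. compute_required_lsn() stands in for ReplicationSlotsComputeRequiredLSN(), and the in-memory/on-disk split of restart_lsn is reduced to two integer fields.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    # Hypothetical model of a logical replication slot.
    restart_lsn_mem: int   # in-memory value, advanced by the walsender
    restart_lsn_disk: int  # last value that was saved to disk


@dataclass
class Server:
    slot: Slot
    # Toy assumption: each WAL segment is identified by a single integer LSN.
    wal_segments: set = field(default_factory=set)
    required_lsn: int = 0

    def walsender_advance(self, new_lsn: int) -> None:
        # Step 1: restart_lsn is updated in memory only, not yet saved to disk.
        self.slot.restart_lsn_mem = new_lsn

    def compute_required_lsn(self) -> None:
        # Step 2: the minimum required LSN is computed from in-memory values
        # (a single slot here, so the minimum is just that slot's value).
        self.required_lsn = self.slot.restart_lsn_mem

    def checkpoint_remove_old_wal(self) -> None:
        # Step 3: the checkpointer removes segments older than the minimum.
        self.wal_segments = {s for s in self.wal_segments
                             if s >= self.required_lsn}

    def crash_restart(self) -> None:
        # Step 4: in-memory state is lost; the slot is restored from disk.
        self.slot.restart_lsn_mem = self.slot.restart_lsn_disk

    def slot_can_resume(self) -> bool:
        # The slot can resume only if the segment at its restart_lsn exists.
        return self.slot.restart_lsn_mem in self.wal_segments


def simulate():
    server = Server(slot=Slot(restart_lsn_mem=1, restart_lsn_disk=1),
                    wal_segments={1, 2, 3, 4})
    server.walsender_advance(3)         # step 1
    server.compute_required_lsn()       # step 2
    server.checkpoint_remove_old_wal()  # step 3
    server.crash_restart()              # step 4
    return (server.slot.restart_lsn_mem,
            sorted(server.wal_segments),
            server.slot_can_resume())


if __name__ == "__main__":
    print(simulate())
```

In this model the slot comes back from the crash pointing at LSN 1, while segments 1 and 2 have already been removed, which is the failure reported in the thread. The effective_catalog_xmin-style split suggested above would, in terms of this sketch, amount to computing the minimum from a value that only advances once the slot has been saved to disk.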