Re: WAL segments removed from primary despite the fact that logical replication slot needs it.
Andres Freund <andres@anarazel.de>
From: Andres Freund <andres@anarazel.de>
To: Masahiko Sawada <sawada.mshk@gmail.com>
Cc: depesz@depesz.com, Amit Kapila <amit.kapila16@gmail.com>, pgsql-bugs mailing list <pgsql-bugs@postgresql.org>
Date: 2022-11-17T08:02:58Z
Lists: pgsql-bugs
Commits
Fix a possibility of logical replication slot's restart_lsn going backwards.
- e5ed873b1b4a 18.0 landed
- 568e78a653ee 17.2 landed
- f353911337cf 16.6 landed
- 91771b3fbbc3 15.10 landed
- 26c4e8968690 14.15 landed
- 15dc1abb17dd 13.18 landed
Hi,

On 2022-11-15 23:59:37 +0900, Masahiko Sawada wrote:
> > Is something like the following scenario possible to happen?
> >
> > 1. wal sender updates slot's restart_lsn and releases the spin lock
> > (not saved in the disk yet)
> > 2. someone updates slots' minimum restart_lsn (note that slot's
> > restart_lsn in memory is already updated).

You mean ReplicationSlotsComputeRequiredLSN(), or update that specific slot's
restart_lsn? The latter shouldn't happen.

> > 3. checkpointer removes WAL files older than the minimum restart_lsn
> > calculated at step 2.

For xmin we have protection against that via the split between
catalog_xmin/effective_catalog_xmin. We should probably mirror that for
restart_lsn as well.

We should also call ReplicationSlotsComputeRequiredLSN if only update_restart
is true...

> > 4. wal sender restarts for some reason (or server crashed).

I don't think walsender alone restarting should change anything, but
crash-restart obviously would.

Greetings,

Andres Freund
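
For illustration only, the four-step scenario and the suggested split can be sketched as a toy Python model. This is not PostgreSQL code: the Slot class, the effective_restart_lsn field, and the integer "segments" are all assumptions made up for the sketch, loosely mirroring the catalog_xmin/effective_catalog_xmin idea mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    # Toy stand-in for a replication slot; all field names are hypothetical.
    restart_lsn: int            # in-memory value, advanced by the walsender
    on_disk_restart_lsn: int    # value last written to disk
    effective_restart_lsn: int  # assumed field: advanced only after a save

    def advance(self, lsn: int) -> None:
        """Step 1: walsender moves restart_lsn forward in memory only."""
        self.restart_lsn = max(self.restart_lsn, lsn)

    def save(self) -> None:
        """Write the slot out, then publish the value WAL removal may use."""
        self.on_disk_restart_lsn = self.restart_lsn
        self.effective_restart_lsn = self.restart_lsn

    def crash_restart(self) -> None:
        """Step 4: after a crash only the on-disk state survives."""
        self.restart_lsn = self.on_disk_restart_lsn
        self.effective_restart_lsn = self.on_disk_restart_lsn


def checkpoint(wal: set[int], slot: Slot, use_effective: bool) -> set[int]:
    """Steps 2 and 3: compute the horizon and drop older segments."""
    horizon = slot.effective_restart_lsn if use_effective else slot.restart_lsn
    return {seg for seg in wal if seg >= horizon}


def simulate(use_effective: bool) -> bool:
    """Return True if the WAL the slot needs after a crash still exists."""
    wal = set(range(10))          # segments 0..9 present
    slot = Slot(3, 3, 3)          # slot saved at segment 3
    slot.advance(7)               # memory says 7, disk still says 3
    wal = checkpoint(wal, slot, use_effective)
    slot.crash_restart()          # slot falls back to 3
    return slot.restart_lsn in wal
```

In this model, simulate(False) removes segments the slot needs again after the crash, while simulate(True) keeps them because removal is driven by a value that never runs ahead of what is on disk.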