Re: WAL segments removed from primary despite the fact that logical replication slot needs it.

From: Masahiko Sawada <sawada.mshk@gmail.com>
To: depesz@depesz.com
Cc: Amit Kapila <amit.kapila16@gmail.com>, PostgreSQL mailing lists <pgsql-bugs@lists.postgresql.org>
Date: 2023-02-06T14:18:02Z
Lists: pgsql-bugs

Commits

  1. Fix a possibility of logical replication slot's restart_lsn going backwards.

On Mon, Feb 6, 2023 at 8:15 PM hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> On Mon, Feb 06, 2023 at 05:25:42PM +0900, Masahiko Sawada wrote:
> > Based on the analysis we did[1][2], I've created the manual scenario
> > to reproduce this issue with the attached patch and the script.
> >
> > The scenario.md explains the basic steps to reproduce this issue. It
> > consists of 13 steps (very tricky!!). It's not sophisticated and could
> > be improved. test.sh is the shell script I used to execute the
> > reproduction steps from 1 to 10. In my environment, I could reproduce
> > this issue by the following steps.
> >
> > 1. apply the patch and build PostgreSQL.
> > 2. run test.sh.
> > 3. execute the step 11 and later described in scenario.md.
> >
> > The test.sh is a very hacky and dirty script and is optimized in my
> > environment (especially adding many sleeps). You might need to adjust
> > it while checking scenario.md.
> >
> > I've also confirmed that this issue is fixed by the attached patch,
> > which clears candidate_restart_lsn and friends during
> > ReplicationSlotRelease().
>
> Hi,
> one important question - do I patch newer Pg, or older? The thing is
> that we were able to replicate the problem (with some luck) only on
> production databases, and patching them will be hard sell. Maybe
> possible, but if it's enough to patch the pg14 (recipient) it would make
> my life much easier.

Unfortunately, the patch I attached is for the publisher (i.e., the
sender side). There might be a way to fix this issue from the receiver
side, but I don't have an idea for one at the moment.
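
As a rough illustration of the fix quoted above ("clears
candidate_restart_lsn and friends during ReplicationSlotRelease()"),
here is a minimal, self-contained sketch in C. It is not the attached
patch. MockSlot and mock_clear_candidates() are hypothetical names made
up for this example, the typedefs are stand-ins for the PostgreSQL
types, and which fields count as "friends" is an assumption based on
the candidate_* members of the slot struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's XLogRecPtr and TransactionId (assumption). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TransactionId;

#define InvalidXLogRecPtr    ((XLogRecPtr) 0)
#define InvalidTransactionId ((TransactionId) 0)

/*
 * Hypothetical mock that holds only the candidate_* fields.
 * The real slot struct contains many more members.
 */
typedef struct MockSlot
{
    TransactionId candidate_catalog_xmin;
    XLogRecPtr    candidate_xmin_lsn;
    XLogRecPtr    candidate_restart_valid;
    XLogRecPtr    candidate_restart_lsn;
} MockSlot;

/*
 * Hypothetical helper: forget every pending candidate value so that
 * nothing computed during one acquisition of the slot can be applied
 * during a later one.
 */
static void
mock_clear_candidates(MockSlot *slot)
{
    assert(slot != NULL);

    slot->candidate_catalog_xmin = InvalidTransactionId;
    slot->candidate_xmin_lsn = InvalidXLogRecPtr;
    slot->candidate_restart_valid = InvalidXLogRecPtr;
    slot->candidate_restart_lsn = InvalidXLogRecPtr;
}
```

In this sketch the helper would be called at the point where the slot
is released. Locking and everything else the real release path does is
left out on purpose.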

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com