Re: BUG #16039: PANIC when activating replication slots in Postgres 12.0 64bit under Windows

Michael Paquier <michael@paquier.xyz>

From: Michael Paquier <michael@paquier.xyz>
To: Andres Freund <andres@anarazel.de>
Cc: buschmann@nidsa.net, pgsql-bugs@lists.postgresql.org, Michael Paquier <michael.paquier@gmail.com>
Date: 2019-10-06T04:55:48Z
Lists: pgsql-bugs

Commits

  1. Flush logical mapping files with fd opened for read/write at checkpoint

  2. Use a fd opened for read/write when syncing slots during startup, take 2.

  3. Tighten use of OpenTransientFile and CloseTransientFile

  4. Use a fd opened for read/write when syncing slots during startup.

On Fri, Oct 04, 2019 at 01:06:05PM -0700, Andres Freund wrote:
> I realize I perhaps should have added a comment explaining this, but
> this is far from the only location that knows we have to know open fds
> r/w to be able to fsync them.

Sorry for the late reply.  It looks like I messed up here; my
apologies for that, and thanks for fixing the issue.

It would have been nice to add some sanity checks based on fcntl(),
but the directory handling in pg_fsync() makes that annoying.  Anyway,
I have checked the code with a little trick and spotted a second bug:
CheckPointLogicalRewriteHeap() fsyncs a logical rewrite mapping file
through a fd opened with O_RDONLY.  This has been incorrect since
b89e151.
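
For illustration, a fcntl()-based check of the kind mentioned above
could look roughly like the sketch below.  This is not the trick from
the attached patch; fd_is_writable() and checked_fsync() are
hypothetical names, and the sketch assumes a POSIX system.  It also
shows why directories get in the way: a directory can only be opened
read-only, so a check like this would reject every directory fsync
unless those are special-cased.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of PostgreSQL: report whether fd was
 * opened with write access (O_WRONLY or O_RDWR).
 */
static bool
fd_is_writable(int fd)
{
	int			flags;
	int			accmode;

	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return false;			/* invalid descriptor, treat as not writable */

	accmode = flags & O_ACCMODE;
	return accmode == O_WRONLY || accmode == O_RDWR;
}

/*
 * Hypothetical wrapper: refuse to fsync a descriptor opened read-only.
 * Returns -1 without calling fsync() when the check fails.
 */
static int
checked_fsync(int fd)
{
	if (!fd_is_writable(fd))
		return -1;
	return fsync(fd);
}
```

With something like this in place, a caller that opens a file with
O_RDONLY and then tries to sync it would fail on every platform, not
only on those where fsync() itself rejects read-only descriptors.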

> What were you even trying to fix by changing this?

Hardening of the code.  Some code paths clearly only read from these
files, so opening them read-only looked like the safer choice.

> Seems also pretty clear that we need a few animals running with fsync
> enabled. Not sure how we best can write test infrastructure to make it
> easy to set that for all tests. Guess I best start a thread about it on
> -hackers.

I think that we would need more infrastructure here for TAP tests,
that is, a way to enforce some parameters when setting up the
configuration of a new node.

Attached are two patches: the actual bug fix and an extra patch with
the trick I used to find it (contrib/test_decoding/ is the part that
blew up).
--
Michael