Re: Making pg_rewind faster

From: Michael Paquier <michael@paquier.xyz>
To: Justin Kwan <justinpkwan@outlook.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers <pgsql-hackers@postgresql.org>, vignesh <vignesh@cloudflare.com>, "jkwan@cloudflare.com" <jkwan@cloudflare.com>, vignesh ravichandran <admin@viggy28.dev>, "hlinnaka@iki.fi" <hlinnaka@iki.fi>
Date: 2022-07-19T05:36:25Z
Lists: pgsql-hackers

Commits

  1. pg_rewind: Skip copy of WAL segments generated before point of divergence

  2. pg_rewind: Extend code detecting relation files to work with WAL files

  3. Split TESTDIR into TESTLOGDIR and TESTDATADIR

On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:
> Thank you for taking a look at this and that sounds good. I will
> send over a patch compatible with Postgres v16.

+$node_2->psql(
+       'postgres',
+       "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+       stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could do here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
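
As a rough illustration of the second option, here is a minimal sketch in
plain Perl of how the contents of pg_wal/ could be compared against an
expected set of segments.  Both helpers (wal_segments_in and
wal_segments_match) are hypothetical names made up for this example and
are not part of the PostgreSQL test modules.  The sketch assumes that WAL
segment files are named with 24 uppercase hexadecimal characters and
ignores everything else in the directory.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PostgreSQL's test modules): return the
# sorted names of the WAL segment files found directly in a pg_wal
# directory.  Anything not named with exactly 24 uppercase hexadecimal
# characters (history files, .partial files, archive_status/) is ignored.
sub wal_segments_in
{
	my ($wal_dir) = @_;

	opendir(my $dh, $wal_dir) or die "could not open $wal_dir: $!";
	my @segments = sort grep { /^[0-9A-F]{24}$/ } readdir($dh);
	closedir($dh);

	return @segments;
}

# Hypothetical helper: true if the directory holds exactly the expected
# set of segments, no more and no less.
sub wal_segments_match
{
	my ($wal_dir, @expected) = @_;

	my @found = wal_segments_in($wal_dir);
	return join(',', @found) eq join(',', sort @expected);
}
```

In a TAP test this could be called on the target node's pg_wal/ path
after pg_rewind has finished, with the result passed to ok().  The
expected segment names depend on where the divergence point falls in
your test, so they would have to be worked out there.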
--
Michael