Re: Making pg_rewind faster

Justin Kwan <justinpkwan@outlook.com>

From: Justin Kwan <justinpkwan@outlook.com>
To: pgsql-hackers <pgsql-hackers@postgresql.org>
Cc: vignesh <vignesh@cloudflare.com>, "jkwan@cloudflare.com" <jkwan@cloudflare.com>, vignesh ravichandran <admin@viggy28.dev>, "hlinnaka@iki.fi" <hlinnaka@iki.fi>
Date: 2022-07-16T03:16:27Z
Lists: pgsql-hackers

Commits

  1. pg_rewind: Skip copy of WAL segments generated before point of divergence

  2. pg_rewind: Extend code detecting relation files to work with WAL files

  3. Split TESTDIR into TESTLOGDIR and TESTDATADIR

Hi everyone!

I've also attached the pg_rewind optimization patch file for Postgres 14.4. The previous patch file targets Postgres 15 Beta 1/2.

Thanks,
Justin
________________________________
From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster

Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev> wrote:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One thing I have noticed is that when pg_rewind doesn't know what to do with a file, it copies it. I understand that this is the safer option. After all, the alternative, pg_basebackup, copies all the files from source to target.

However, this makes pg_rewind inefficient when there is a large number of WAL files. The majority of the data it copies (95%+ in most of my cases) is WAL files, which are identical on the source and target anyway. Skipping the copy of those identical WAL files would speed up pg_rewind considerably.

1. Does pg_rewind need to copy WAL files that precede the WAL file containing the last common checkpoint?
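
To make the question concrete, here is a minimal Python sketch of the check being asked about. It is not pg_rewind's code and not the attached patch. The function names are hypothetical, and it assumes the default 16 MB WAL segment size and the standard 24-hex-digit WAL file naming (timeline, log id, segment id). It only marks a segment as a candidate for skipping when its segment number is lower than that of the segment holding the last common checkpoint.

```python
import re

# Assumption: the default 16 MB WAL segment size (it can be changed at initdb time).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024

# A regular WAL segment file name is 24 hex digits: timeline, log id, segment id.
WAL_NAME_RE = re.compile(r"^[0-9A-F]{24}$")


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' in hex into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_number_of_file(name, seg_size=WAL_SEGMENT_SIZE):
    """Return the segment number encoded in a WAL file name, or None."""
    if not WAL_NAME_RE.match(name):
        return None  # .history, .partial, .backup files and anything else
    segments_per_log_id = 0x100000000 // seg_size
    log_id = int(name[8:16], 16)
    seg_id = int(name[16:24], 16)
    return log_id * segments_per_log_id + seg_id


def is_skip_candidate(name, last_common_checkpoint_lsn,
                      seg_size=WAL_SEGMENT_SIZE):
    """Hypothetical check: True if the segment lies entirely before the
    segment that contains the last common checkpoint."""
    seg_no = segment_number_of_file(name, seg_size)
    if seg_no is None:
        return False  # unknown file type: fall back to copying
    checkpoint_seg_no = parse_lsn(last_common_checkpoint_lsn) // seg_size
    return seg_no < checkpoint_seg_no
```

This sketch ignores the timeline part of the file name and does not confirm that the file actually exists on the target with the same contents. A real implementation would need both checks before it could safely skip a copy.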

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior I mentioned.

Thanks,
Vignesh