Re: Making pg_rewind faster
From: Justin Kwan <justinpkwan@outlook.com>
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>, vignesh <vignesh@cloudflare.com>, "jkwan@cloudflare.com" <jkwan@cloudflare.com>, "hlinnaka@iki.fi" <hlinnaka@iki.fi>
Date: 2022-07-15T22:24:54Z
Lists: pgsql-hackers
Commits
- pg_rewind: Skip copy of WAL segments generated before point of divergence
  5173bfd0443e, 19 (unreleased), landed
- pg_rewind: Extend code detecting relation files to work with WAL files
  6ae08d9583e9, 19 (unreleased), landed
- Split TESTDIR into TESTLOGDIR and TESTDATADIR
  c47885bd8b69, 16.0, cited
Attachments
- v1-0001-Avoid-copying-WAL-segments-before-divergence-to-spee.patch (application/octet-stream) patch v1-0001
Hi everyone!

Attached is a patch that optimizes pg_rewind performance when many WAL files are retained on the server. The patch avoids copying over older WAL segment files that fall before the point of divergence between the source and target servers.

Thanks,
Justin

________________________________
From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster

Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev> wrote:

Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is that if it doesn't know what to do with a file, it copies it. I understand that this is the safer option. After all, the alternative, pg_basebackup, copies all the files from source to target.

However, this makes pg_rewind inefficient when we have a high number of WAL files. The majority of the data that it copies (in most of my cases 95%+) is WAL files that are identical on the source and target anyway. Skipping those identical WAL files would improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL file that contains the last common checkpoint?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior I mentioned.

Thanks,
Vignesh
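
For readers following the thread, here is a rough Python sketch of the skip decision discussed above. It is not the code from the attached patch (which is C). It assumes the default 16 MB WAL segment size and the standard 24-hex-digit WAL file naming (timeline, log, segment). The helper names `parse_wal_filename`, `lsn_to_segno`, and `can_skip_copy` are hypothetical, and the size comparison is only an assumed cheap sanity check. The sketch also ignores timelines, which a real implementation would need to consider.

```python
import re

# Assumption: default segment size; real clusters may use another value.
DEFAULT_WAL_SEGMENT_SIZE = 16 * 1024 * 1024

# WAL segment names are 24 upper-case hex digits: timeline, log, segment.
WAL_NAME_RE = re.compile(r"[0-9A-F]{24}")


def parse_wal_filename(name, wal_segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """Return (timeline, segment_number) for a WAL file name, else None."""
    if not WAL_NAME_RE.fullmatch(name):
        return None  # not a WAL segment (e.g. a .history file)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    segments_per_log = 0x100000000 // wal_segment_size
    return timeline, log * segments_per_log + seg


def lsn_to_segno(lsn, wal_segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """Map a byte position in the WAL (an LSN) to its segment number."""
    return lsn // wal_segment_size


def can_skip_copy(name, source_size, target_size, last_common_checkpoint_lsn,
                  wal_segment_size=DEFAULT_WAL_SEGMENT_SIZE):
    """Hypothetical rule: skip a WAL file that lies entirely before the
    segment holding the last common checkpoint and has the same size on
    both servers. Anything unrecognized is copied, the safe default."""
    parsed = parse_wal_filename(name, wal_segment_size)
    if parsed is None:
        return False
    _, segno = parsed
    checkpoint_segno = lsn_to_segno(last_common_checkpoint_lsn,
                                    wal_segment_size)
    return segno < checkpoint_segno and source_size == target_size
```

With a last common checkpoint at LSN 0/5000028 (segment 5), a file named 000000010000000000000003 of equal size on both sides would be skipped under this rule, while 000000010000000000000005 would still be copied.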