Re: Making pg_rewind faster

From: Japin Li <japinli@hotmail.com>
To: John H <johnhyvr@gmail.com>
Cc: Michael Paquier <michael@paquier.xyz>, Andres Freund <andres@anarazel.de>, Alexander Korotkov <aekorotkov@gmail.com>, Justin Kwan <justinpkwan@outlook.com>, Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers <pgsql-hackers@postgresql.org>, vignesh <vignesh@cloudflare.com>, vignesh ravichandran <admin@viggy28.dev>, "hlinnaka@iki.fi" <hlinnaka@iki.fi>, "jkwan@cloudflare.com" <jkwan@cloudflare.com>
Date: 2025-07-02T02:20:50Z
Lists: pgsql-hackers

Commits

  1. pg_rewind: Skip copy of WAL segments generated before point of divergence

  2. pg_rewind: Extend code detecting relation files to work with WAL files

  3. Split TESTDIR into TESTLOGDIR and TESTDATADIR

On Tue, 01 Jul 2025 at 11:13, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> I've attached an updated version of the patch against master with the changes
> suggested.
>
> On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:
>>>
>>>  There may be something I am missing here, but there is no need to care
>>> about segments with a TLI older than lastcommontliIndex, no?
>
> Hard to say. pg_rewind is intended to make the same "copy" of the cluster which
> implies pg_wal/ should look the same. There might be use cases around logical
> replication where you would want these WAL files to still exist even
> across promotions?
>
>>> decide_wal_file_action() assumes that the WAL segment exists on the
>>> target and the source.  This looks bug-prone to me without at least an
>>> assertion.
>
> From previous refactors there is now an Assertion in filemap.c
> decide_file_action that handles this.
>
>> Assert(entry->target_exists && entry->source_exists);
>
> decide_wal_file_action is called after the assertion.
>

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Since file_type_t already exists, getFileType() is easy to confuse with it.
Would getFileContentType() be a better name for a function that returns a
file_content_type_t?
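
To make the naming point concrete, here is a rough, standalone sketch of what
a content-type classifier could look like. It is not the code from the patch:
the enum values, the helper names and the simplified path rules (no
tablespaces, no fork files, pg_wal/ segments recognized only by their
24-hex-digit names) are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical enum; the names used in the actual patch may differ. */
typedef enum
{
	SKETCH_CONTENT_OTHER,
	SKETCH_CONTENT_RELATION,
	SKETCH_CONTENT_WAL
} sketch_content_type;

/* Return a pointer just past a run of digits, or NULL if there is none. */
static const char *
skip_digits(const char *p)
{
	if (!isdigit((unsigned char) *p))
		return NULL;
	while (isdigit((unsigned char) *p))
		p++;
	return p;
}

/* Simplified check: "<digits>" optionally followed by ".<digits>". */
static bool
looks_like_relfile(const char *name)
{
	const char *p = skip_digits(name);

	if (p == NULL)
		return false;
	if (*p == '\0')
		return true;
	if (*p != '.')
		return false;
	p = skip_digits(p + 1);
	return p != NULL && *p == '\0';
}

/* True if name is exactly 24 uppercase hex digits. */
static bool
looks_like_wal_segment(const char *name)
{
	size_t		i;

	if (strlen(name) != 24)
		return false;
	for (i = 0; i < 24; i++)
	{
		char		c = name[i];

		if (!isdigit((unsigned char) c) && !(c >= 'A' && c <= 'F'))
			return false;
	}
	return true;
}

/* Classify a path relative to the data directory (simplified rules). */
static sketch_content_type
sketch_get_file_content_type(const char *path)
{
	if (strncmp(path, "pg_wal/", 7) == 0 && looks_like_wal_segment(path + 7))
		return SKETCH_CONTENT_WAL;

	if (strncmp(path, "global/", 7) == 0 && looks_like_relfile(path + 7))
		return SKETCH_CONTENT_RELATION;

	if (strncmp(path, "base/", 5) == 0)
	{
		/* Expect base/<dboid>/<relfile> */
		const char *p = skip_digits(path + 5);

		if (p != NULL && *p == '/' && looks_like_relfile(p + 1))
			return SKETCH_CONTENT_RELATION;
	}

	return SKETCH_CONTENT_OTHER;
}
```

With a name like this, the return type and the function name line up, and
file_type_t stays reserved for the regular-file/directory/symlink distinction.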

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.


One more idea, unrelated to the WAL file handling in this patch, that could
also make pg_rewind faster: should we consider ignoring server log files under
PGDATA (e.g., those in the default log/ directory)?
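
As a rough illustration of the log-file idea, a path-prefix check along these
lines might be enough to decide whether an entry lies under the log
directory. The helper name is hypothetical and this is not existing pg_rewind
code. It also assumes the directory is given relative to PGDATA, whereas
log_directory can be set to an absolute path or point outside the data
directory, which a real patch would have to handle.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if "path" is the directory "dir" itself
 * or a file somewhere below it.  Both are relative to the data directory.
 */
static bool
sketch_path_is_under_dir(const char *path, const char *dir)
{
	size_t		len = strlen(dir);

	if (len == 0)
		return false;
	if (strncmp(path, dir, len) != 0)
		return false;

	/* Require a component boundary so that "logical" does not match "log". */
	return path[len] == '\0' || path[len] == '/';
}
```

For example, sketch_path_is_under_dir("log/postgresql-2025-07-01.log", "log")
is true, while a path such as "logical/x" is not matched.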

-- 
Regards,
Japin Li