Re: Changing the state of data checksums in a running cluster

Tomas Vondra <tomas@vondra.me>

From: Tomas Vondra <tomas@vondra.me>
To: Michael Paquier <michael@paquier.xyz>
Cc: Daniel Gustafsson <daniel@yesql.se>, Michael Banck <mbanck@gmx.net>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-03-10T00:18:23Z
Lists: pgsql-hackers

Commits

  1. Use correct datatype for PID

  2. Improve comments in online checksums code

  3. Fix checksum state transition during promotion

  4. Fix regex searching for page verification failures in tests

  5. Apply data-checksum worker throttling parameters

  6. Skip WAL for unlogged main fork during online checksum enable

  7. Revert "Get rid of WALBufMappingLock"

  8. Get rid of WALBufMappingLock

  9. Improve grammar of options for command arrays in TAP tests

On 3/10/25 00:35, Tomas Vondra wrote:
> Seems cfbot was unhappy with the patches, so here's an improved version,
> fixing some minor issues in expected output and a compiler warning.
> 
> There however seems to be some issue with 003_standby_restarts, which
> causes failures on freebsd and macos. I don't know what that is about,
> but the test runs much longer than on debian.
> 

OK, turns out the failures were caused by the test creating a standby
from a backup, without a slot, so sometimes the primary removed the
necessary WAL. Fixed in the attached version.

There's still a failure on Windows, though. I'd bet that's due to the
data_checksum/LocalDatachecksumVersion sync not working correctly on
builds with EXEC_BACKEND, or something like that, but it's too late so
I'll take a closer look tomorrow.


regards

-- 
Tomas Vondra