Re: Changing the state of data checksums in a running cluster

From: Tomas Vondra <tomas@vondra.me>
To: Daniel Gustafsson <daniel@yesql.se>, Bernd Helmle <mailings@oopsware.de>
Cc: Michael Paquier <michael@paquier.xyz>, Michael Banck <mbanck@gmx.net>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-08-20T17:02:57Z
Lists: pgsql-hackers

Commits

  1. Use correct datatype for PID

  2. Improve comments in online checksums code

  3. Fix checksum state transition during promotion

  4. Fix regex searching for page verification failures in tests

  5. Apply data-checksum worker throttling parameters

  6. Skip WAL for unlogged main fork during online checksum enable

  7. Revert "Get rid of WALBufMappingLock"

  8. Get rid of WALBufMappingLock

  9. Improve grammar of options for command arrays in TAP tests

Hi,

I think there's a minor issue in how pg_checksums validates the checksum
state before verifying the data.

The current patch simply does:

  if (ControlFile->data_checksum_version == 0 &&
      mode == PG_MODE_CHECK)
      pg_fatal("data checksums are not enabled in cluster");

and that worked when the version was either 0 or 1. But now it can also
be 2 or 3, for inprogress-on / inprogress-off, and if the cluster gets
shut down at the right moment, that value can end up in the control file.

It doesn't make sense to verify checksums in such a cluster; pg_checksums
should treat those states as "off", i.e. error out.
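
A minimal sketch of the stricter test, not the actual patch. The enum
and the helper name are hypothetical, and the mapping of 2 and 3 to the
two in-progress states is assumed from the order in which they are
listed above. The idea is simply to accept only the fully enabled state
rather than rejecting only 0:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical names for illustration only. The numeric values follow
 * the description above: 0 = off, 1 = on, 2/3 = in-progress states.
 */
enum checksum_state
{
    CHECKSUMS_OFF = 0,
    CHECKSUMS_ON = 1,
    CHECKSUMS_INPROGRESS_ON = 2,    /* assumed value */
    CHECKSUMS_INPROGRESS_OFF = 3    /* assumed value */
};

/*
 * Verification is only meaningful when checksums are fully enabled.
 * Any other value, including the in-progress states, is treated as "off".
 */
static bool
checksums_verifiable(uint32_t data_checksum_version)
{
    return data_checksum_version == CHECKSUMS_ON;
}
```

With a helper like this, the condition in check mode would test
!checksums_verifiable(ControlFile->data_checksum_version) instead of
comparing the version against 0.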


regards

-- 
Tomas Vondra