Re: Changing the state of data checksums in a running cluster

From: Daniel Gustafsson <daniel@yesql.se>
To: Tomas Vondra <tomas.vondra@enterprisedb.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2024-09-30T21:21:30Z
Lists: pgsql-hackers

Commits

  1. Use correct datatype for PID

  2. Improve comments in online checksums code

  3. Fix checksum state transition during promotion

  4. Fix regex searching for page verification failures in tests

  5. Apply data-checksum worker throttling parameters

  6. Skip WAL for unlogged main fork during online checksum enable

  7. Revert "Get rid of WALBufMappingLock"

  8. Get rid of WALBufMappingLock

  9. Improve grammar of options for command arrays in TAP tests

> On 3 Jul 2024, at 13:20, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

> Thanks for rebasing the patch and submitting it again!

Thanks for the review, and sorry for being so slow to pick this up again.

The attached version is a rebase with some cleanup and polish all around;
most importantly, it addresses the two points raised below.

>> * Immediate checkpoints - the code is currently using CHECKPOINT_IMMEDIATE in
>> order to be able to run the tests in a timely manner on it.  This is overly
>> aggressive and dialling it back while still being able to run fast tests is a
>> TODO.  Not sure what the best option is there.
> 
> Why not to add a parameter to pg_enable_data_checksums(), specifying
> whether to do immediate checkpoint or wait for the next one? AFAIK
> that's what we do in pg_backup_start, for example.

That's a good idea.  pg_enable_data_checksums now accepts a third parameter,
"fast" (defaults to false), which enables immediate checkpoints when true.
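
For illustration, a minimal sketch of how a test harness might build the call.
It assumes the third parameter is literally named "fast" in the function
definition, so that named notation works, and that the other parameters have
defaults; the helper name enable_checksums_sql is hypothetical and not part of
the patch.

```python
def enable_checksums_sql(fast: bool = False) -> str:
    """Build a call to pg_enable_data_checksums using named notation.

    Assumption: the parameter is named "fast" and the remaining
    parameters can be left at their defaults.
    """
    value = "true" if fast else "false"
    return f"SELECT pg_enable_data_checksums(fast => {value});"


if __name__ == "__main__":
    # Fast tests would request immediate checkpoints.
    print(enable_checksums_sql(fast=True))
```

Leaving fast at its default keeps the less aggressive behaviour for normal use,
while tests can opt in to immediate checkpoints.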

>> * Monitoring - an insightful off-list reviewer asked how the current progress
>> of the operation is monitored.  So far I've been using pg_stat_activity but I
>> don't disagree that it's not a very sharp tool for this.  Maybe we need a
>> specific function or view or something?  There clearly needs to be a way for a
>> user to query state and progress of a transition.
> 
> Yeah, I think a view like pg_stat_progress_checksums would work.

Added in the attached version.  It probably needs some polish (the docs for
sure do) but it's at least a start.
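
As a rough sketch of how a client might use such a view, the helper below polls
it through any Python DB-API cursor.  The view name is the one suggested
upthread and may differ in the patch; the columns are not assumed (hence
SELECT *), and the helper relies on the assumption that, like the existing
pg_stat_progress_* views, the view returns no rows once the operation has
finished.  poll_progress is a hypothetical name.

```python
import time


def poll_progress(cursor, view="pg_stat_progress_checksums",
                  interval=1.0, max_polls=10):
    """Yield the view's rows on each poll until it is empty.

    Assumptions: `cursor` is a DB-API cursor, the view name matches the
    one suggested in this thread, and an empty result means no checksum
    operation is currently running.
    """
    for _ in range(max_polls):
        cursor.execute(f"SELECT * FROM {view}")
        rows = cursor.fetchall()
        if not rows:
            return  # nothing in progress (or the operation has finished)
        yield rows
        time.sleep(interval)
```

A caller would iterate over poll_progress(cur) and print or log each batch of
rows; max_polls bounds the loop so a stuck operation does not hang the client.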

--
Daniel Gustafsson