Thread

  1. allow spread checkpoints when changing checksums online

    Tomas Vondra <tomas@vondra.me> — 2026-05-04T13:42:17Z

    Hi,
    
    Here's a small patch re-introducing the option to use spread checkpoints
    (instead of always using CHECKPOINT_FAST) for online checksum changes.
    
    The version v20251201 posted in [1] supported this, but the next patch
    version dropped the checkpoints, and so removed the "fast" parameter too.
    Then we realized the checkpoints are actually needed, and they were added
    back, but without the parameter, perhaps to keep it as simple as possible.
    Or maybe it was an omission, not sure, and there's no explanation on the
    thread.
    
    I recall someone claiming always doing fast checkpoints is fine, because
    we've already written the whole database into WAL anyway, and on large
    databases that's likely way more expensive than a single checkpoint. I
    don't buy that, for two reasons:
    
    - We do have throttling for the rewrite phase, thanks to the cost_limit
    and cost_delay parameters. So we can effectively throttle it, to reduce
    the impact of the checksum change. In which case the "fast" checkpoint
    can be way more disruptive than the rewrite itself.
    
    - We need to do checkpoints even when "disabling" checksums, in which
    case we don't rewrite any data pages (or WAL-log anything), we just need
    to persist the new checksum state. Which just makes the fast checkpoint
    relatively more disruptive.
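
    To put rough numbers on the first point, here is a back-of-the-envelope
    sketch. It assumes the rewrite throttling behaves like vacuum-style
    cost-based delay (sleep for cost_delay milliseconds each time the
    accumulated cost reaches cost_limit). The per-page cost, the database
    size and the helper name are made-up illustration values, not taken from
    the patch.

```python
def min_throttled_seconds(pages, cost_per_page, cost_limit, cost_delay_ms):
    """Lower bound on the time spent sleeping during a throttled rewrite.

    Assumes vacuum-style cost accounting: every time the accumulated
    cost reaches cost_limit, the worker sleeps for cost_delay_ms.
    """
    if cost_delay_ms <= 0:
        return 0.0  # throttling disabled, no deliberate sleeping
    total_cost = pages * cost_per_page
    sleeps = total_cost // cost_limit
    return sleeps * cost_delay_ms / 1000.0


# Hypothetical 100 GB database with 8 kB pages.
pages = 100 * 1024 * 1024 // 8
sleep_s = min_throttled_seconds(pages, cost_per_page=20,
                                cost_limit=200, cost_delay_ms=2)
print(f"at least {sleep_s / 60:.0f} minutes of deliberate sleeping")
```

    If the rewrite is deliberately stretched out like this, an unthrottled
    checkpoint at the end is the one step that still runs at full speed.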
    
    The attached patch is mostly extracted from v20251201, and adds the
    "fast" parameter back to pg_{enable,disable}_data_checksums.
    
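    For illustration, a minimal sketch of what calls with the restored
    parameter might look like, built from a small hypothetical Python helper.
    The argument names (cost_delay, cost_limit, fast) and the use of named
    notation are assumptions based on the description above; the real
    signatures are whatever the patch defines.

```python
def checksum_call(enable, fast=True, cost_delay=None, cost_limit=None):
    """Build the SQL text for an online checksum change.

    Argument names are assumed to match the SQL-level parameter names;
    check the patch for the actual function signatures.
    """
    func = "pg_enable_data_checksums" if enable else "pg_disable_data_checksums"
    args = []
    if enable:
        # Throttling only matters when data pages are actually rewritten.
        if cost_delay is not None:
            args.append(f"cost_delay => {int(cost_delay)}")
        if cost_limit is not None:
            args.append(f"cost_limit => {int(cost_limit)}")
    args.append(f"fast => {'true' if fast else 'false'}")
    return f"SELECT {func}({', '.join(args)});"


# Throttled enable followed by a spread checkpoint.
print(checksum_call(True, fast=False, cost_delay=2, cost_limit=200))
# Disable with a spread checkpoint; nothing is rewritten, so no throttling.
print(checksum_call(False, fast=False))
```
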
    I have two open questions regarding it:
    
    1) What should be the default? I've used fast=true, mostly because
    that's what PG19 is going to do (fast checkpoints by default). It's also
    somewhat consistent with e.g. VACUUM which does no throttling by
    default. But I assume most production uses would want fast=false?
    
    2) I haven't adjusted the TAP tests. We could use fast=false in a couple
    of the test_checksums tests, but I'm not sure it's worth it, as it makes
    the tests way more time consuming.
    
    We could reduce checkpoint_timeout to something very aggressive. And in
    fact that's what I did locally with the TAP tests I posted in [2]. But
    I'm still not convinced it's worth it - the checkpoints are still
    synchronous, of course.
    
    
    regards
    
    [1]
    https://www.postgresql.org/message-id/477897AE-1314-4724-9694-0BABC4F4ABDA%40yesql.se
    
    [2]
    https://www.postgresql.org/message-id/9e1331e1-93a0-4e27-934a-17b89342be4d%40vondra.me
    
    -- 
    Tomas Vondra