Thread

  1. Re: Checkpointer write combining

    Soumya S Murali <soumyamurali.work@gmail.com> — 2025-12-17T05:53:08Z

    Hi all,
    
    On Tue, Dec 16, 2025 at 3:18 AM Melanie Plageman
    <melanieplageman@gmail.com> wrote:
    >
    > On Mon, Dec 15, 2025 at 4:36 AM Soumya S Murali
    > <soumyamurali.work@gmail.com> wrote:
    > >
    > > With reference to the last patches (v11) I received [1] and while reviewing Melanie’s latest feedback, I understood that PageSetBatchChecksumInplace() is currently WIP and depends on upcoming changes to hint-bit locking. It will be contrary to the flow if I propose new functional changes to checksum batching at this time. So for now I will focus on preparatory or documentation improvements until I get the updates on dependencies.
    > > Regarding my patch attached, the patch introduces write-combining during checkpoints by batching contiguous buffers and allowing them to be written using vectorized I/O. My patch includes write-combining for checkpoint buffer flushes, contiguous buffer batching, Preserved WAL ordering, locking, and buffer state invariants. The change is currently limited to the checkpointer path (BufferSync()). So far I tested my implementation and found that all the regression (233 tests) and isolation tests (121 tests) got passed, the manual pgbench validation completed successfully and also verified pg_stat_bgwriter counters before and after checkpoints. So far the implementation is stable in my system.
    >
    > Can you explain how your implementation differs from what was posted
    > in v11 0006 [1]? That implements checkpointer write combining. I'm
    > open to ideas for improving the code, but I don't understand how your
    > patch is supposed to fit into the ongoing work on this thread.
    >
    > - Melanie
    >
    > [1] https://www.postgresql.org/message-id/CAAKRu_ZiEpE_EHww3S3-E3iznybdnX8mXSO7Wsuru7%3DP9Y%3DczQ%40mail.gmail.com
    
    
    Thank you for the question.
    My patch is not intended to replace or redesign v11-0006. I am fully
    aligned with that patch and treated it as the baseline for my work.
    The work I sent is intentionally incremental and does not introduce
    new batching logic.
    v11-0006 already implements the core checkpointer write-combining
    logic (batch formation, contiguity checks, WAL ordering, pin limits,
    and IO issuance). I did not change that structure.
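
    For readers following along, the contiguity check mentioned above can
    be pictured roughly as follows. This is a standalone sketch, not code
    from v11-0006: contiguous_run_length() and its parameters are
    hypothetical names, and it ignores the relation/fork matching, WAL
    ordering, and pin limits that the real patch has to handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's BlockNumber (a 32-bit unsigned block index). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: given block numbers sorted in ascending order,
 * return how many entries starting at 'start' form a run of consecutive
 * blocks, capped at 'max_batch'. Such a run could be handed to a single
 * vectorized write.
 */
static size_t
contiguous_run_length(const BlockNumber *blocks, size_t nblocks,
                      size_t start, size_t max_batch)
{
    size_t n;

    if (start >= nblocks || max_batch == 0)
        return 0;

    n = 1;
    /* Extend the run while the next block directly follows the previous one. */
    while (start + n < nblocks &&
           n < max_batch &&
           blocks[start + n] == blocks[start + n - 1] + 1)
        n++;

    return n;
}
```

    A caller would presumably advance 'start' by the returned length and
    repeat until all blocks are consumed.
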
    My changes focus on two correctness points. The first is in the
    existing CleanVictimBuffer(): ensuring content locks are always
    released on early exit paths and making the shared exclusive lock
    transitions explicit. This directly addresses the lock-handling
    issue you pointed out earlier in the thread. The second is
    clarifying the semantics of BufferNeedsWALFlush() so that an LSN is
    returned only when the buffer is logged (BM_PERMANENT); otherwise
    the LSN is explicitly set to InvalidXLogRecPtr. This matches the
    direction you mentioned about avoiding confusing or unsafe LSN
    propagation.
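
    To make the intended BufferNeedsWALFlush() semantics concrete, here
    is a minimal standalone sketch. It is not the patch code: MockBuffer,
    mock_buffer_needs_wal_flush(), the flushed_upto parameter, and the
    BM_PERMANENT bit value are all stand-ins chosen for illustration, and
    the comparison against an already-flushed position is an assumption
    about how a caller would use the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define BM_PERMANENT (1U << 0)      /* hypothetical bit position */

/* Hypothetical, simplified buffer descriptor. */
typedef struct MockBuffer
{
    uint32_t    flags;              /* state flags, e.g. BM_PERMANENT */
    XLogRecPtr  page_lsn;           /* LSN stamped on the page */
} MockBuffer;

/*
 * Sets *lsn to the page LSN only for logged (BM_PERMANENT) buffers;
 * for anything else *lsn is explicitly InvalidXLogRecPtr, so callers
 * never see a stale or meaningless LSN. Returns true when WAL must be
 * flushed up to *lsn before the buffer can be written.
 */
static bool
mock_buffer_needs_wal_flush(const MockBuffer *buf, XLogRecPtr flushed_upto,
                            XLogRecPtr *lsn)
{
    assert(buf != NULL && lsn != NULL);

    *lsn = InvalidXLogRecPtr;

    /* Unlogged or temporary buffer: no WAL dependency, no LSN returned. */
    if ((buf->flags & BM_PERMANENT) == 0)
        return false;

    *lsn = buf->page_lsn;
    return buf->page_lsn > flushed_upto;
}
```

    The point of resetting *lsn up front is that every exit path leaves
    the output parameter in a defined state.
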
    I intentionally did not modify PageSetBatchChecksumInplace(), since it
    is clearly marked WIP and depends on the hint-bit locking work, as you
    mentioned.
    I validated the v11-0006 design on a fresh tree, made a few small
    correctness cleanups that do not alter behavior, and ran make check,
    the isolation tests, and a manual checkpoint validation to confirm.
    I hope you find this useful.
    Thank you for the guidance and patience. Looking forward to more feedback.
    
    
    Regards,
    Soumya