Thread

  1. Re: [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer

    Soumya S Murali <soumyamurali.work@gmail.com> — 2025-12-04T05:13:23Z

    Hi all,
    
    Thank you for the review and kind feedback.
    
    On Mon, Dec 1, 2025 at 1:45 PM Michael Banck <mbanck@gmx.net> wrote:
    >
    > Hi,
    >
    > On Mon, Dec 01, 2025 at 11:05:19AM +0530, Soumya S Murali wrote:
    > > > On Fri, Nov 28, 2025 at 10:23:54AM +0530, Soumya S Murali wrote:
    > > > I am still not convinced of the usefulness of those changes to
    > > > pg_stat_checkpointer, but some feedback on the patch:
    > >
    > > According to my understanding, The monitoring systems can already poll
    > > pg_stat_checkpointer at a reasonable frequency but with the checkpoint
    > > duration values exposed, I think it will be easier to compute - the
    > > checkpoint deltas, fluctuations in duration, notice unusualities and
    > > the timing instabilities in WAL-driven checkpoints etc. These may seem
    > > simple but are useful signals that many existing monitoring dashboards
    > > lack today.
    >
    > How would such a computation look like? Maybe if you give an example, it
    > would be easier to understand how this would make things better/more
    > robust.
    
    Consider a monitoring agent that polls pg_stat_checkpointer every 30
    seconds. It reads the cumulative write_time and sync_time, the
    checkpoint counters, and (with my proposal) the last checkpoint's
    duration and timestamp. Even if multiple checkpoints happen between
    two samples, the last duration and last timestamp let the monitoring
    system spot a sudden slow checkpoint. For example, if
    last_checkpoint_duration jumps from approximately 300 ms to 5000 ms,
    the monitoring system can alert immediately. This is hard to
    determine purely from the cumulative write_time/sync_time without
    complex delta calculations. Also, if the timestamps show checkpoints
    happening much closer together than expected, the tool can flag an
    "unusually high checkpoint frequency", which may indicate a
    WAL-heavy workload, checkpoint_completion_target not being met, or a
    saturated I/O layer. This type of detection becomes easier when the
    last checkpoint's end time is visible directly.
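    
    To make the comparison concrete, here is a minimal sketch of what
    such an agent could do with two consecutive samples. It is only an
    illustration: the Sample fields for the last duration and the last
    end time stand in for the proposed columns (the timestamp name is
    hypothetical), and the thresholds slow_factor and min_interval are
    arbitrary assumptions, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Sample:
    """One poll of pg_stat_checkpointer; all times are in milliseconds."""
    num_checkpoints: int           # num_timed + num_requested
    write_time_ms: float           # cumulative write_time
    sync_time_ms: float            # cumulative sync_time
    last_duration_ms: float        # proposed column (field name assumed)
    last_checkpoint_end: datetime  # proposed column (hypothetical name)


def check(prev: Sample, cur: Sample,
          slow_factor: float = 5.0,
          min_interval: timedelta = timedelta(minutes=1)) -> List[str]:
    """Compare two consecutive samples and return alert messages."""
    completed = cur.num_checkpoints - prev.num_checkpoints
    if completed <= 0:
        return []  # no checkpoint finished between polls (or stats reset)

    alerts = []

    # Cumulative-only view: a slow checkpoint is averaged together with
    # the fast ones in the same window, so a single outlier is diluted.
    avg_ms = ((cur.write_time_ms - prev.write_time_ms)
              + (cur.sync_time_ms - prev.sync_time_ms)) / completed

    # Direct view: compare the last duration with the previous sample.
    if (prev.last_duration_ms > 0
            and cur.last_duration_ms >= slow_factor * prev.last_duration_ms):
        alerts.append(
            f"slow checkpoint: last={cur.last_duration_ms:.0f} ms, "
            f"window average={avg_ms:.0f} ms")

    # Mean spacing of the checkpoints completed since the previous poll.
    spacing = (cur.last_checkpoint_end - prev.last_checkpoint_end) / completed
    if spacing < min_interval:
        alerts.append(
            f"unusually high checkpoint frequency: {completed} checkpoints, "
            f"mean spacing {spacing.total_seconds():.1f} s")

    return alerts


# Example: two checkpoints finished within 25 seconds, the last one slow.
t0 = datetime(2025, 12, 4, 5, 0, 0)
prev = Sample(10, 3000.0, 100.0, 300.0, t0)
cur = Sample(12, 8200.0, 200.0, 5000.0, t0 + timedelta(seconds=25))
for message in check(prev, cur):
    print(message)
```

    In this example the window average is well below the 5000 ms of the
    last checkpoint, which is the dilution effect I am referring to.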
    
    > I mentioned up-thread that one problem would be multiple checkpoints
    > having happened between two monitoring runs, where the monitoring system
    > sees the duration of the last checkpoint, but maybe more than one
    > happened. Should they keep track of the number of overall checkpoints
    > and adjust in that case?
    >
    > To be more general: we don't store the last duration anywhere else (as
    > far as I can see, happy to be prove wrong), why is this essential for
    > checkpoint duration, and not other things? Or to put it another way: why
    > does the patch change it for checkpoint but not all the other places?
    >
    >
    > Michael
    
    You are right that the last duration is not stored anywhere else so
    far, and most pg_stat views expose only cumulative counters. The
    reason this patch focuses specifically on checkpoints is that
    checkpoint timing is one of the few metrics where a single
    observation can directly indicate instability. A single unusually
    long checkpoint often points to conditions such as backend stalls,
    WAL flush bottlenecks, extended buffer recycling, bgwriter
    slowdowns, or checkpointer I/O instability. So storing the last
    checkpoint duration is a small extension, but it offers a direct
    signal that many monitoring dashboards currently lack.
    
    I hope this explanation makes the motivation for the patch clearer.
    I look forward to more feedback.
    
    Regards,
    Soumya