Thread

Re: [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer

Soumya S Murali <soumyamurali.work@gmail.com> — 2025-12-04T05:13:23Z
Hi all,

Thank you for the review and kind feedback.

On Mon, Dec 1, 2025 at 1:45 PM Michael Banck <mbanck@gmx.net> wrote:
>
> Hi,
>
> On Mon, Dec 01, 2025 at 11:05:19AM +0530, Soumya S Murali wrote:
> > > On Fri, Nov 28, 2025 at 10:23:54AM +0530, Soumya S Murali wrote:
> > > I am still not convinced of the usefulness of those changes to
> > > pg_stat_checkpointer, but some feedback on the patch:
> >
> > According to my understanding, The monitoring systems can already poll
> > pg_stat_checkpointer at a reasonable frequency but with the checkpoint
> > duration values exposed, I think it will be easier to compute - the
> > checkpoint deltas, fluctuations in duration, notice unusualities and
> > the timing instabilities in WAL-driven checkpoints etc. These may seem
> > simple but are useful signals that many existing monitoring dashboards
> > lack today.
>
> How would such a computation look like? Maybe if you give an example, it
> would be easier to understand how this would make things better/more
> robust.

Consider a monitoring agent that polls pg_stat_checkpointer every 30
seconds. It reads the cumulative write_time and sync_time, the
counters, and (with my proposal) the last checkpoint's duration and
timestamp. Even if multiple checkpoints happen between two samples,
the last duration and last timestamp let the monitoring system spot a
sudden slow checkpoint. For example, if last_checkpoint_duration
jumps from approximately 300 ms to 5000 ms, the monitoring system can
alert immediately, even if multiple checkpoints happened in between.
This is hard to determine purely from the cumulative
write_time/sync_time without complex delta calculations. Also, if the
timestamp shows checkpoints happening much closer together than
expected, the tool can flag it as "unusually high checkpoint
frequency", which may indicate an aggressively WAL-producing workload,
checkpoint_completion_target not being met, or the I/O layer becoming
saturated. This type of detection becomes easier when the last
checkpoint's end time is visible directly.
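
A minimal sketch of the comparison described above, in Python, might
look like the following. It is only an illustration, not part of the
patch: the Sample fields, the thresholds, and the alert strings are
hypothetical, and it assumes the agent has already read a cumulative
count of completed checkpoints plus the proposed last-duration and
last-timestamp values from pg_stat_checkpointer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Sample:
    """One poll of pg_stat_checkpointer (field names are hypothetical)."""
    checkpoints_done: int            # cumulative completed checkpoints
    last_duration_ms: float          # proposed: duration of the last checkpoint
    last_checkpoint_time: datetime   # proposed: end time of the last checkpoint


def check(prev: Sample, cur: Sample,
          slow_factor: float = 5.0,
          min_interval: timedelta = timedelta(seconds=60)) -> List[str]:
    """Compare two consecutive samples and return a list of alert strings."""
    alerts: List[str] = []

    # How many checkpoints completed between the two polls.
    done = cur.checkpoints_done - prev.checkpoints_done
    if done <= 0:
        return alerts  # nothing new (or stats were reset): nothing to compare

    # Sudden jump in the most recent checkpoint's duration.
    if prev.last_duration_ms > 0 and \
            cur.last_duration_ms >= slow_factor * prev.last_duration_ms:
        alerts.append("slow checkpoint")

    # Average spacing of the checkpoints completed since the previous poll.
    gap = cur.last_checkpoint_time - prev.last_checkpoint_time
    if gap / done < min_interval:
        alerts.append("high checkpoint frequency")

    return alerts


if __name__ == "__main__":
    t0 = datetime(2025, 12, 4, 5, 0, 0)
    before = Sample(10, 300.0, t0)
    after = Sample(12, 5000.0, t0 + timedelta(seconds=30))
    print(check(before, after))
```

Dividing the timestamp gap by the counter delta is one way to account
for several checkpoints having completed between two polls; the
duration check still only sees the most recent checkpoint.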

> I mentioned up-thread that one problem would be multiple checkpoints
> having happened between two monitoring runs, where the monitoring system
> sees the duration of the last checkpoint, but maybe more than one
> happened. Should they keep track of the number of overall checkpoints
> and adjust in that case?
>
> To be more general: we don't store the last duration anywhere else (as
> far as I can see, happy to be prove wrong), why is this essential for
> checkpoint duration, and not other things? Or to put it another way: why
> does the patch change it for checkpoint but not all the other places?
>
>
> Michael

You are right that the last duration is not stored anywhere else so
far, and most pg_stat views expose only cumulative counters. The
reason this patch focuses specifically on checkpoints is that
checkpoint timing is one of the few metrics where a single reading of
an event can directly indicate instability and other irregularities.
A single unusually long checkpoint often points to conditions such as
backend stalls, WAL flush bottlenecks, extended buffer recycling,
slowdowns in bgwriter, or checkpointer I/O instabilities. So storing
the last checkpoint duration is a small extension, but it offers a
direct signal that many monitoring dashboards currently lack.

I hope this explanation makes the motivation for the patch clearer.
Looking forward to more feedback.

Regards,
Soumya