Thread

  1. Re: [PATCH] Expose checkpoint timestamp and duration in pg_stat_checkpointer

    Soumya S Murali <soumyamurali.work@gmail.com> — 2025-12-01T05:35:19Z

    Hi all,
    
    > On Fri, Nov 28, 2025 at 10:23:54AM +0530, Soumya S Murali wrote:
    > > I have updated the code based on the feedback received to my earlier
    > > mails and prepared a patch for further review.
    >
    > I think the logging change and the pg_stat_checkpointer changes are
    > different enough that they should be separate patches. If not just
    > because the logging change seems to not have had any non-positive
    > feedback.
    
    Thank you for the review and for the clarification. I understand the
    point about separating the logging change and the pg_stat_checkpointer
    additions. As per the suggestion, I will make sure to split them into
    two independent patches before sending the updated one.
    
    > > In this patch, I have renamed the checkpoint_total_time to
    > > last_checkpoint_duration, stats_reset has been kept as the last column
    > > following the usual pattern, last_checkpoint_duration and
    > > last_checkpoint_time will now be overwritten per checkpoint and also
    > > have removed unnecessary lines as per the usual format. I had
    > > successfully verified the checkpointer duration with different write
    > > loads and I am  attaching the observations for further reference.
    >
    > I am still not convinced of the usefulness of those changes to
    > pg_stat_checkpointer, but some feedback on the patch:
    
    According to my understanding, The monitoring systems can already poll
    pg_stat_checkpointer at a reasonable frequency but with the checkpoint
    duration values exposed, I think it will be easier to compute - the
    checkpoint deltas, fluctuations in duration, notice unusualities and
    the timing instabilities in WAL-driven checkpoints etc. These may seem
    simple but are useful signals that many existing monitoring dashboards
    lack today.
    
    > diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
    > index 9217508917..4a45f4f708 100644
    > --- a/src/backend/access/transam/xlog.c
    > +++ b/src/backend/access/transam/xlog.c
    > @@ -6778,7 +6778,7 @@ LogCheckpointEnd(bool restartpoint, int flags)
    >
    >
    >         /* Store in PendingCheckpointerStats */
    > -       PendingCheckpointerStats.checkpoint_total_time += (double) total_msecs;
    > +       PendingCheckpointerStats.last_checkpoint_duration = (double) total_msecs;
    >         PendingCheckpointerStats.last_checkpoint_time = CheckpointStats.ckpt_end_t;
    >
    > [...]
    >
    > diff --git a/src/include/pgstat.h b/src/include/pgstat.h
    > index a8eb1f8add..73688041c8 100644
    > --- a/src/include/pgstat.h
    > +++ b/src/include/pgstat.h
    > @@ -263,7 +263,7 @@ typedef struct PgStat_CheckpointerStats
    >         PgStat_Counter sync_time;
    >         PgStat_Counter buffers_written;
    >         PgStat_Counter slru_written;
    > -       PgStat_Counter checkpoint_total_time;  /* new: total ms of last checkpoint */
    > +       PgStat_Counter last_checkpoint_duration;  /* new: total ms of last checkpoint */
    >         TimestampTz    last_checkpoint_time;   /* new: end time of last checkpoint */
    >         TimestampTz    stat_reset_timestamp;
    >  } PgStat_CheckpointerStats;
    >
    > This looks like an incremental patch based on your original one? It is
    > customary to send the full, updated, patch again.
    >
    >
    > Michael
    
    Ok noted. I will resend a full updated patch set soon and will make
    sure every updation goes as per the intended flow.
    Thank you for the guidance. Looking forward to more feedback.
    
    Regards,
    Soumya