Thread

  1. Re: Comments on Custom RMGRs

    Andrei Lepikhov <lepihov@gmail.com> — 2025-10-14T09:11:53Z

    On 27/5/2024 20:20, Michael Paquier wrote:
    > Please note that I've been studying ways to have pg_stat_statements
    > being plugged in directly with the shared pgstat APIs to get it backed
    > by a dshash to give more flexibility and scaling, giving a way for
    > extensions to register their own stats kind.  In this case, the flush
    > of the stats would be controlled with a callback in the stats
    > registered by the extensions, conflicting with what's proposed here.
    > pg_stat_statements is all about stats, at the end.  I don't want this
    > argument to act as a barrier if a checkpoint hook is an accepted
    > consensus here,  but a checkpoint hook used for this code path is not
    > the most intuitive solution I can think of in the long-term.

    Let me continue this thread.
    I am still waiting for some kind of checkpoint hook mechanism for
    extensions.
    
    Typically, when collecting knowledge about the instance state, we store
    it in an extension-owned database table, incurring the costs
    associated with transactional mechanics, tuple format overhead, and so
    on. Usually, we don't need MVCC or rollback; we have fixed-length data,
    and it would be better to store it in hash tables. These hash tables
    should survive instance restarts and crashes - that's the only feature
    needed.
    
    pg_stat_statements dumps its data to a file, but that is not reliable
    enough when we need consistent information, such as replication status
    or a log of update conflicts (see the Spock extension [1]). When
    tracking query executions, we can't dump the hash table on each
    ExecutorEnd because of the overhead, but we are okay with adding one
    more WAL record containing the hash table entry data - it may be written
    by the backend or by a separate background worker.
    
    So, our primary need is a well-defined point at which to store the
    extension's state on disk, given that we have a registered RMGR that
    allows us to restore the full state from this disk file plus WAL
    records.
    
    For me, the ideal place for such a hook is CheckPointGuts, right between 
    the CheckPointBuffers call and fsyncs. I think that to demonstrate how 
    this hook can work, the pg_stat_statements storage may need to be 
    redesigned slightly.
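
To make the proposed placement concrete, here is a minimal self-contained C model of such a hook. It is a sketch under loud assumptions: PostgreSQL has no checkpoint_hook today, and checkpoint_hook_type, model_checkpoint_guts and the other model_* and my_ext_* names are hypothetical stand-ins, not real server functions. Only the pattern is borrowed from existing PostgreSQL hooks: a global function pointer that an extension saves and chains when it loads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type and global pointer (not present in PostgreSQL). */
typedef void (*checkpoint_hook_type)(int flags);
static checkpoint_hook_type checkpoint_hook = NULL;

/* A small trace so the order of the steps can be observed. */
enum Step { STEP_BUFFERS = 1, STEP_EXT_HOOK, STEP_FSYNC };
#define MAXTRACE 8
static enum Step trace[MAXTRACE];
static int ntrace = 0;

static void trace_step(enum Step s)
{
    assert(ntrace < MAXTRACE);
    trace[ntrace++] = s;
}

/* Stand-in for the step that writes out dirty buffers. */
static void model_checkpoint_buffers(int flags)
{
    (void) flags;
    trace_step(STEP_BUFFERS);
}

/* Stand-in for the step that performs the queued fsyncs. */
static void model_process_fsyncs(void)
{
    trace_step(STEP_FSYNC);
}

/* Model of the checkpoint body with the hook at the proposed position. */
static void model_checkpoint_guts(int flags)
{
    model_checkpoint_buffers(flags);
    if (checkpoint_hook != NULL)
        checkpoint_hook(flags);     /* the extension writes its state here */
    model_process_fsyncs();
}

/* Extension side: remember the previous hook and call it first. */
static checkpoint_hook_type prev_checkpoint_hook = NULL;

static void my_ext_checkpoint(int flags)
{
    if (prev_checkpoint_hook != NULL)
        prev_checkpoint_hook(flags);
    trace_step(STEP_EXT_HOOK);      /* stands in for dumping the hash table */
}

/* Stand-in for what an extension would do at load time. */
static void my_ext_init(void)
{
    prev_checkpoint_hook = checkpoint_hook;
    checkpoint_hook = my_ext_checkpoint;
}
```

With no extension loaded, the model runs only the two built-in steps; after my_ext_init() the extension step runs between them, and chaining through prev_checkpoint_hook lets several extensions share the same hook.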
    
    [1] https://github.com/pgEdge/spock
    
    -- 
    regards, Andrei Lepikhov,
    pgEdge