Thread

  1. Re: query_id: jumble names of temp tables for better pg_stat_statement UX

    Lukas Fittl <lukas@fittl.com> — 2025-07-17T05:06:04Z

    On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin <cyberdemn@gmail.com>
    wrote:
    
    > However, we regularly hear from many different customers that they *don't
    > control queries* sent by application or *can't modify these queries*.
    > Such kinds of workloads are also not that uncommon and this change makes
    > it impossible to monitor them.
    >
    
    For the workloads you are thinking of, are these "one customer per schema"
    multi-tenant workloads, or something else?
    
    I mentioned this earlier in the discussion (when supporting the change that
    was done), but the main challenge I've seen is that for "one customer per
    schema" workloads, pg_stat_statements just doesn't work today, unless you
    have only a handful of customers on a server.
    
    Once you have anything close to 100 or more customer schemas on a server,
    the churn on the entries makes pg_stat_statements unusable (even with a
    high max), especially with the current way the query text file works, since
    you can't reliably read from pg_stat_statements anymore without incurring a
    read from a 100MB+ query text file.
    
    So I agree this change reduces the visibility into which of the schemas had
    a slow query, but it at least allows reliably using pg_stat_statements to
    narrow down which query / part of an application is problematic. To get
    specifics on the schema, one could then use other means (e.g.
    log_min_duration_statement, auto_explain, etc) to get exact details,
    grepping the logfile for the query ID retrieved from pg_stat_statements.
    
    Thanks,
    Lukas
    
    -- 
    Lukas Fittl