Thread

Re: query_id: jumble names of temp tables for better pg_stat_statement UX

Lukas Fittl <lukas@fittl.com> — 2025-07-17T05:06:04Z
On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin <cyberdemn@gmail.com>
wrote:

> However, we regularly hear from many different customers that they *don't
> control queries* sent by application or *can't modify these queries*.
> Such kinds of workloads are also not that uncommon and this change makes
> it impossible to monitor them.
>

For the workloads you are thinking of, are these "one customer per schema"
multi-tenant workloads, or something else?

I mentioned this earlier in the discussion (when supporting the change that
was done), but the main challenge I've seen is that for "one customer per
schema" workloads, pg_stat_statements just doesn't work today, unless you
have only a handful of customers on a server.

Once you have anything close to 100 or more customer schemas on a server,
the churn on the entries makes pg_stat_statements unusable (even with a
high max), especially with the current way the query text file works, since
you can't reliably read from pg_stat_statements anymore without incurring a
read from a 100MB+ query text file.

So I agree this change reduces the visibility into which of the schemas had
a slow query, but it at least allows reliably using pg_stat_statements to
narrow down which query / part of an application is problematic. To get
specifics on the schema, one could then use other means (e.g.
log_min_duration_statement, auto_explain, etc) to get exact details,
grepping the logfile for the query ID retrieved from pg_stat_statements.

Thanks,
Lukas

-- 
Lukas Fittl