Re: RFC: Allow EXPLAIN to Output Page Fault Information

From: Lukas Fittl <lukas@fittl.com>
To: Atsushi Torikoshi <torikoshia.tech@gmail.com>
Cc: vellaipandiyan sm <vellaipandiyan.sm@gmail.com>, torikoshia <torikoshia@oss.nttdata.com>, Pgsql Hackers <pgsql-hackers@postgresql.org>, Andres Freund <andres@anarazel.de>, Jeremy Schneider <schneider@ardentperf.com>
Date: 2026-05-25T18:43:43Z
Lists: pgsql-hackers
On Fri, May 22, 2026 at 9:07 AM Atsushi Torikoshi
<torikoshia.tech@gmail.com> wrote:
>
> Thanks for the review!
>
> On Thu, May 21, 2026 at 2:38 PM vellaipandiyan sm
> <vellaipandiyan.sm@gmail.com> wrote:
> >
> > Hello hackers,
> >
> > I reviewed the EXPLAIN storage I/O patch and the overall direction seems useful, especially for distinguishing shared-buffer hits from actual storage reads during query analysis.
> >
> > One concern that stood out to me from the later discussion is the interaction with asynchronous I/O and worker-based I/O accounting.
> >
> > Since the patch currently relies on per-process getrusage() statistics, it seems possible that the reported values could become partial or misleading once I/O is performed outside the backend process context. In particular, worker-based AIO could undercount storage reads/writes while still returning non-zero values, which may make the output appear more accurate than it actually is.
>
> Yeah, to avoid reporting the misleadingly underestimated values, no
> output is shown when worker-based AIO is used, as described in the
> docs:

I think having something like this patch proposes would be extremely
valuable, but:

Do we even have a path forward here if this simply won't work with I/O workers?

This was discussed before on this thread, but if anything, it has
become clearer to me that I/O workers are going to be used by the
majority of Postgres 19+ installations.

For my part, I've seen managed providers that only offer I/O workers
(e.g. AWS RDS/Aurora), as well as challenges in container environments
where io_uring is not enabled.

Maybe we should try to figure out what would be needed to do better
I/O tracking on the Linux side in a way that is compatible with I/O
workers?

e.g. I assume getrusage() is too expensive to call for each individual
I/O that the workers process (so it's not just a communication
problem), but it would be good to benchmark.

Thanks,
Lukas

-- 
Lukas Fittl