Re: index prefetching
Peter Geoghegan <pg@bowt.ie>
Commits
- aio: io_uring: Trigger async processing for large IOs
  a9ee66881744, 19 (unreleased), landed
- read stream: Split decision about look ahead for AIO and combining
  8ca147d582a5, 19 (unreleased), landed
- read_stream: Only increase read-ahead distance when waiting for IO
  f63ca3379025, 19 (unreleased), landed
- read_stream: Prevent distance from decaying too quickly
  6e36930f9aaf, 19 (unreleased), landed
- Reduce ExecSeqScan* code size using pg_assume()
  b227b0bb4e03, 19 (unreleased), cited
- Fix rare bug in read_stream.c's split IO handling.
  b421223172a2, 19 (unreleased), cited
- Fix multiranges to behave more like dependent types.
  3e8235ba4f9c, 17.0, cited
- Add EXPLAIN (MEMORY) to report planner memory consumption
  5de890e3610d, 17.0, cited
- Optimize nbtree backward scan boundary cases.
  c9c0589fda0e, 17.0, cited
- Increment xactCompletionCount during subtransaction abort.
  90c885cdab8b, 14.0, cited
- Add nbtree Valgrind buffer lock checks.
  4a70f829d86c, 14.0, cited
- Add nbtree high key "continuescan" optimization.
  29b64d1de7c7, 12.0, cited
- Reduce pinning and buffer content locking for btree scans.
  2ed5b87f96d4, 9.5.0, cited
- Teach btree to handle ScalarArrayOpExpr quals natively.
  9e8da0f75731, 9.2.0, cited

On Fri, Nov 21, 2025 at 6:31 PM Andres Freund <andres@anarazel.de> wrote:
> On 2025-11-21 18:14:56 -0500, Peter Geoghegan wrote:
> > On Fri, Nov 21, 2025 at 5:38 PM Andres Freund <andres@anarazel.de> wrote:
> > > Another benefit is that it helps even more when there are multiple queries running
> > > concurrently - the high rate of lock/unlock on the buffer rather badly hurts
> > > scalability.
> >
> > I haven't noticed that effect myself. In fact, it seemed to be the
> > other way around; it looked like it helped most with very low client
> > count workloads.
>
> It's possible that that effect is more visible on larger machines - I did test
> that on a 2x 24cores/48 threads machine. I do see a smaller effect on a
> 2x10c/20t machine.

Update: I find that when I build Postgres with -march=native, I see
performance characteristics that are much more in line with what you
saw when you ran your own experiments (experiments with minimizing the
number of heap buffer locks acquired during index scans).

With 1 client, there's now only about a 10% increase in throughput for
a pgbench variant that uses the type of range queries that you'd
expect to benefit the most from this work (that was more like 18%-20%
without -march=native). With 32 clients, it's an ~18% improvement in
throughput (where before it was only around 15%-16%).

Are you in the habit of using -march=native? I'm not. I assume that
most Postgres users aren't using packages that were built with the
flags that -march=native implies, which is why I largely go with
defaults for my release/benchmarking builds (the only exception is my
use of -fno-omit-frame-pointer).

In case it matters, my workstation uses a Ryzen 9 5950X CPU (which is Zen 3).

--
Peter Geoghegan