Thread

  1. Re: index prefetching

    Tomas Vondra <tomas@vondra.me> — 2025-12-17T19:20:07Z

    
    On 12/17/25 19:49, Peter Geoghegan wrote:
    > On Wed, Dec 17, 2025 at 12:19 PM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
    >> create table t (pk integer primary key, payload text default repeat('x',
    >> 1000)) with (fillfactor=10);
    >> insert into t values (generate_series(1,10000000));
    >>
    >> So it creates a table of size 80GB (160 after vacuum), which doesn't fit
    >> in RAM.
    > 
    > 160 after VACUUM? What do you mean?
    > 
    >> but what confuses me is that they do not depend on
    >> `effective_io_concurrency`.
    > 
    > You did change other settings, right? You didn't just use the default
    > shared_buffers, for example? (Sorry, I have to ask.)
    > 
    >> Moreover with `enable_indexscan_prefetch=off` results are the same.
    > 
    > It's quite unlikely that the current heuristics that trigger
    > prefetching would have ever allowed any prefetching, for queries such
    > as these.
    > 
    > The exact rule right now is that we don't even begin prefetching until
    > we've already read at least one index leaf page, and have to read
    > another one. So it's impossible to use prefetching with a LIMIT of 1,
    > with queries such as these. It's highly unlikely that you'd see any
    > benefits from prefetching even with LIMIT 100 (usually we wouldn't
    > even begin prefetching).
    > 
    
    True, although I suspect some queries may benefit from prefetching if
    they start close to the end of a leaf page (and so get to read the
    following leaf page too).
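
A toy model of the rule described above may help show when a LIMIT query could become eligible for prefetching at all. This is only a sketch, not code from the patch: the function name, the 0-based `start_offset`, and the `items_per_leaf` value of 300 are hypothetical. It also assumes every index entry the scan visits is returned, so `limit` entries are consumed in order.

```python
def crosses_leaf_boundary(start_offset: int, limit: int, items_per_leaf: int) -> bool:
    """Toy model: does a scan returning `limit` index entries, starting at
    0-based `start_offset` within its first leaf page, need a second leaf page?

    Under the rule quoted above, prefetching can only begin once the scan
    has read one leaf page and has to read another one.
    """
    if not 0 <= start_offset < items_per_leaf:
        raise ValueError("start_offset must fall inside the first leaf page")
    remaining_on_first_leaf = items_per_leaf - start_offset
    return limit > remaining_on_first_leaf


# Hypothetical leaf capacity, chosen only for illustration.
ITEMS_PER_LEAF = 300

# LIMIT 100 starting at the beginning of a leaf stays on that leaf.
print(crosses_leaf_boundary(0, 100, ITEMS_PER_LEAF))    # False
# The same LIMIT starting near the end of the leaf spills onto the next one.
print(crosses_leaf_boundary(250, 100, ITEMS_PER_LEAF))  # True
```

In this model a LIMIT of 1 never crosses onto a second leaf page, whatever the starting offset, which matches the "impossible with a LIMIT of 1" remark quoted above.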
    
    >> Also, I expected that the best effect of index prefetching should be for
    >> larger limits (accessing more heap pages). But as you see, it is not true.
    >>
    >> Maybe there is something wrong with my test scenario.
    > 
    > I could definitely believe that the new amgetbatch interface is
    > noticeably faster with range queries. Maybe 5% - 10% faster (even
    > without using the heap-buffer-locking optimization we've talked about
    > on this thread, which you can't have used here because I haven't
    > posted it to the list just yet). But a near 2x improvement wildly
    > exceeds my expectations. Honestly, I have no idea why the patch is so
    > much faster, and suspect an invalid result.
    > 
    
    FWIW I did try to reproduce this improvement, and I don't see anything
    like a 2x speedup. I see this:
    
      eic   master     prefetch
        1    28369        28699
       10     7062         8134
      100     2080         2162
    
    So on my machine there's a ~5-10% speedup, just like you predicted.
    There's noise, so I'd need to do more runs to get more stable results.
    But it's clearly far from 2x.
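
For reference, the per-row relative difference between the two columns can be computed as below. This is a small sketch using only the figures from the table; the message does not state the units, and reading a higher `prefetch` value as a speedup is an assumption based on the wording above.

```python
# Figures copied from the table above: eic -> (master, prefetch).
# Units are not stated in the message.
results = {
    1: (28369, 28699),
    10: (7062, 8134),
    100: (2080, 2162),
}


def relative_change(master: float, prefetch: float) -> float:
    """Percentage difference of `prefetch` relative to `master`."""
    return (prefetch - master) / master * 100.0


changes = {eic: round(relative_change(m, p), 1) for eic, (m, p) in results.items()}

for eic, pct in changes.items():
    print(f"eic={eic:>3}: {pct:+.1f}%")
# eic=  1: +1.2%
# eic= 10: +15.2%
# eic=100: +3.9%
```

The spread between rows is consistent with the noise mentioned above, and every row is far below a 2x (+100%) difference.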
    
    regards
    
    -- 
    Tomas Vondra