Re: Adding skip scan (including MDAM style range skip scan) to nbtree

Peter Geoghegan <pg@bowt.ie>

From: Peter Geoghegan <pg@bowt.ie>
To: Heikki Linnakangas <hlinnaka@iki.fi>
Cc: Masahiro Ikeda <ikedamsh@oss.nttdata.com>, Tomas Vondra <tomas@vondra.me>, Masahiro.Ikeda@nttdata.com, pgsql-hackers@lists.postgresql.org, Masao.Fujii@nttdata.com
Date: 2025-02-14T23:06:23Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. nbtree: Always set skipScan flag on rescan.

  2. meson: Build numeric.c with -ftree-vectorize.

  3. Fix "variable not found in subplan target lists" in semijoin de-duplication.

  4. Revert "nbtree: Remove useless row compare arg."

  5. nbtree: Remove useless row compare arg.

  6. Prevent premature nbtree array advancement.

  7. nbtree: tighten up array recheck rules.

  8. Avoid treating nonrequired nbtree keys as required.

  9. Adjust overstrong nbtree skip array assertion.

  10. Make NULL tuple values always advance skip arrays.

  11. Avoid extra index searches through preprocessing.

  12. Improve nbtree skip scan primitive scan scheduling.

  13. Further optimize nbtree search scan key comparisons.

  14. Add nbtree skip scan optimization.

  15. Improve nbtree array primitive scan scheduling.

  16. nbtree: Make BTMaxItemSize into object-like macro.

  17. Show index search count in EXPLAIN ANALYZE, take 2.

  18. Make parallel nbtree index scans use an LWLock.

  19. Show index search count in EXPLAIN ANALYZE.

  20. Avoid nbtree parallel scan currPos confusion.

  21. nbtree: Remove useless 'strat' local variable.

  22. Normalize nbtree truncated high key array behavior.

  23. Refactor handling of nbtree array redundancies.

  24. Fix nbtree pgstats accounting with parallel scans.

  25. Avoid parallel nbtree index scan hangs with SAOPs.

  26. Show Parallel Bitmap Heap Scan worker stats in EXPLAIN ANALYZE

  27. Enhance nbtree ScalarArrayOp execution.

  28. Skip checking of scan keys required for directional scan in B-tree

  29. Instead of using a numberOfRequiredKeys count to distinguish required

Attachments

On Wed, Feb 5, 2025 at 6:43 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The way that the "skipskip" optimization works is unchanged in v23.
> And even the way that we decide whether to apply that optimization
> didn't really change, either. What's new in v23 is that
> v23-0003-*patch adds rules around primitive scan scheduling.
> Obviously, I specifically targeted Heikki's regression when I came up
> with this, but the new _bt_advance_array_keys rules are nevertheless
> orthogonal: even scans that just use conventional SAOP arrays will
> also use these new _bt_advance_array_keys heuristics (though it'll
> tend to matter much less there).

Attached is v24, which breaks out these recent changes to primscan
scheduling into their own commit/patch (namely
v24-0002-Improve-nbtree-SAOP-primitive-scan-scheduling.patch). The
primscan scheduling improvement stuff hasn't really changed since
v23, though (though I did polish it some more). I hope to be able to
commit this new primscan scheduling patch sooner rather than later
(though I don't think that it's quite committable yet). It is
technically an independent improvement to the scheduling of primitive
index scans during SAOP nbtree index scans, so it makes sense to get
it out of the way.

The changes in v24 aren't just structural. The real changes in v24 are
to the optimization previously known as the "skipskip" optimization,
which now appears alone in
v24-0004-Lower-the-overhead-of-nbtree-runtime-skip-checks.patch (since
I broke out the other changes into their own patch). I guess it's
called the "_bt_skip_ikeyprefix" optimization now, since that's the
name of the function that now activates the optimization (the function
that is sometimes called from _bt_readpage during so->skipScan scans).
It seemed better to place the emphasis on starting calls to
_bt_check_compare with a later pstate.ikey (i.e. one greater than 0),
since that's where most of the benefit comes from.

We now treat all of the scan's arrays as nonrequired when the
_bt_skip_ikeyprefix optimization is activated -- even skip arrays can
have nonrequired calls to _bt_advance_array_keys from
_bt_check_compare (that won't just happen for the scan's SAOP arrays
in v24). This approach seems clearer to me. More importantly, it
performs much better than the previous approach in certain complicated
cases involving multiple range skip arrays against several different
index columns. (I can elaborate on what those queries look like and
why they're better with this new approach, if anybody wants me to.)


--
Peter Geoghegan