Re: Adding skip scan (including MDAM style range skip scan) to nbtree

From: Peter Geoghegan <pg@bowt.ie>
To: Aleksander Alekseev <aleksander@timescale.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2024-07-02T13:30:28Z
Lists: pgsql-hackers

Commits

  1. nbtree: Always set skipScan flag on rescan.

  2. meson: Build numeric.c with -ftree-vectorize.

  3. Fix "variable not found in subplan target lists" in semijoin de-duplication.

  4. Revert "nbtree: Remove useless row compare arg."

  5. nbtree: Remove useless row compare arg.

  6. Prevent premature nbtree array advancement.

  7. nbtree: tighten up array recheck rules.

  8. Avoid treating nonrequired nbtree keys as required.

  9. Adjust overstrong nbtree skip array assertion.

  10. Make NULL tuple values always advance skip arrays.

  11. Avoid extra index searches through preprocessing.

  12. Improve nbtree skip scan primitive scan scheduling.

  13. Further optimize nbtree search scan key comparisons.

  14. Add nbtree skip scan optimization.

  15. Improve nbtree array primitive scan scheduling.

  16. nbtree: Make BTMaxItemSize into object-like macro.

  17. Show index search count in EXPLAIN ANALYZE, take 2.

  18. Make parallel nbtree index scans use an LWLock.

  19. Show index search count in EXPLAIN ANALYZE.

  20. Avoid nbtree parallel scan currPos confusion.

  21. nbtree: Remove useless 'strat' local variable.

  22. Normalize nbtree truncated high key array behavior.

  23. Refactor handling of nbtree array redundancies.

  24. Fix nbtree pgstats accounting with parallel scans.

  25. Avoid parallel nbtree index scan hangs with SAOPs.

  26. Show Parallel Bitmap Heap Scan worker stats in EXPLAIN ANALYZE

  27. Enhance nbtree ScalarArrayOp execution.

  28. Skip checking of scan keys required for directional scan in B-tree

  29. Instead of using a numberOfRequiredKeys count to distinguish required

On Tue, Jul 2, 2024 at 8:53 AM Aleksander Alekseev
<aleksander@timescale.com> wrote:
> CREATE TABLE test1(c char, n bigint);
> CREATE INDEX test1_idx ON test1 USING btree(c,n);

The type "char" (note the quotes) is different from char(1). It just
so happens that v1 supports skipping attributes that use the default
opclass for "char", but has no such support for char(1).

If you change your table definition to CREATE TABLE test1(c "char", n
bigint), then your example queries can use the optimization. This
makes a huge difference.

> EXPLAIN [ANALYZE] SELECT COUNT(*) FROM test1 WHERE n > 900_000_000;

For example, this first test query goes from needing a full index scan
that has 5056 buffer hits to a skip scan that requires only 12 buffer
hits.
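
A minimal sketch of the revised setup, for anyone who wants to repeat
the comparison. It only restates the DDL and the first test query
quoted above with the leading column switched to "char". The BUFFERS
option is an addition here so that buffer hits show up in the output,
and the data load is omitted because it is not quoted in this message.

```sql
-- Same table as in the quoted example, but the leading column uses the
-- internal one-byte "char" type (quoted), not char(1)
CREATE TABLE test1(c "char", n bigint);
CREATE INDEX test1_idx ON test1 USING btree(c, n);

-- (load the same test data as before; not shown in this message)

-- First test query; BUFFERS is added here to report buffer hits
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM test1 WHERE n > 900_000_000;
```

Comparing the "Buffers: shared hit=..." line for the two table
definitions should show whether the skip scan is being used.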

> I noticed that with the patch we choose Index Only Scans for Test 1
> and without the patch - Parallel Seq Scan. However the Parallel Seq
> Scan is 2.4 times faster. Before the patch the query takes 53 ms,
> after the patch - 127 ms.

I'm guessing that it's actually much faster once you change the
leading column to the "char" type/default opclass.

> I realize this could be just something
> specific to my hardware and/or amount of data.

The selfuncs.c costing currently has a number of problems.

One problem is that it doesn't know that some opclasses/types don't
support skipping at all. That particular problem should be fixed on
the nbtree side; nbtree should support skipping regardless of the
opclass that the skipped attribute uses (while still retaining the new
opclass support functions for a subset of types where we expect it to
make skip scans somewhat faster).

-- 
Peter Geoghegan