Thread
-
Remove extra Sort node above a btree-compatible Index Scan
Alexandra Wang <alexandra.wang.oss@gmail.com> — 2025-02-28T05:22:21Z
Hello hackers, While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1], I noticed some unexpected plan diffs when running the xtree and treeb test modules. Specifically, an additional Sort node appears on top of an Index Only Scan, even when the index key and sort key are the same. For example: --- src/test/modules/xtree/expected/interval.out +++ src/test/modules/xtree/results/interval.out @@ -375,10 +375,12 @@ SET enable_seqscan TO false; EXPLAIN (COSTS OFF) SELECT f1 FROM INTERVAL_TBL_OF r1 ORDER BY f1; - QUERY PLAN --------------------------------------------------------------------- - Index Only Scan using interval_tbl_of_f1_idx on interval_tbl_of r1 -(1 row) + QUERY PLAN +-------------------------------------------------------------------------- + Sort + Sort Key: f1 + -> Index Only Scan using interval_tbl_of_f1_idx on interval_tbl_of r1 +(3 rows) I’ve attached a patch that removes this unnecessary Sort node for B-tree-compatible indexes. This change should: - Reduce the number of test failures in the xtree module from 43 to 30 - Reduce the size of regression.diffs from 234K to 123K Since pathkey comparison is a hot path in query planning and exercised by many test queries, I plan to gather more performance metrics. However, in a simple test running make installcheck with and without the patch, I observed no negative impact on the runtime of the regression test suite (which doesn’t include other btree-like indexes) and a positive impact on the same regression tests for xtree. Regression tests (same plans): w/o patch: make installcheck 1.36s user 2.21s system 12% cpu 28.018 total w/ patch: make installcheck 1.32s user 2.12s system 12% cpu 28.089 total xtree tests: w/o patch (inferior plan w/ extra sort node): make installcheck 1.52s user 2.42s system 10% cpu 36.817 total w/ patch (good plan): make installcheck 1.52s user 2.48s system 12% cpu 32.201 total I’ve marked the patch as no-cfbot, as it applies only on top of the aforementioned patch series [1]. Thoughts? [1] https://www.postgresql.org/message-id/a5dfb7cd-7a89-48ab-a913-e5304eee0854%40eisentraut.org Best, Alex
-
Re: Remove extra Sort node above a btree-compatible Index Scan
Peter Geoghegan <pg@bowt.ie> — 2025-02-28T05:30:54Z
On Fri, Feb 28, 2025 at 12:23 AM Alexandra Wang <alexandra.wang.oss@gmail.com> wrote: > While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1], > I noticed some unexpected plan diffs when running the xtree and treeb > test modules. Specifically, an additional Sort node appears on top of > an Index Only Scan, even when the index key and sort key are the same. > For example: > > --- src/test/modules/xtree/expected/interval.out > +++ src/test/modules/xtree/results/interval.out What's the xtree module? There's no such test module? -- Peter Geoghegan
-
Re: Remove extra Sort node above a btree-compatible Index Scan
Tom Lane <tgl@sss.pgh.pa.us> — 2025-02-28T05:43:17Z
Alexandra Wang <alexandra.wang.oss@gmail.com> writes: > I’ve attached a patch that removes this unnecessary Sort node for > B-tree-compatible indexes. This does not look right at all. You can't just ignore the opfamily etc. while deciding whether two pathkeys represent the same sort ordering, as you did in create_mergejoin_plan(). I don't like pathkeys_have_same_sortop() either. The pathkey data structures were designed to let pointer comparison be sufficient for deciding whether pathkeys are equivalent: see the "canonical pathkey" stuff in pathkeys.c. I think that this patch may be band-aiding over some breakage of that concept, but you've not provided enough context to figure out what the underlying problem is. regards, tom lane
-
Re: Remove extra Sort node above a btree-compatible Index Scan
Alexandra Wang <alexandra.wang.oss@gmail.com> — 2025-02-28T18:01:28Z
Hi Peter, On Thu, Feb 27, 2025 at 11:31 PM Peter Geoghegan <pg@bowt.ie> wrote: > On Fri, Feb 28, 2025 at 12:23 AM Alexandra Wang > <alexandra.wang.oss@gmail.com> wrote: > > While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1], > > I noticed some unexpected plan diffs when running the xtree and treeb > > test modules. Specifically, an additional Sort node appears on top of > > an Index Only Scan, even when the index key and sort key are the same. > > For example: > > > > --- src/test/modules/xtree/expected/interval.out > > +++ src/test/modules/xtree/results/interval.out > > What's the xtree module? There's no such test module? > Thank you so much for asking! I borrowed the following three patches from the Index AM API cleanup thread [1] and attached them here for reference: v20-0001: TEST - Add loadable modules as tests of the AM API v20-0003: TEST - Stop requiring BTREE_AM_OID outside btree v20-0012: Use CompareType more and StrategyNumber less The v20-0001 patch introduces the xtree loadable test module. Xtree as an extension is a shallow copy of the btree index, it should behave exactly like btree. Specifically, it uses the same operators and strategy numbers as btree, and all of its IndexAMRoutine methods point to the corresponding btree methods. The only difference is that xtree belongs to a different opfamily. v20-0001 adds two loadable modules: src/test/module/xtree (a shallow copy of btree) src/test/module/xhash (a shallow copy of hash, not relevant to this discussion) These test modules are not meant for commit; they are used solely for testing the Index AM API. By running "make check" in src/test/module/xtree, a script included in the patch copies all tests from src/test/regress and replaces all btree indexes with xtree indexes, allowing verification of how the Index AM API works with non-core indexes. The other two patches serve the following purposes: v20-0003 removes BTREE_AM_OID-specific assertions outside btree code, allowing other Index AMs. v20-0012 is nice to have since it also touches the PathKey struct and adds a CompareType field in PathKey. CompareType is more generic for sort ordering compared to AM-specific strategy numbers. I’d like to include this patch because even if two AMs have different strategy numbers, they could still map to the same CompareType. When it comes to deciding whether two PathKeys represent the same ordering, what really matters is the underlying sort operator, and CompareType is very relevant in that regard. So, with the above patches applied (v20-0012 is optional for this demonstration), I can run the following queries: -- setup create extension xtree; create table t (b int, x int); create index on t using btree(b); create index on t using xtree(x); insert into t select i, i from generate_series(1, 1000)i; analyze t; -- queries explain (costs off) select b from t where b < 10 order by (b); explain (costs off) select x from t where x < 10 order by (x); -- output: test=# explain (costs off) select b from t where b < 10 order by (b); QUERY PLAN ------------------------------------ Index Only Scan using t_b_idx on t Index Cond: (b < 10) (2 rows) test=# explain (costs off) select x from t where x < 10 order by (x); QUERY PLAN ------------------------------------------ Sort Sort Key: x -> Index Only Scan using t_x_idx on t Index Cond: (x < 10) (4 rows) Notice that the xtree index scan introduces an unnecessary Sort node, even though the Index Only Scan already returns rows of x sorted by the same operator (int4lt (”<”)) used in the ORDER BY (x) clause. The problem I'm trying to solve is to eliminate this redundant sort for btree-like index AMs. [1] https://www.postgresql.org/message-id/a5dfb7cd-7a89-48ab-a913-e5304eee0854%40eisentraut.org Best, Alex