Thread

  1. Remove extra Sort node above a btree-compatible Index Scan

    Alexandra Wang <alexandra.wang.oss@gmail.com> — 2025-02-28T05:22:21Z

    Hello hackers,
    
    While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1],
    I noticed some unexpected plan diffs when running the xtree and treeb
    test modules. Specifically, an additional Sort node appears on top of
    an Index Only Scan, even when the index key and sort key are the same.
    For example:
    
    --- src/test/modules/xtree/expected/interval.out
    +++ src/test/modules/xtree/results/interval.out
    @@ -375,10 +375,12 @@
     SET enable_seqscan TO false;
     EXPLAIN (COSTS OFF)
     SELECT f1 FROM INTERVAL_TBL_OF r1 ORDER BY f1;
    -                             QUERY PLAN
    ---------------------------------------------------------------------
    - Index Only Scan using interval_tbl_of_f1_idx on interval_tbl_of r1
    -(1 row)
    +                                QUERY PLAN
    +--------------------------------------------------------------------------
    + Sort
    +   Sort Key: f1
    +   ->  Index Only Scan using interval_tbl_of_f1_idx on interval_tbl_of r1
    +(3 rows)
    
    
    I’ve attached a patch that removes this unnecessary Sort node for
    B-tree-compatible indexes. This change should:
    - Reduce the number of test failures in the xtree module from 43 to 30
    - Reduce the size of regression.diffs from 234K to 123K
    
    Since pathkey comparison is a hot path in query planning and exercised
    by many test queries, I plan to gather more performance metrics.
    However, in a simple test running make installcheck with and without
    the patch, I observed no negative impact on the runtime of the
    regression test suite (which doesn’t include other btree-like indexes)
    and a positive impact on the same regression tests for xtree.
    
    Regression tests (same plans):
    w/o patch:
    make installcheck  1.36s user 2.21s system 12% cpu 28.018 total
    w/ patch:
    make installcheck  1.32s user 2.12s system 12% cpu 28.089 total
    
    xtree tests:
    w/o patch (inferior plan w/ extra sort node):
    make installcheck  1.52s user 2.42s system 10% cpu 36.817 total
    w/ patch (good plan):
    make installcheck  1.52s user 2.48s system 12% cpu 32.201 total
    
    I’ve marked the patch as no-cfbot, as it applies only on top of the
    aforementioned patch series [1].
    
    Thoughts?
    
    [1]
    https://www.postgresql.org/message-id/a5dfb7cd-7a89-48ab-a913-e5304eee0854%40eisentraut.org
    
    Best,
    Alex
    
  2. Re: Remove extra Sort node above a btree-compatible Index Scan

    Peter Geoghegan <pg@bowt.ie> — 2025-02-28T05:30:54Z

    On Fri, Feb 28, 2025 at 12:23 AM Alexandra Wang
    <alexandra.wang.oss@gmail.com> wrote:
    > While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1],
    > I noticed some unexpected plan diffs when running the xtree and treeb
    > test modules. Specifically, an additional Sort node appears on top of
    > an Index Only Scan, even when the index key and sort key are the same.
    > For example:
    >
    > --- src/test/modules/xtree/expected/interval.out
    > +++ src/test/modules/xtree/results/interval.out
    
    What's the xtree module? There's no such test module?
    
    -- 
    Peter Geoghegan
    
    
    
    
  3. Re: Remove extra Sort node above a btree-compatible Index Scan

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-02-28T05:43:17Z

    Alexandra Wang <alexandra.wang.oss@gmail.com> writes:
    > I’ve attached a patch that removes this unnecessary Sort node for
    > B-tree-compatible indexes.
    
    This does not look right at all.  You can't just ignore the opfamily
    etc. while deciding whether two pathkeys represent the same sort
    ordering, as you did in create_mergejoin_plan().  I don't like
    pathkeys_have_same_sortop() either.  The pathkey data structures
    were designed to let pointer comparison be sufficient for deciding
    whether pathkeys are equivalent: see the "canonical pathkey" stuff
    in pathkeys.c.  I think that this patch may be band-aiding over some
    breakage of that concept, but you've not provided enough context to
    figure out what the underlying problem is.
    
    			regards, tom lane
    
    
    
    
  4. Re: Remove extra Sort node above a btree-compatible Index Scan

    Alexandra Wang <alexandra.wang.oss@gmail.com> — 2025-02-28T18:01:28Z

    Hi Peter,
    
    On Thu, Feb 27, 2025 at 11:31 PM Peter Geoghegan <pg@bowt.ie> wrote:
    
    > On Fri, Feb 28, 2025 at 12:23 AM Alexandra Wang
    > <alexandra.wang.oss@gmail.com> wrote:
    > > While reviewing Mark Dilger’s patch series "Index AM API cleanup" [1],
    > > I noticed some unexpected plan diffs when running the xtree and treeb
    > > test modules. Specifically, an additional Sort node appears on top of
    > > an Index Only Scan, even when the index key and sort key are the same.
    > > For example:
    > >
    > > --- src/test/modules/xtree/expected/interval.out
    > > +++ src/test/modules/xtree/results/interval.out
    >
    > What's the xtree module? There's no such test module?
    >
    
    Thank you so much for asking!
    
    I borrowed the following three patches from the Index AM API cleanup
    thread [1] and attached them here for reference:
    
    v20-0001: TEST - Add loadable modules as tests of the AM API
    v20-0003: TEST - Stop requiring BTREE_AM_OID outside btree
    v20-0012: Use CompareType more and StrategyNumber less
    
    The v20-0001 patch introduces the xtree loadable test module. Xtree as
    an extension is a shallow copy of the btree index, it should behave
    exactly like btree. Specifically, it uses the same operators and
    strategy numbers as btree, and all of its IndexAMRoutine methods point
    to the corresponding btree methods. The only difference is that xtree
    belongs to a different opfamily.
    
    v20-0001 adds two loadable modules:
    src/test/module/xtree (a shallow copy of btree)
    src/test/module/xhash (a shallow copy of hash, not relevant to this
    discussion)
    
    These test modules are not meant for commit; they are used solely for
    testing the Index AM API.
    
    By running "make check" in src/test/module/xtree, a script included in
    the patch copies all tests from src/test/regress and replaces all
    btree indexes with xtree indexes, allowing verification of how the
    Index AM API works with non-core indexes.
    
    The other two patches serve the following purposes:
    
    v20-0003 removes BTREE_AM_OID-specific assertions outside btree code,
    allowing other Index AMs.
    
    v20-0012 is nice to have since it also touches the PathKey struct and
    adds a CompareType field in PathKey. CompareType is more generic for
    sort ordering compared to AM-specific strategy numbers. I’d like to
    include this patch because even if two AMs have different strategy
    numbers, they could still map to the same CompareType. When it comes
    to deciding whether two PathKeys represent the same ordering, what
    really matters is the underlying sort operator, and CompareType is
    very relevant in that regard.
    
    So, with the above patches applied (v20-0012 is optional for this
    demonstration), I can run the following queries:
    
    -- setup
    create extension xtree;
    create table t (b int, x int);
    create index on t using btree(b);
    create index on t using xtree(x);
    insert into t select i, i from generate_series(1, 1000)i;
    analyze t;
    
    -- queries
    explain (costs off) select b from t where b < 10 order by (b);
    explain (costs off) select x from t where x < 10 order by (x);
    
    -- output:
    test=# explain (costs off) select b from t where b < 10 order by (b);
                 QUERY PLAN
    ------------------------------------
     Index Only Scan using t_b_idx on t
       Index Cond: (b < 10)
    (2 rows)
    
    test=# explain (costs off) select x from t where x < 10 order by (x);
                    QUERY PLAN
    ------------------------------------------
     Sort
       Sort Key: x
       ->  Index Only Scan using t_x_idx on t
             Index Cond: (x < 10)
    (4 rows)
    
    Notice that the xtree index scan introduces an unnecessary Sort node,
    even though the Index Only Scan already returns rows of x sorted by
    the same operator (int4lt (”<”)) used in the ORDER BY (x) clause.
    
    The problem I'm trying to solve is to eliminate this redundant sort
    for btree-like index AMs.
    
    [1]
    https://www.postgresql.org/message-id/a5dfb7cd-7a89-48ab-a913-e5304eee0854%40eisentraut.org
    
    Best,
    Alex