Thread

  1. Index Searches higher than expected for skip scan

    Michael Christofides <michael@pgmustard.com> — 2025-11-06T19:00:46Z

    Hi folks,
    
    I'm trying to understand the new Index Searches field in Postgres 18
    explain analyze output. I've put together a super simple test case (below)
    expecting a skip scan with 2 Index Searches, one for each value in the
    leading (boolean) column of the index.
    
    In reality, instead of 2 Index Searches, I got 4 (query plan below). I also
    experimented with a leading column with 5 distinct values (expecting 5
    searches), and got 7. I've not included that below, for brevity.
    
    I suspect I'm missing something obvious in either my understanding or my
    setup, but wondered why this might be happening?
    
    All the best,
    Michael
    
    
    CREATE TABLE example (
       integer_field bigint NOT NULL,
       boolean_field bool NOT NULL);
    
    INSERT INTO example (integer_field, boolean_field)
       SELECT random () * 10_000,
              random () < 0.5
       FROM   generate_series(1, 100_000);
    
    CREATE INDEX bool_int_idx
       ON example (boolean_field, integer_field);
    
    VACUUM ANALYZE example;
    
    EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
    SELECT boolean_field FROM example WHERE integer_field = 5432;
    
                                                                  QUERY PLAN
    
    ---------------------------------------------------------------------------------------------------------------------------------------
     Index Only Scan using bool_int_idx on public.example  (cost=0.29..13.04
    rows=10 width=1) (actual time=0.230..0.274 rows=5.00 loops=1)
       Output: boolean_field
       Index Cond: (example.integer_field = 5432)
       Heap Fetches: 0
       Index Searches: 4
       Buffers: shared hit=9
     Planning Time: 0.240 ms
     Execution Time: 0.323 ms
    
  2. Re: Index Searches higher than expected for skip scan

    Peter Geoghegan <pg@bowt.ie> — 2025-11-06T19:54:40Z

    On Thu, Nov 6, 2025 at 2:01 PM Michael Christofides
    <michael@pgmustard.com> wrote:
    > I'm trying to understand the new Index Searches field in Postgres 18 explain analyze output. I've put together a super simple test case (below) expecting a skip scan with 2 Index Searches, one for each value in the leading (boolean) column of the index.
    
    That's a good question. You're right that in this specific case, with
    a boolean column, skip scan performs 4 index searches, when in
    principle it only really needs to do 3.
    
    > In reality, instead of 2 Index Searches, I got 4 (query plan below).
    
    During work on skip scan (and on the 17 work), I went to quite a lot
    of effort to build custom instrumentation that would show me exactly
    what an index scan was doing. Including details of the scan's array
    keys, and how they advance as the scan advances through the index.
    
    Attached is its output when I run your test query. The issue here is
    that skip scan thinks that there are 4 distinct skip array values that
    it must use:
    
    1. SK_BT_MINVAL
    2. false
    3. true
    4. SK_ISNULL
    
    SK_BT_MINVAL is a sentinel value that represents the lowest possible
    value that can be stored by the data type. That's the same thing as
    "false", which is a little bit unfortunate with a datatype like bool,
    where the difference might actually matter -- here we waste an access
    to the leftmost leaf page to "advance" the skip array from
    SK_BT_MINVAL to false, instead of making false the lowest skip array
    value directly, from the very start.
    
    This is highly unlikely to matter with a data type like integer,
    though: in practice it's very unlikely that the value INT_MIN is
    actually stored in the index, so having 2 distinct representations for
    the same thing (SK_BT_MINVAL and INT_MIN) is equally unlikely to
    result in the scan reading any more leaf pages than strictly necessary
    due to this implementation deficiency. We'll have to go to the
    leftmost leaf page to determine what the real lowest value in the
    index is either way -- doesn't matter if you start from SK_BT_MINVAL
    or from INT_MIN (unless there really are hundreds of tuples that have
    the value INT_MIN, when the difference between those 2 things starts
    to matter, as with your bool test case).
    
    I did actually think about making this work. In fact, there were
    revisions of the skip scan patch where it actually did work.
    Ultimately I judged that it wasn't worth the trouble of introducing
    this special case. Better to have types with skip support (like
    boolean and integer) behave exactly the same as all other types, by
    also using these SK_BT_MINVAL/SK_BT_MAXVAL sentinels (we don't use
    SK_BT_MAXVAL at all here because the highest skip array value is
    SK_ISNULL, but we would if the index was declared NULLS FIRST).
    
    That just leaves SK_ISNULL. We cannot assume that the index doesn't
    have any NULLs (unless the query uses IS NOT NULL directly). We might
    end up having to read the rightmost leaf page anyway, in which case
    there won't be an extra search for SK_ISNULL, but we'll likely need an
    extra search for this too (just like the leftmost leaf page, with
    SK_BT_MINVAL). This isn't an implementation deficiency -- it's
    strictly necessary for correctness.
    
    -- 
    Peter Geoghegan
    
  3. Re: Index Searches higher than expected for skip scan

    Peter Geoghegan <pg@bowt.ie> — 2025-11-06T20:39:24Z

    On Thu, Nov 6, 2025 at 2:54 PM Peter Geoghegan <pg@bowt.ie> wrote:
    > That just leaves SK_ISNULL. We cannot assume that the index doesn't
    > have any NULLs (unless the query uses IS NOT NULL directly).
    
    Actually, that won't work here. Because the optimizer recognizes that
    the leading boolean column isn't nullable, and "helpfully"
    optimizes-away the IS NOT NULL qualification. But if the column *was*
    nullable, adding IS NOT NULL would cut the number of index searches by
    1.
    
    > We might end up having to read the rightmost leaf page anyway, in which case
    > there won't be an extra search for SK_ISNULL, but we'll likely need an
    > extra search for this too (just like the leftmost leaf page, with
    > SK_BT_MINVAL).
    
    I said the wrong thing here. In fact, it's very likely that we will
    have to read the rightmost leaf page with a query that has no qual on
    the leading column, no matter the datatype. Here's why:
    
    Suppose we only store tuples with the values 1 - 100 in the leading
    column "a", for a query "WHERE b = 5432", with an index on (a, b).
    Once skip scan reaches the point where it returns "(a, b) = (100,
    5432)" to the scan (or once it sees that there's no such match in the
    index), it'll guess that the next "a" value is 101. We'll then try and
    fail to find a match "(a, b) = (101, 5432)".
    
    If the column "a" has no NULLs, then the scan will already be at the
    rightmost position of its rightmost leaf page -- so we're done then
    and there. Notice that there hasn't been an extra search for SK_ISNULL
    here. Because we visited the leaf page where NULLs would have been,
    had there been any, and found that there was none at all (which is the
    common case, contrary to what I said earlier).
    
    If, on the other hand, the column "a" has many NULLs, then this failed
    attempt to find "(a, b) = (101, 5432)" will advance the array on "a"
    from 101 to SK_ISNULL, and then perform another search, this time for
    "(a, b) = (NULL, 5432)". Again, notice that this also isn't wasteful
    -- we simply have no better way to reliably determine that there is no
    non-NULL value after 100 in this index.
    
    It's perhaps worth noting that your original boolean example shows
    that skip scan *is* directly aware that the next value after false is
    SK_ISNULL -- even though (as I said in my first email from earlier) it
    cannot do the similar trick of knowing that the lowest value really
    should be false (this is knowable before the first scan/search even
    begins). There are very obscure performance reasons for this
    inconsistency, though it might seem a bit arbitrary.
    
    -- 
    Peter Geoghegan
    
    
    
    
  4. Re: Index Searches higher than expected for skip scan

    Michael Christofides <michael@pgmustard.com> — 2025-11-07T11:16:34Z

    Thank you for the incredibly helpful (and fast) replies Peter.
    
    
    > Attached is its output when I run your test query. The issue here is
    
    that skip scan thinks that there are 4 distinct skip array values that
    
    it must use:
    
    
    
    1. SK_BT_MINVAL
    
    2. false
    
    3. true
    
    4. SK_ISNULL
    
    
    This output in particular really helped it make sense to me.
    
    
    > But if the column *was* nullable, adding IS NOT NULL would cut the
    
    number of index searches by 1.
    >
    
    Nice idea. Once it sunk in, I realised I could try the explicit "AND
    boolean_field IN (true, false)" and got it down to 2 index searches:
    
    EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
    SELECT boolean_field FROM example WHERE integer_field = 5432 AND
    boolean_field IN (true, false);
    
                                                                  QUERY PLAN
    
    ---------------------------------------------------------------------------------------------------------------------------------------
     Index Only Scan using bool_int_idx on public.example  (cost=0.29..8.79
    rows=10 width=1) (actual time=0.060..0.077 rows=12.00 loops=1)
       Output: boolean_field
       Index Cond: ((example.boolean_field = ANY ('{t,f}'::boolean[])) AND
    (example.integer_field = 5432))
       Heap Fetches: 0
       Index Searches: 2
       Buffers: shared hit=5
     Planning Time: 0.265 ms
     Execution Time: 0.115 ms
    
    Thanks again,
    Michael
    
  5. Re: Index Searches higher than expected for skip scan

    Peter Geoghegan <pg@bowt.ie> — 2025-11-07T15:00:49Z

    On Fri, Nov 7, 2025 at 6:16 AM Michael Christofides
    <michael@pgmustard.com> wrote:
    > Thank you for the incredibly helpful (and fast) replies Peter.
    
    You're welcome.
    
    > Nice idea. Once it sunk in, I realised I could try the explicit "AND boolean_field IN (true, false)" and got it down to 2 index searches:
    >
    > EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
    > SELECT boolean_field FROM example WHERE integer_field = 5432 AND boolean_field IN (true, false);
    
    That's using the Postgres 17 work. You could also write the query as
    "SELECT boolean_field FROM example WHERE integer_field = 5432 AND
    boolean_field BETWEEN false AND true" and get 2 index searches. That
    variant uses what I've called "range skip scan", which is new in
    Postgres 18.
    
    -- 
    Peter Geoghegan