Thread

  1. Re: Surprising SeqScan of appendRel that can't contribute any rows to the result

    Dmytro Astapov <dastapov@gmail.com> — 2024-12-18T00:36:32Z

    Thank you for the very detailed answer, much appreciated!
    
    For the benefit of people who might find this in the future via search:  so
    far the best workaround seems to be something along the lines of:
    
    alter table partB add column always_null bigint;
    create index on partB(always_null);
    
    And then I change vw so that instead of constant NULL I expose this column
    instead. Thenat instead of seqscan I get index scan on always_null which
    (relatively) quickly yields zero rows.
    
    
    
    On Wed, 18 Dec 2024, 00:08 David Rowley, <dgrowleyml@gmail.com> wrote:
    
    > On Wed, 18 Dec 2024 at 12:17, Dmytro Astapov <dastapov@gmail.com> wrote:
    > > Surprisingly, this does SeqScan on partB in NestedLoops over some_ids
    > with a filter `some_ids.id = NULL::bigint`:
    > >
    > > Nested Loop (cost=0.29..359.16 rows=200 width=24)
    > >   -> Seq Scan on some_ids i (cost=0.00..1.02 rows=2 width=8)
    > >   -> Append (cost=0.29..178.56 rows=51 width=16)
    > >         -> Index Scan using parta_payload_idx on parta (cost=0.29..8.30
    > rows=1 width=16)
    > >               Index Cond: (payload = i.id)
    > >         -> Seq Scan on partb (cost=0.00..170.00 rows=50 width=16)
    > >               Filter: (i.id = NULL::bigint)
    > >
    > > At the same time `explain select * from vw where payload = 1` correctly
    > skips over partB entirely (the node is eliminated from execution plan), and
    > so does:
    > > explain select * from vw where payload in (1,2);
    > >
    > > However, any query that does not use explicit literal values still leads
    > to SeqScan access on partB, such as:
    > > explain select * from vw where payload in (select id from some_ids);
    > > explain select * from vw where payload = ANY(ARRAY(select id from
    > some_ids));
    > > or various forms of joins
    > >
    > > Do you know if this is expected/documented, or is this a bug?
    >
    > TL;DR is it's not a bug and expected behaviour.
    >
    > We tend not to do much in terms of documentation about which
    > optimisations the query planner does, so it's probably not documented
    > anywhere aside from perhaps the source code.  It might be possible for
    > us to eliminate the scan to "partb" for the first of the plans shown
    > above. However, the code that applies in your example case where the
    > planner does manage to eliminate the scan does so using "base" quals,
    > i.e. quals that are pushed down into the scan level.  See
    > apply_child_basequals().  For the Nested Loop example, the i.id =
    > NULL::bigint isn't a base qual, so it does not work for that case.
    > When we're building paramerised paths, as per what's used in your
    > Nested Loop example above, we've already done the work to eliminate
    > non-matching union children. We don't really have any concept of "this
    > union child does not match for this specific parameterisation", so
    > we'd need to invent something to do that (which perhaps is just
    > removing or not adding the particular unneeded subpath from the Append
    > pathlist.)
    >
    > For the other cases that depend on the results from subqueries, it's
    > more tricky and in many cases not possible to eliminate the scans
    > during query planner for those cases as the planner does not have
    > information to know what will be returned by the subqueries. There
    > might be something very limited we can do in terms of looking to see
    > if the operator is strict or not so that we at least know that NULLs
    > will never match, but that might be quite a corner case that it might
    > not be worth the complexity to make that work.  Someone might need to
    > write it and see how complex it is to implement before we'd know if it
    > was a worthwhile optimisation or not.
    >
    > David
    >