Thread

  1. Surprising SeqScan of appendRel that can't contribute any rows to the result

    Dmytro Astapov <dastapov@gmail.com> — 2024-12-17T22:15:44Z

    Hi!
    
    OS: Debian, Rock Linux
    Postgres versions: 13.6, 15.6, 17.0
    
    Setup:
    
    create table partA(id bigint not null, payload bigint);
    
    insert into partA select s, s from generate_series(1,10000) s;
    analyze partA;
    create index on partA(id);
    create index on partA(payload);
    
    create table partB(id bigint not null);
    insert into partB select s from generate_series(1,10000) s;
    analyze partB;
    create index on partB(id);
    
    create view vw as
      select id, payload from partA
      union all
      select id, NULL as payload from partB;
    
    As you can see, we have a view that UNION ALLs two tables with different
    number of columns. Missing column from partB is stubbed out with a constant
    NULL.
    
    Now we want to join this view with a small table that has some numbers that
    we want to find in the `payload` column:
    
    create table some_ids(id bigint not null);
    insert into some_ids select s from generate_series(1,2) s;
    analyze some_ids;
    
    explain select * from some_ids i join vw on (vw.payload = i.id);
    
    Surprisingly, this does SeqScan on partB in NestedLoops over some_ids with
    a filter `some_ids.id = NULL::bigint`:
    
    Nested Loop (cost=0.29..359.16 rows=200 width=24)
      -> Seq Scan on some_ids i (cost=0.00..1.02 rows=2 width=8)
      -> Append (cost=0.29..178.56 rows=51 width=16)
            -> Index Scan using parta_payload_idx on parta (cost=0.29..8.30
    rows=1 width=16)
                  Index Cond: (payload = i.id)
            -> Seq Scan on partb (cost=0.00..170.00 rows=50 width=16)
                  Filter: (i.id = NULL::bigint)
    
    At the same time `explain select * from vw where payload = 1` correctly
    skips over partB entirely (the node is eliminated from execution plan), and
    so does:
    explain select * from vw where payload in (1,2);
    
    However, any query that does not use explicit literal values still leads to
    SeqScan access on partB, such as:
    explain select * from vw where payload in (select id from some_ids);
    explain select * from vw where payload = ANY(ARRAY(select id from
    some_ids));
    or various forms of joins
    
    Do you know if this is expected/documented, or is this a bug?
    
    Same setup in db-fiddle if you want to give it a quick spin:
    https://www.db-fiddle.com/f/hNLCR9wou9TYzcLG57q9kj/3 or
    https://dbfiddle.uk/5o5LQlEB
    
    Best regards, Dmytro