Thread

  1. Re: GROUP BY ROLLUP queries on views trigger full table scans (index usage not optimized)

    Richard Guo <guofenglinux@gmail.com> — 2025-11-18T08:27:40Z

    On Mon, Nov 17, 2025 at 5:02 PM Haowu Ge <gehaowu@bitmoe.com> wrote:
    > # 8. Query the view table to compute the average unit price for material 'PI',using a simple GROUP BY (rollup)
    > EXPLAIN ANALYZE
    > SELECT material_id, AVG(unit_price) as avg_price
    > FROM materials_view
    > WHERE material_id = 'PI'
    > GROUP BY ROLLUP(material_id);
    
    > GroupAggregate  (cost=0.00..1097.39 rows=251 width=64) (actual time=3.983..3.985 rows=2.00 loops=1)
    >   Group Key: materials.material_id
    >   Group Key: ()
    >   Buffers: shared hit=468
    >   ->  Seq Scan on materials  (cost=0.00..1093.00 rows=250 width=48) (actual time=0.018..3.970 rows=1.00 loops=1)
    >         Filter: ((material_id)::text = 'PI'::text)
    >         Rows Removed by Filter: 49999
    >         Buffers: shared hit=468
    > Planning Time: 0.117 ms
    > Execution Time: 4.029 ms
    
    Thanks for the report.  The reason why this query cannot use an index
    scan is that, when the parent query uses grouping sets, all subquery
    outputs are wrapped in PlaceHolderVars.  This is to ensure that these
    expressions retain their separate identity so that they can match to
    the expected grouping set columns (cf. cc5d98525).  However, it also
    prevents the WHERE clause from matching the index column, which is why
    the index cannot be used.
    
    Prior to v18, we wrapped only the subquery outputs that were non-Var
    expressions.  That is why this query was able to use an index scan.
    However, that behavior could produce incorrect results when the
    subquery's targetlist contained two or more identical Var expressions.
    In addition, if the targetlist included any non-Var expressions, we
    could also fail to match the WHERE clause to the corresponding index
    expressions.
    
    After looking at this query more closely, it seems that there is no
    need to wrap the expressions in the WHERE clause in the first place.
    It should be sufficient to wrap only the expressions that are used
    logically above the aggregation step, ie, those in the targetlist and
    in the havingQual.  However, pullup_replace_vars currently lacks the
    infrastructure to do that.
    
    Another possible fix is to detect whether the subquery's targetlist
    contains two or more identical Var expressions and, if not, fall back
    to wrapping only the non-Var expressions.  This would restore the
    behavior that existed prior to v18 for this query.
    
    Alternatively, we could teach match_index_to_operand to unwrap PHV
    expressions when matching them to indexes.  This would also allow the
    WHERE clause to match indexes for non-Var expressions, which is an
    improvement over the behavior prior to v18.  But I haven't fully
    thought through whether this approach is safe in all cases,
    particularly when outer joins are involved.
    
    Any thoughts?
    
    - Richard