Thread

  1. Re: Reduce "Var IS [NOT] NULL" quals during constant folding

    Chengpeng Yan <chengpeng_yan@outlook.com> — 2025-05-03T10:48:44Z

    
    > On May 1, 2025, at 16:33, Richard Guo <guofenglinux@gmail.com> wrote:
    > 
    > Here is the patchset that implements this optimization.  0001 moves
    > the expansion of virtual generated columns to occur before sublink
    > pull-up.  0002 introduces a new function, preprocess_relation_rtes,
    > which scans the rangetable for relation RTEs and performs inh flag
    > updates and virtual generated column expansion in a single loop, so
    > that only one table_open/table_close call is required for each
    > relation.  0003 collects NOT NULL attribute information for each
    > relation within the same loop, stores it in a relation OID based hash
    > table, and uses this information to reduce NullTest quals during
    > constant folding.
    > 
    > I think the code now more closely resembles the phase 1 and phase 2
    > described earlier: it collects all required early-stage catalog
    > information within a single loop over the rangetable, allowing each
    > relation to be opened and closed only once.  It also avoids the
    > has_subclass() call along the way.
    > 
    > Thanks
    > Richard
    > <v4-0001-Expand-virtual-generated-columns-before-sublink-p.patch><v4-0002-Centralize-collection-of-catalog-info-needed-earl.patch><v4-0003-Reduce-Var-IS-NOT-NULL-quals-during-constant-fold.patch>
    
    Hi,
    
    I've been following the V4 patches (focusing on 1 and 2 for now): Patch 2's preprocess_relation_rtes is a nice improvement for efficiently gathering early catalog info like inh and attgenerated definitions in one pass.
    
    However, Patch 1 needs to add expansion calls inside specific pull-up functions (like convert_EXISTS_sublink_to_join) because the main expansion work was moved before pull_up_sublinks.
    
    Could we perhaps simplify this? What if preprocess_relation_rtes only collected the attgenerated definitions (storing them, maybe in a hashtable like planned for attnotnull in Patch 3), but didn't perform the actual expansion (Var replacement)?
    
    Then, we could perform the actual expansion (Var replacement) in a separate, single, global step later on. Perhaps after pull_up_sublinks (closer to the original timing), or maybe even later still, for instance after flatten_simple_union_all, once the main query structure including pulled-up subqueries/links has stabilized? A unified expansion after the major structural changes seems cleaner. I'm not sure where is the better position now.
    
    This might avoid the need for the extra expansion calls within convert_EXISTS_sublink_to_join, etc., keeping the information gathering separate from the expression transformation and potentially making the overall flow a bit cleaner.
    
    Any thoughts?
    
    Thanks,
    
    Chengpeng Yan