Thread

  1. Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)

    Hitoshi Harada <umi.tanuki@gmail.com> — 2011-07-02T08:02:07Z

    2011/7/2 Hitoshi Harada <umi.tanuki@gmail.com>:
    > 2011/6/29 Yeb Havinga <yebhavinga@gmail.com>:
    >>
    >> On 2011-06-17 09:54, Hitoshi Harada wrote:
    >>>
    >>> While reviewing the gist/box patch, I found some planner APIs that can
    >>> replace parts in my patch. Also, comments in includes wasn't updated
    >>> appropriately. Revised patch attached.
    >>
    >> Hello Hitoshi-san,
    >>
    >> I read your latest patch implementing parameterizing subquery scans.
    >
    > Attached is revised version.
    
    I failed to attached the patch. I'm trying again.
    
    >> 1)
    >> In the email from june 9 with the patch You wrote: "While IndexScan
    >> is simple since its information like costs are well known by the base
    >> relation, SubqueryScan should re-plan its Query to gain that, which is
    >> expensive."
    >>
    >> Initial concerns I had were caused by misinterpreting 'replanning' as: for
    >> each outer tuple, replan the subquery (it sounds a bit like 'ReScan').
    >> Though the general comments in the patch are helpful, I think it would be an
    >> improvement to describe why subqueries are planned more than once, i.e.
    >> something like
    >> "best_inner_subqueryscan
    >>   Try to find a better subqueryscan path and its associated plan for each
    >> join clause that can be pushed down, in addition to the path that is already
    >> calculated by set_subquery_pathlist."
    >
    > I changed comments around set_subquery_pathlist and best_inner_subqueryscan.
    >
    >> 2)
    >> Since 'subquery_is_pushdown_safe' is invariant under join clause push down,
    >> it might be possible to have it called only once in set_subquery_pathlist,
    >> i.e. by only attaching rel->preprocessed_subquery if the
    >> subquery_is_pushdown_safe.
    >
    > I modified as you suggested.
    >
    >> 3)
    >> /*
    >>  * set_subquery_pathlist
    >>  *        Build the (single) access path for a subquery RTE
    >>  */
    >> This unchanged comment is still correct, but 'the (single) access path'
    >> might have become a bit misleading now there are multiple paths possible,
    >> though not by set_subquery_pathlist.
    >
    > As noted #1.
    >
    >> 4) somewhere in the patch
    >> s/regsitered/registered/
    >
    > Fixed.
    >
    >> 5) Regression tests are missing; I think at this point they'd aid in
    >> speeding up development/test cycles.
    >
    > I'm still thinking about it. I can add complex test but the concept of
    > regression test focuses on small pieces of simple cases. I don't want
    > take pg_regress much more than before.
    >
    >> 6) Before patching Postgres, I could execute the test with the size_l and
    >> size_m tables, after patching against current git HEAD (patch without
    >> errors), I get the following error when running the example query:
    >> ERROR:  plan should not reference subplan's variable
    >>
    >> I get the same error with the version from june 9 with current git HEAD.
    >
    > Fixed. Some variable initializing was wrong.
    >
    >> 7) Since both set_subquery_pathlist and best_inner_subqueryscan push down
    >> clauses, as well as add a path and subplan, could a generalized function be
    >> made to support both, to reduce duplicate code?
    >
    > No touch as answered before.
    >
    > Although I still need to think about suitable regression test case,
    > the patch itself can be reviewed again. You may want to try some
    > additional tests as you imagine after finding my test case gets
    > quicker.
    >
    > Thanks,
    >
    > --
    > Hitoshi Harada
    >
    
    
    
    -- 
    Hitoshi Harada