Thread

  1. Re: Add a greedy join search algorithm to handle large join problems

    Chengpeng Yan <chengpeng_yan@outlook.com> — 2025-12-02T10:22:51Z

    Hi,
    
    Thanks for taking a look.
    
    > On Dec 2, 2025, at 13:36, Dilip Kumar <dilipbalaut@gmail.com> wrote:
    > 
    > Is pgbench the right workload to test this, I mean what are we trying
    > to compare here the planning time taken by DP vs GEQO vs GOO or the
    > quality of the plans generated by different join ordering algorithms
    > or both?  All pgbench queries are single table scans and there is no
    > involvement of the join search, so I am not sure how we can justify
    > these gains?
    
    Just to clarify: as noted in the cover mail, the numbers are not from
    the default pgbench queries, but from the star-join / snowflake
    workloads in thread [1], using the benchmark included in the v5-0001
    patch. These workloads contain multi-table joins and do trigger the
    join search; you can reproduce them by configuring the GUCs as
    described in the cover mail.
    
    The benchmark tables contain no data, so execution time is negligible;
    the results mainly reflect the planning time of the different
    join-ordering methods, which is intentional for this microbenchmark.
    
    A broader evaluation on TPC-H / TPC-DS / JOB, covering both planning
    time and plan quality, is still on the TODO list. That should give a
    more representative picture of GOO than this synthetic setup.
    
    References:
    [1] Star/snowflake join thread and benchmarks:
    https://www.postgresql.org/message-id/a22ec6e0-92ae-43e7-85c1-587df2a65f51%40vondra.me
    
    --
    Best regards,
    Chengpeng Yan