Thread

  1. Improve Hash/Merge Join estimate accuracy when all predicates are Hash/Merge clauses

    Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> — 2025-07-10T10:09:48Z

    Hi hackers,
    
    The planner currently calls approx_tuple_count() to estimate 
    hashjointuples and mergejointuples. That makes sense when 
    joinrestrictinfo contains additional clauses beyond the hash/merge 
    equality list. But if all join restriction clauses are exactly those 
    hash/merge clauses, the estimate already computed in 
    path->jpath.path.rows is usually more accurate (and free).
    
    This patch reuses path->jpath.path.rows in that case and skips 
    approx_tuple_count().
    
    Regression results
    ==================
    
                          | actual | approx | estimate
    ---------------------+--------+--------+-------
    join.sql q1          |      5 |      1 |     5
    join.sql q2          | 10 000 |      1 | 10 000
    join.sql q3          |      5 |      5 |     5
    join.sql q4          |      5 |      5 |     5
    join.sql q5          |      5 |      1 |     5
    partition_join q1    |    200 |      1 |   200
    partition_join q2    |     42 |     84 |    84
    partition_join q3    |      8 |      1 |     8
    postgres_fdw         |   2001 |   1001 |  2001
    select_parallel q1   |      0 |  5 000 | 5 000
    updatable_views q1   |      2 |      2 |   423
    
    Two cases get worse: select_parallel.sql and updatable_views.sql.
    
    Looking forward to your feedback!
    
    --
    Best regards,
    Ilia Evdokimov,
    Tantor Labs LLC.