Thread

  1. Re: explain analyze rows=%.0f

    Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> — 2025-04-09T19:40:42Z

    On 31.03.2025 23:59, Ilia Evdokimov wrote:
    >
    > We definitely shouldn’t remove the row counts < 1 check, since there 
    > are many places in the planner where we divide by rows. This mechanism 
    > was added specifically to prevent division by zero. Also, allowing 
    > rows estimates below 1 can sometimes make the planner overly 
    > optimistic, leading it to prefer cheaper-looking plans that may not 
    > perform well in practice. For example, choosing a Nested Loop instead 
    > of a more appropriate Hash Join.
    >
    > Allowing fractional rows > 1 might help improve planner accuracy in 
    > some cases, but this needs further study to fully understand the impact.
    >
    
    I've been investigating whether it's worth removing rounding in row 
    estimates - and I believe it is.
    
    [ v1-0001-Always-use-two-fractional-digits-for-estimated-rows_SRC.patch ]
    
    Currently, we round most row estimates using rint() inside 
    clamp_row_est(). However, this function is also used for rounding tuples 
    and page counts. These should remain integral, but row estimates can and 
    should remain fractional for better precision. To address this, I 
    introduced a new function clamp_tuple_est() which retains the existing 
    rounding behavior (via rint()), while clamp_tuple_est() no longer 
    rounds. I use clamp_tuple_est() only for row estimates and 
    clamp_tuple_est() for tuples and pages elsewhere.
    
    After removing rounding, two small issues needed fixing. First, there 
    was a check rows > 1 in the cost estimation for Nested Loop joins, which 
    is no longer reliable for values like 1.3. I updated it to rows >= 2 to 
    retain the original behavior. This can be refined further, but, in my 
    opinion, it's a practical compromise. Second, there is still a call to 
    rint() in cost of mergejoin which likely should be removed too - though 
    I haven’t included that here yet.
    
    Also, if we're no longer rounding estimates, EXPLAIN should display them 
    with two decimal digits, just like it already does for actual rows.
    
    [ v1-0002-Always-use-two-fractional-digits-for-estimated-rows_TESTS.patch ]
    
    So, what does this change improve? Here are some of the observed plan 
    improvements:
    
      * Previously, a Parallel Aggregate was chosen. With slightly more
        accurate estimation, the planner switches to FinalizeAggregate,
        which can be more efficient in distributed plans.
      * In certain nested joins with constant subqueries, the planner
        previously inserted an unnecessary Materialize. With improved
        estimates, it now skips that step, reducing memory usage and latency.
      * When the estimated number of iterations becomes non-integer but
        still justifies caching, the planner adds Memoize instead of
        re-running a function like generate_series(). This can speed up
        execution significantly.
      * In one case involving partitioned tables and filter conditions like
        t1.b = 0, the planner now chooses an index-based nested loop join
        instead of a hash join. This results in a more efficient plan with
        fewer memory and CPU costs.
    
    I know this patch still needs documentation updates to describe the new 
    estimation display and behaviors. But before that, I’d like to gather 
    feedback: does community agree that more precise estimates and 
    fractional values are better than always rounding?
    
    If anyone would like to see the EXPLAIN ANALYZE VERBOSE output of 
    changes in regression tests, I’d be happy to share them.
    
    All feedback and suggestions welcome!
    
    --
    Best regards,
    Ilia Evdokimov,
    Tantor Labs LLC.