Re: Use merge-based matching for MCVs in eqjoinsel
Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
From: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2025-09-03T16:53:51Z
Lists: pgsql-hackers
Attachments
Following up on my previous messages about optimizing eqjoinsel() and eqjoinsel_semi() for Var1 = Var2 clauses, I’d like to share detailed profiling results showing the effect of the patch on JOB for different values of default_statistics_target. The first table shows the total planner time (summed over all 113 queries) before and after applying the patch, along with the speedup achieved: default_statistics_target | Planner Speedup (×) | Planner Before (ms) | Planner After (ms) --------------------------+---------------------+---------------------+-------------------- 100 | *1.00x* | 1828.433 | 1820.556 1000 | *1.12x* | 2194.282 | 1963.110 2500 | *2.15x* | 4606.705 | 2140.126 5000 | *6.37x* | 16661.581 | 2616.109 7500 | *11.76x* | 35988.569 | 3061.161 10000 | *19.01x* | 66616.620 | 3504.144 The second table shows the profiling of eqjoinsel() using *perf*, demonstrating that the function, which dominates planning at high statistics targets, becomes essentially negligible after the patch: default_statistics_target | eqjoinsel() Before (perf) | eqjoinsel() After (perf) --------------------------+---------------------------+-------------------------- 100 | 0.01% | 0.04% 1000 | 6.23% | 0.06% 2500 | 35.45% | 0.23% 5000 | 66.14% | 0.53% 7500 | 72.70% | 0.97% 10000 | 75.42% | 1.25% I’ve attached v3 of the patch. This version adds a check for NULL values when comparing MCV entries, ensuring correctness in edge cases. -- Best regards, Ilia Evdokimov, Tantor Labs LLC, https://tantorlabs.com