Thread
-
Re: Use merge-based matching for MCVs in eqjoinsel
Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> — 2025-09-09T07:29:36Z
Hi David, On 08.09.2025 17:36, David Geier wrote: > Do you think anything else needs changes in the patch? From an architectural perspective, I think the patch is already in good shape. However, I have some suggestions regarding code style: 1. I would move McvHashEntry, McvHashContext, he new hash table definition, hash_mcv and are_mcvs_equal to the top. 2. I’m not sure get_hash_func_oid() is needed at all – it seems we could do without it. 3. It would be better to name the parameters matchProductFrequencies -> matchprodfreq, nMatches -> nmatches, hasMatch1 -> hasmatch1, hasMatch2 -> hasmatch2 in eqjoinsel_inner_with_hashtable(). 4. As I wrote earlier, since we now have a dedicated function eqjoinsel_inner_with_hashtable(), perhaps it could be used in both eqjoinsel_inner() and eqjoinsel_semi(). And since the hash-based search was factored out, maybe it would make sense to also factor out the O(N^2) nested loop implementation? 5. I think it would be helpful to add a comment explaining that using a hash table is not efficient when the MCV array length equals 1: if (Min(statsSlot1->nvalues, statsSlot2->nvalues) == 1) return false; > Did you have a > chance to check tables with just few MCVs or are there any queries in > the JOB which regress with very small default_statistics_target? Sure. I need some time to check this. -- Best regards, Ilia Evdokimov, Tantor Labs LLC, https://tantorlabs.com