Thread
-
Re: Add a greedy join search algorithm to handle large join problems
Pavel Stehule <pavel.stehule@gmail.com> — 2025-12-11T17:33:52Z
On Thu, Dec 11, 2025 at 18:07, Tomas Vondra <tomas@vondra.me> wrote:

> On 12/11/25 07:12, Pavel Stehule wrote:
> > On Thu, Dec 11, 2025 at 3:53, John Naylor <johncnaylorls@gmail.com> wrote:
> > > On Wed, Dec 10, 2025 at 5:20 PM Tomas Vondra <tomas@vondra.me> wrote:
> > > > I did however notice an interesting thing - running EXPLAIN on the 99
> > > > queries (for 3 scales and 0/4 workers, so 6x 99) took this much time:
> > > >
> > > >     master:      8s
> > > >     master/geqo: 20s
> > > >     master/goo:  5s
> > > >
> > > > It's nice that "goo" seems to be faster than "geqo" - assuming the
> > > > plans are comparable or better. But it surprised me switching to geqo
> > > > makes it slower than master. That goes against my intuition that geqo
> > > > is meant to be cheaper/faster join order planning. But maybe I'm
> > > > missing something.
> > >
> > > Yeah, that was surprising. It seems that geqo has a large overhead, so
> > > it takes a larger join problem for the asymptotic behavior to win over
> > > exhaustive search.
> >
> > If I understand the design correctly, geqo should be slower for any
> > query of lower complexity. The question is how many queries in the
> > tested model are really complex.
>
> Depends on what you mean by "really complex". TPC-DS queries are not
> trivial, but the complexity may not be in the number of joins.
>
> Of course, setting geqo_threshold to 2 may be too aggressive. Not sure.

I checked the TPC-H queries and almost all of them are simple: 5x JOIN,
2x nested subselect.

> regards
>
> --
> Tomas Vondra