Re: Subquery pull-up increases jointree search space
Andrei Lepikhov <lepihov@gmail.com>
From: Andrei Lepikhov <lepihov@gmail.com>
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Richard Guo <guofenglinux@gmail.com>
Date: 2026-05-13T15:11:18Z
Lists: pgsql-hackers
Attachments
On 09/05/2026 12:51, Andrei Lepikhov wrote: > On 09/02/2026 23:34, Andrei Lepikhov wrote: >> On 9/2/26 21:16, Tom Lane wrote: >>> What I'm wondering about is that join_collapse_limit and >>> from_collapse_limit were invented more than two decades ago, but >>> we've not touched their default values since then. Machines are a >>> lot faster since 2004, and we've probably achieved some net speedups >>> in the planner logic as well. Could we alleviate this concern by >>> raising those defaults, and if so, what are reasonable values in 2026? >> >> As I see, people never use the default settings now. The case that triggered >> this topic could work well with a join collapse limit around 40 joins (GEQO >> started at 14). But a specific setting always depends on how much time people >> want to spend on planning. So, I don't think a change of default settings is >> needed. > After looking into more cases, I realized the main issue is actually something else. > Looking at pull_up_sublinks_qual_recurse, I see that we have information to skip > extra work if the join_collapse_limit is high enough. > And here is the patch. The main idea is that after a transformation (EXISTS or ANY), we should not just initialise the larg of the JoinExpr with the incoming join tree. Instead, we need to look for minimal subtree. Since we already have relids referenced in the subselect, we can traverse this subtree to find the smallest safe join tree that contains these relids. Then, we initialise the larg of our SEMI JOIN JoinExpr with this subtree and update the larg of the upper JoinExpr to point to our SemiJoin. Such behaviour is quite close to the not-pulled-up variant when a subselect is used as a filter and pushed down to the minimum level needed to evaluate this SubLink. So, delay the influence of join_collapse_limit as much as possible. The patch provided contains implementation of the idea - the key function is insert_pulled_up_sublink_join. It is a little polished with Claude AI - it added tests to detect any issues during rebase onto master and documentation. I'll CC Richard, since he often challenges the transformation machinery and may be interested in helping to stabilise the user experience after the upgrade. -- regards, Andrei Lepikhov, pgEdge