v0-0001-Push-pulled-up-SEMI-ANTI-joins-next-to-their-refe.patch

text/plain

Filename: v0-0001-Push-pulled-up-SEMI-ANTI-joins-next-to-their-refe.patch
Type: text/plain
Part: 0
Message: Re: Subquery pull-up increases jointree search space
From 402ded29a96861bc03bae06a3c2b5db832eea3db Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Wed, 6 May 2026 11:42:58 +0000
Subject: [PATCH v0] Push pulled-up SEMI/ANTI joins next to their referenced
 rels

When pull_up_sublinks converts a WHERE-level EXISTS/IN sublink into a
SEMI/ANTI JoinExpr, today's code puts the new join in-place.  Once
join_collapse_limit prevents
deconstruct_jointree from flattening that subtree, the SEMI/ANTI is
stranded above many siblings, forcing the existence-test rel to be
joined last regardless of selectivity.

Splice the pulled-up join at the deepest slot whose subtree already
contains every relid referenced by the new join's outer-side quals.
The descent only enters non-nullable sides of enclosing outer joins,
matching the available_rels invariant that pull_up_sublinks already
relies on, so it never pushes a SEMI filter into the nullable side
of a LEFT/RIGHT/FULL join.

Adjusted regression expected outputs and added a focused subselect.sql
case exercising the proposal's archetype (T1..Tn + EXISTS(Tx
referencing T1) under a low join_collapse_limit), plus a negative
case verifying we do not push past a LEFT JOIN.
---
 .../postgres_fdw/expected/postgres_fdw.out    |   2 +-
 doc/src/sgml/config.sgml                      |  33 ++
 src/backend/optimizer/README                  |  48 +++
 src/backend/optimizer/prep/prepjointree.c     | 353 +++++++++++++++++-
 src/backend/utils/misc/guc_parameters.dat     |   7 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/optimizer/planmain.h              |   1 +
 src/test/regress/expected/partition_join.out  |  36 +-
 src/test/regress/expected/subselect.out       | 135 +++++++
 src/test/regress/sql/subselect.sql            |  76 ++++
 10 files changed, 653 insertions(+), 39 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index aaffcf31271..183aabd5b99 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4814,7 +4814,7 @@ SELECT ft2.*, ft4.* FROM ft2 INNER JOIN ft4 ON ft2.c2 = ft4.c1
  Foreign Scan
    Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8, ft4.c1, ft4.c2, ft4.c3
    Relations: ((public.ft2) INNER JOIN (public.ft4)) SEMI JOIN (public.ft5)
-   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 900)))) WHERE EXISTS (SELECT NULL FROM "S 1"."T 4" r4 WHERE ((r1.c2 = r4.c1))) ORDER BY r1."C 1" ASC NULLS LAST
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON (((r1.c2 = r2.c1)) AND ((r1."C 1" > 900)))) WHERE EXISTS (SELECT NULL FROM "S 1"."T 4" r4 WHERE ((r2.c1 = r4.c1))) ORDER BY r1."C 1" ASC NULLS LAST
 (4 rows)
 
 SELECT ft2.*, ft4.* FROM ft2 INNER JOIN ft4 ON ft2.c2 = ft4.c1
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 73cc0412330..a34e078c7ca 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6124,6 +6124,29 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-pulled-up-semi-relocation" xreflabel="enable_pulled_up_semi_relocation">
+      <term><varname>enable_pulled_up_semi_relocation</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_pulled_up_semi_relocation</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        When the planner pulls up an <literal>EXISTS</literal> or
+        <literal>IN</literal> sublink in <literal>WHERE</literal> into a
+        SEMI- or ANTI-join, this setting controls whether the new join is
+        spliced into the join tree adjacent to the relation it references
+        (<literal>on</literal>, the default) or wrapped around the entire
+        scope at the original placement (<literal>off</literal>).  The
+        deeper placement helps when <xref linkend="guc-join-collapse-limit"/>
+        prevents the surrounding join list from being flattened, but adds a
+        small amount of planning work proportional to the join tree size.
+        Disabling it provides a runtime fallback in the unlikely event that
+        the deeper placement causes a plan regression on a particular query.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-self-join-elimination" xreflabel="enable_self_join_elimination">
       <term><varname>enable_self_join_elimination</varname> (<type>boolean</type>)
       <indexterm>
@@ -6855,6 +6878,16 @@ SELECT * FROM parent WHERE key = 2400;
         may trigger use of the GEQO planner, resulting in non-optimal
         plans.  See <xref linkend="runtime-config-query-geqo"/>.
        </para>
+
+       <para>
+        This limit governs only joins that the user wrote explicitly;
+        SEMI- and ANTI-joins synthesised from pulled-up <literal>EXISTS</literal>
+        or <literal>IN</literal> sublinks in <literal>WHERE</literal> are
+        placed at the deepest legal point in the join tree whose subtree
+        already covers every relation the new join's qualifications
+        reference.  This behaviour can be disabled with
+        <xref linkend="guc-enable-pulled-up-semi-relocation"/>.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 6c35baceedb..aca2cd70699 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -298,6 +298,54 @@ search has runtime exponential in the number of FROM-items considered.
 Therefore, we don't merge FROM-lists if the result would have too many
 FROM-items in one list.
 
+Pulling Up Sublinks
+-------------------
+
+A separate pass, pull_up_sublinks() in prepjointree.c, runs before
+pull_up_subqueries() and converts WHERE-level EXISTS / IN (ANY) sublinks
+into SEMI/ANTI JoinExprs whenever the surrounding context permits.  When
+it does so, the new JoinExpr is inserted into the jointree at the deepest
+jtlink beneath the current one whose subtree already contains every
+relation the new join's outer-side qualifications reference -- both the
+testexpr / lifted EXISTS qual and any LATERAL references hidden inside
+the rarg subquery.  Inserting "next to the referenced rel" lets the SEMI
+survive join_collapse_limit flattening as a tightly-coupled pair with
+the rel it references, so the planner can consider join orders like
+(T1 SEMI Tx) JOIN T2 JOIN ... for queries shaped like
+
+    SELECT * FROM T1 JOIN T2 ON ... JOIN ... JOIN Tn ON ...
+    WHERE EXISTS (SELECT 1 FROM Tx WHERE clause(Tx, T1));
+
+instead of always inserting the SEMI at the outermost legal point and
+forcing Tx to be joined last whenever the join list exceeds
+join_collapse_limit.
+
+The descent only enters non-nullable sides of enclosing outer joins
+(LEFT.larg, RIGHT.rarg, both sides of INNER, larg-only of any
+already-inserted SEMI/ANTI in the path), mirroring the available_rels
+discipline pull_up_sublinks already maintains for the testexpr.  Every
+rel in the chosen subtree is therefore non-nullable in the surrounding
+qual's vantage point, so pre-filtering with the SEMI cannot drop outer
+rows that the original WHERE clause would have preserved.  FULL JOINs
+and the rarg of any LEFT/RIGHT JOIN are never descended into.
+
+Cost: the descent calls get_relids_in_jointree() at each level it inspects,
+which is O(N) per call where N is the size of the visited subtree.  In the
+worst case (a left-deep join tree of N rels), this sums to O(N^2) bitmapset
+operations per pulled-up sublink.  For typical queries (N <= 30) the cost
+is well under a hundred microseconds and dominated by the surrounding
+planner work; for very large generated queries it can become measurable.
+The runtime kill switch enable_pulled_up_semi_relocation falls back to the
+historical "wrap at the top" placement and bypasses the descent entirely.
+
+Extensions that walk parse->jointree between pull_up_sublinks and
+deconstruct_jointree -- notably planner-shaping extensions like
+pg_hint_plan and AQO that observe or override join order from the
+parse-tree shape -- will see this deeper insertion topology rather than
+the historical "always at the top" placement.  Extensions that cache
+join-order decisions across query rewrites should treat the topology as
+freshly computed per query, not stable across pull-up passes.
+
 
 Vars and PlaceHolderVars
 ------------------------
diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c
index 4424fdbe906..a9761d7e148 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -37,6 +37,7 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planmain.h"
 #include "optimizer/prep.h"
 #include "optimizer/subselect.h"
 #include "optimizer/tlist.h"
@@ -105,6 +106,9 @@ typedef struct reduce_outer_joins_partial_state
 	Relids		unreduced_side; /* relids in its still-nullable side */
 } reduce_outer_joins_partial_state;
 
+/* GUC: relocate pulled-up SEMI/ANTI joins next to their referenced rels */
+bool		enable_pulled_up_semi_relocation = true;
+
 static Query *expand_virtual_generated_columns(PlannerInfo *root, Query *parse,
 											   RangeTblEntry *rte, int rt_index,
 											   Relation relation);
@@ -113,6 +117,10 @@ static Node *pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 										   Node **jtlink1, Relids available_rels1,
 										   Node **jtlink2, Relids available_rels2);
+static Relids insert_pulled_up_sublink_join(PlannerInfo *root, JoinExpr *j,
+										   Node **jtlink, Relids available_rels);
+static Relids collect_sublink_rarg_lateral_rels(PlannerInfo *root, Node *rarg);
+static Node **find_sublink_jtlink(Node **slot, Relids target_rels);
 static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
 										JoinExpr *lowest_outer_join,
 										AppendRelInfo *containing_appendrel);
@@ -835,6 +843,303 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
 	return jtnode;
 }
 
+/*
+ * Insert a SEMI/ANTI JoinExpr produced by convert_*_sublink_to_join into
+ * the jointree, picking the deepest legal jtlink under *jtlink whose
+ * subtree already contains every relation the new join's qualifications
+ * reference on the outer side -- including any LATERAL references hidden
+ * inside the pulled-up rarg subquery.  Falls back to inserting at *jtlink
+ * itself when no deeper jtlink is safe; that matches the behaviour
+ * predating this routine and is therefore always correct.
+ *
+ * The descent only enters non-nullable sides of enclosing outer joins,
+ * mirroring the "available_rels" discipline used elsewhere in
+ * pull_up_sublinks: every rel in the chosen subtree is non-nullable in the
+ * surrounding qual's scope, so pre-filtering with the SEMI cannot drop
+ * outer rows that the original WHERE clause would have preserved.
+ *
+ * Returns the relids of the (post-insertion) subtree now occupying j->larg.
+ * Callers that recursively process j->quals must use that set in place of
+ * the caller's wider available_rels, since after a deeper insertion j->larg
+ * may cover only a strict subset of the original scope.  The four NOT-
+ * branch call sites discard this return value: they recurse only on
+ * j->rarg, never on j->larg, so a narrower scope cannot affect them.
+ */
+static Relids
+insert_pulled_up_sublink_join(PlannerInfo *root, JoinExpr *j,
+							 Node **jtlink, Relids available_rels)
+{
+	Relids		quals_varnos;
+	Relids		lateral_rels;
+	Relids		target_rels;
+	Node	  **target_jtlink;
+
+	/*
+	 * Operational kill switch.  When disabled, fall back to the historical
+	 * "splice in place" behaviour: the SEMI/ANTI is wrapped around the entire
+	 * subtree at *jtlink, which matches what pull_up_sublinks did before this
+	 * routine existed.  Useful for diagnosing planning-time regressions or
+	 * plan-shape changes attributable to the deeper insertion topology.
+	 *
+	 * Return available_rels rather than recomputing get_relids_in_jointree on
+	 * j->larg.  Since j->larg is now the entire subtree the caller knew as
+	 * *jtlink, its relids match available_rels by construction -- and the
+	 * historical code path passed available_rels unchanged into the recursive
+	 * qual scan, so reusing it here is strictly equivalent.  Skipping the walk
+	 * also avoids a redundant O(N) bitmapset allocation per pulled-up sublink
+	 * on the kill-switch path.
+	 */
+	if (!enable_pulled_up_semi_relocation)
+	{
+		j->larg = *jtlink;
+		*jtlink = (Node *) j;
+		return available_rels;
+	}
+
+	/*
+	 * Outer-side relids the new SEMI/ANTI references.  We must consider both
+	 * j->quals (the explicit testexpr / lifted EXISTS qual) and any LATERAL
+	 * references buried inside the rarg subquery; both anchor the SEMI to
+	 * outer rels and the chosen insertion jtlink must cover every one of them.
+	 * Intersecting with available_rels drops the rarg's freshly-added RTE
+	 * and any varnos from outer query levels.
+	 *
+	 * lateral_rels is NULL in the overwhelmingly common case (no LATERAL ref
+	 * in the rarg).  Splitting the two cases lets us call bms_intersect
+	 * directly there instead of allocating a transient bms_union result.
+	 */
+	quals_varnos = pull_varnos(root, j->quals);
+	lateral_rels = collect_sublink_rarg_lateral_rels(root, j->rarg);
+	if (lateral_rels == NULL)
+		target_rels = bms_intersect(quals_varnos, available_rels);
+	else
+	{
+		target_rels = bms_union(quals_varnos, lateral_rels);
+		target_rels = bms_int_members(target_rels, available_rels);
+	}
+
+	/*
+	 * After the intersection, target_rels is a subset of available_rels by
+	 * construction.  This invariant is load-bearing for the descent: the
+	 * find_sublink_jtlink logic only descends through non-nullable
+	 * sides of enclosing outer joins, but it relies on the caller having
+	 * already proven that every member of target_rels is itself
+	 * non-nullable in the surrounding scope -- which is exactly what
+	 * available_rels guarantees.
+	 */
+	Assert(bms_is_subset(target_rels, available_rels));
+
+	/*
+	 * target_rels cannot be empty here.  convert_ANY_sublink_to_join and
+	 * convert_EXISTS_sublink_to_join both require their respective outer-rel
+	 * reference set (upper_varnos / level-1 vars of the lifted WHERE) to be
+	 * non-empty and a subset of available_rels before they will pull up a
+	 * sublink, so by construction quals_varnos has at least one member in
+	 * common with available_rels.  Assert that here -- if a future change
+	 * loosens those gates, the descent below would silently degrade to the
+	 * top-of-scope placement and we want to know about it.
+	 */
+	Assert(!bms_is_empty(target_rels));
+
+	target_jtlink = find_sublink_jtlink(jtlink, target_rels);
+
+	j->larg = *target_jtlink;
+	*target_jtlink = (Node *) j;
+
+	/*
+	 * Post-condition: the chosen subtree, now occupying j->larg, must cover
+	 * every relid the SEMI/ANTI's quals reference within the surrounding
+	 * scope.  A descent that returned the wrong slot would silently produce
+	 * a join whose qual contains free vars; catch that here under cassert.
+	 */
+#ifdef USE_ASSERT_CHECKING
+	{
+		Relids		larg_check = get_relids_in_jointree(j->larg, true, false);
+
+		Assert(bms_is_subset(target_rels, larg_check));
+		bms_free(larg_check);
+	}
+#endif
+
+	bms_free(quals_varnos);
+	bms_free(lateral_rels);
+	bms_free(target_rels);
+
+	return get_relids_in_jointree(j->larg, true, false);
+}
+
+/*
+ * Find the outer-query relids that a freshly-pulled-up sublink's rarg
+ * references via LATERAL.
+ *
+ * Only ANY/IN pull-up can leave outer-query references inside the rarg:
+ * convert_ANY_sublink_to_join wraps the subselect in a subquery RTE that
+ * inherits any level-1 Vars referencing outer rels and is marked lateral.
+ * EXISTS pull-up, by contrast, hoists the subselect's whereClause into
+ * j->quals and rewrites its level-1 Vars to level-0; nothing of interest
+ * is left in the rarg, so we early-return NULL for non-RangeTblRef shapes.
+ *
+ * Returns NULL when the rarg has no relevant lateral references.  The
+ * caller bms_free's the result.
+ */
+static Relids
+collect_sublink_rarg_lateral_rels(PlannerInfo *root, Node *rarg)
+{
+	RangeTblRef *rtr;
+	RangeTblEntry *rte;
+
+	if (rarg == NULL || !IsA(rarg, RangeTblRef))
+		return NULL;
+
+	rtr = (RangeTblRef *) rarg;
+	rte = rt_fetch(rtr->rtindex, root->parse->rtable);
+
+	if (rte->rtekind != RTE_SUBQUERY || !rte->lateral)
+		return NULL;
+
+	/*
+	 * NULL root is intentional: pull_up_sublinks runs before
+	 * pull_up_subqueries, so no PlaceHolderVars exist yet and pull_varnos
+	 * machinery has nothing to consult root for.
+	 */
+	return pull_varnos_of_level(NULL, (Node *) rte->subquery, 1);
+}
+
+/*
+ * Walk down from *slot looking for the deepest position whose subtree
+ * relids cover target_rels, descending only through non-nullable sides.
+ *
+ * Returns a pointer to the slot we should overwrite; this is *slot itself
+ * if no deeper slot is safe.
+ */
+static Node **
+find_sublink_jtlink(Node **slot, Relids target_rels)
+{
+	Node	   *n;
+
+	Assert(slot != NULL);
+	check_stack_depth();
+
+	n = *slot;
+
+	if (n == NULL)
+		return slot;
+
+	if (IsA(n, RangeTblRef))
+		return slot;			/* cannot go deeper */
+
+	if (IsA(n, FromExpr))
+	{
+		FromExpr   *f = (FromExpr *) n;
+		ListCell   *lc;
+
+		/*
+		 * If exactly one fromlist child covers target_rels, descend into it.
+		 * Otherwise target straddles siblings -- insert at this FromExpr.
+		 * Sibling subtrees in a fromlist are disjoint, so at most one child
+		 * can cover target_rels; "first match wins" is therefore unique.
+		 */
+		foreach(lc, f->fromlist)
+		{
+			Node	   *child = (Node *) lfirst(lc);
+			Relids		child_rels = get_relids_in_jointree(child, true, false);
+
+			if (bms_is_subset(target_rels, child_rels))
+			{
+				bms_free(child_rels);
+
+				/*
+				 * Hand the recursive call the address of the list cell's
+				 * payload so it can rewrite the slot in place.  Safe because
+				 * find_sublink_jtlink never mutates fromlist below us.
+				 */
+				return find_sublink_jtlink((Node **) &lfirst(lc),
+												target_rels);
+			}
+			bms_free(child_rels);
+		}
+		return slot;
+	}
+
+	if (IsA(n, JoinExpr))
+	{
+		JoinExpr   *j = (JoinExpr *) n;
+		Relids		side_rels;
+
+		switch (j->jointype)
+		{
+			case JOIN_INNER:
+				/* Both sides non-nullable -- try larg, then rarg. */
+				side_rels = get_relids_in_jointree(j->larg, true, false);
+				if (bms_is_subset(target_rels, side_rels))
+				{
+					bms_free(side_rels);
+					return find_sublink_jtlink(&j->larg, target_rels);
+				}
+				bms_free(side_rels);
+
+				side_rels = get_relids_in_jointree(j->rarg, true, false);
+				if (bms_is_subset(target_rels, side_rels))
+				{
+					bms_free(side_rels);
+					return find_sublink_jtlink(&j->rarg, target_rels);
+				}
+				bms_free(side_rels);
+				return slot;	/* spans both sides */
+
+			case JOIN_LEFT:
+				/* Only larg is non-nullable in the surrounding scope. */
+				side_rels = get_relids_in_jointree(j->larg, true, false);
+				if (bms_is_subset(target_rels, side_rels))
+				{
+					bms_free(side_rels);
+					return find_sublink_jtlink(&j->larg, target_rels);
+				}
+				bms_free(side_rels);
+				return slot;
+
+			case JOIN_RIGHT:
+				side_rels = get_relids_in_jointree(j->rarg, true, false);
+				if (bms_is_subset(target_rels, side_rels))
+				{
+					bms_free(side_rels);
+					return find_sublink_jtlink(&j->rarg, target_rels);
+				}
+				bms_free(side_rels);
+				return slot;
+
+			case JOIN_FULL:
+				/* Both sides nullable -- never descend. */
+				return slot;
+
+			case JOIN_SEMI:
+			case JOIN_ANTI:
+				/*
+				 * SEMI/ANTI joins do not normally appear in the user's
+				 * jointree, but a previous iteration of this same recursion
+				 * (e.g. a sibling AND-conjunct sublink already pulled up and
+				 * inserted by insert_pulled_up_sublink_join) may have planted
+				 * one in our path.  We may only descend into the larg side;
+				 * the rarg is the existence-test side and pre-filtering it
+				 * would change semantics.
+				 */
+				side_rels = get_relids_in_jointree(j->larg, true, false);
+				if (bms_is_subset(target_rels, side_rels))
+				{
+					bms_free(side_rels);
+					return find_sublink_jtlink(&j->larg, target_rels);
+				}
+				bms_free(side_rels);
+				return slot;
+
+			default:
+				return slot;
+		}
+	}
+
+	return slot;
+}
+
 /*
  * Recurse through top-level qual nodes for pull_up_sublinks()
  *
@@ -882,9 +1187,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 			if ((j = convert_ANY_sublink_to_join(root, sublink, false,
 												 available_rels1)) != NULL)
 			{
+				Relids		larg_rels;
+
 				/* Yes; insert the new join node into the join tree */
-				j->larg = *jtlink1;
-				*jtlink1 = (Node *) j;
+				larg_rels = insert_pulled_up_sublink_join(root, j, jtlink1,
+														 available_rels1);
 				/* Recursively process pulled-up jointree nodes */
 				j->rarg = pull_up_sublinks_jointree_recurse(root,
 															j->rarg,
@@ -898,7 +1205,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				j->quals = pull_up_sublinks_qual_recurse(root,
 														 j->quals,
 														 &j->larg,
-														 available_rels1,
+														 larg_rels,
 														 &j->rarg,
 														 child_rels);
 				/* Return NULL representing constant TRUE */
@@ -908,9 +1215,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				(j = convert_ANY_sublink_to_join(root, sublink, false,
 												 available_rels2)) != NULL)
 			{
+				Relids		larg_rels;
+
 				/* Yes; insert the new join node into the join tree */
-				j->larg = *jtlink2;
-				*jtlink2 = (Node *) j;
+				larg_rels = insert_pulled_up_sublink_join(root, j, jtlink2,
+														 available_rels2);
 				/* Recursively process pulled-up jointree nodes */
 				j->rarg = pull_up_sublinks_jointree_recurse(root,
 															j->rarg,
@@ -924,7 +1233,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				j->quals = pull_up_sublinks_qual_recurse(root,
 														 j->quals,
 														 &j->larg,
-														 available_rels2,
+														 larg_rels,
 														 &j->rarg,
 														 child_rels);
 				/* Return NULL representing constant TRUE */
@@ -936,9 +1245,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 			if ((j = convert_EXISTS_sublink_to_join(root, sublink, false,
 													available_rels1)) != NULL)
 			{
+				Relids		larg_rels;
+
 				/* Yes; insert the new join node into the join tree */
-				j->larg = *jtlink1;
-				*jtlink1 = (Node *) j;
+				larg_rels = insert_pulled_up_sublink_join(root, j, jtlink1,
+														 available_rels1);
 				/* Recursively process pulled-up jointree nodes */
 				j->rarg = pull_up_sublinks_jointree_recurse(root,
 															j->rarg,
@@ -952,7 +1263,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				j->quals = pull_up_sublinks_qual_recurse(root,
 														 j->quals,
 														 &j->larg,
-														 available_rels1,
+														 larg_rels,
 														 &j->rarg,
 														 child_rels);
 				/* Return NULL representing constant TRUE */
@@ -962,9 +1273,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				(j = convert_EXISTS_sublink_to_join(root, sublink, false,
 													available_rels2)) != NULL)
 			{
+				Relids		larg_rels;
+
 				/* Yes; insert the new join node into the join tree */
-				j->larg = *jtlink2;
-				*jtlink2 = (Node *) j;
+				larg_rels = insert_pulled_up_sublink_join(root, j, jtlink2,
+														 available_rels2);
 				/* Recursively process pulled-up jointree nodes */
 				j->rarg = pull_up_sublinks_jointree_recurse(root,
 															j->rarg,
@@ -978,7 +1291,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 				j->quals = pull_up_sublinks_qual_recurse(root,
 														 j->quals,
 														 &j->larg,
-														 available_rels2,
+														 larg_rels,
 														 &j->rarg,
 														 child_rels);
 				/* Return NULL representing constant TRUE */
@@ -1003,8 +1316,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 													 available_rels1)) != NULL)
 				{
 					/* Yes; insert the new join node into the join tree */
-					j->larg = *jtlink1;
-					*jtlink1 = (Node *) j;
+					(void) insert_pulled_up_sublink_join(root, j, jtlink1,
+														available_rels1);
 					/* Recursively process pulled-up jointree nodes */
 					j->rarg = pull_up_sublinks_jointree_recurse(root,
 																j->rarg,
@@ -1029,8 +1342,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 													 available_rels2)) != NULL)
 				{
 					/* Yes; insert the new join node into the join tree */
-					j->larg = *jtlink2;
-					*jtlink2 = (Node *) j;
+					(void) insert_pulled_up_sublink_join(root, j, jtlink2,
+														available_rels2);
 					/* Recursively process pulled-up jointree nodes */
 					j->rarg = pull_up_sublinks_jointree_recurse(root,
 																j->rarg,
@@ -1057,8 +1370,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														available_rels1)) != NULL)
 				{
 					/* Yes; insert the new join node into the join tree */
-					j->larg = *jtlink1;
-					*jtlink1 = (Node *) j;
+					(void) insert_pulled_up_sublink_join(root, j, jtlink1,
+														available_rels1);
 					/* Recursively process pulled-up jointree nodes */
 					j->rarg = pull_up_sublinks_jointree_recurse(root,
 																j->rarg,
@@ -1083,8 +1396,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
 														available_rels2)) != NULL)
 				{
 					/* Yes; insert the new join node into the join tree */
-					j->larg = *jtlink2;
-					*jtlink2 = (Node *) j;
+					(void) insert_pulled_up_sublink_join(root, j, jtlink2,
+														available_rels2);
 					/* Recursively process pulled-up jointree nodes */
 					j->rarg = pull_up_sublinks_jointree_recurse(root,
 																j->rarg,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index afaa058b046..c72d0ac9f3f 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1013,6 +1013,13 @@
   boot_val => 'true',
 },
 
+{ name => 'enable_pulled_up_semi_relocation', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables relocation of pulled-up SEMI/ANTI joins next to their referenced relations.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_pulled_up_semi_relocation',
+  boot_val => 'true',
+},
+
 { name => 'enable_self_join_elimination', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables removal of unique self-joins.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index ac38cddaaf9..02dcc23bfab 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -446,6 +446,7 @@
 #enable_tidscan = on
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
+#enable_pulled_up_semi_relocation = on
 #enable_self_join_elimination = on
 #enable_eager_aggregate = on
 
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 71c043a25e8..5aa6f307012 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -21,6 +21,7 @@
 #define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
 extern PGDLLIMPORT double cursor_tuple_fraction;
 extern PGDLLIMPORT bool enable_self_join_elimination;
+extern PGDLLIMPORT bool enable_pulled_up_semi_relocation;
 
 /* query_planner callback to compute query_pathkeys */
 typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 38643d41fd7..5ac4ae266d0 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -723,33 +723,33 @@ SELECT * FROM prt1 t1 JOIN prt1 t2 ON t1.a = t2.a WHERE t1.a IN (SELECT a FROM p
                     QUERY PLAN                    
 --------------------------------------------------
  Append
-   ->  Hash Semi Join
-         Hash Cond: (t1_1.a = t3_1.a)
-         ->  Hash Join
-               Hash Cond: (t1_1.a = t2_1.a)
+   ->  Hash Join
+         Hash Cond: (t1_1.a = t2_1.a)
+         ->  Hash Semi Join
+               Hash Cond: (t1_1.a = t3_1.a)
                ->  Seq Scan on prt1_p1 t1_1
                ->  Hash
-                     ->  Seq Scan on prt1_p1 t2_1
+                     ->  Seq Scan on prt1_p1 t3_1
          ->  Hash
-               ->  Seq Scan on prt1_p1 t3_1
-   ->  Hash Semi Join
-         Hash Cond: (t1_2.a = t3_2.a)
-         ->  Hash Join
-               Hash Cond: (t1_2.a = t2_2.a)
+               ->  Seq Scan on prt1_p1 t2_1
+   ->  Hash Join
+         Hash Cond: (t1_2.a = t2_2.a)
+         ->  Hash Semi Join
+               Hash Cond: (t1_2.a = t3_2.a)
                ->  Seq Scan on prt1_p2 t1_2
                ->  Hash
-                     ->  Seq Scan on prt1_p2 t2_2
+                     ->  Seq Scan on prt1_p2 t3_2
          ->  Hash
-               ->  Seq Scan on prt1_p2 t3_2
-   ->  Hash Semi Join
-         Hash Cond: (t1_3.a = t3_3.a)
-         ->  Hash Join
-               Hash Cond: (t1_3.a = t2_3.a)
+               ->  Seq Scan on prt1_p2 t2_2
+   ->  Hash Join
+         Hash Cond: (t1_3.a = t2_3.a)
+         ->  Hash Semi Join
+               Hash Cond: (t1_3.a = t3_3.a)
                ->  Seq Scan on prt1_p3 t1_3
                ->  Hash
-                     ->  Seq Scan on prt1_p3 t2_3
+                     ->  Seq Scan on prt1_p3 t3_3
          ->  Hash
-               ->  Seq Scan on prt1_p3 t3_3
+               ->  Seq Scan on prt1_p3 t2_3
 (28 rows)
 
 --
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index a3778c23c34..68e728054eb 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -3785,3 +3785,138 @@ WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
 (0 rows)
 
 ROLLBACK;
+--
+-- Pulled-up SEMI/ANTI joins should be grafted next to the rel they
+-- reference, so they are not stranded above join_collapse_limit when
+-- many sibling joins exist.
+--
+BEGIN;
+CREATE TEMP TABLE pull_t1 (a int, b int);
+CREATE TEMP TABLE pull_t2 (a int);
+CREATE TEMP TABLE pull_t3 (a int);
+CREATE TEMP TABLE pull_t4 (a int);
+CREATE TEMP TABLE pull_t5 (a int);
+CREATE TEMP TABLE pull_x  (b int);
+INSERT INTO pull_t1 SELECT g, g FROM generate_series(1, 10) g;
+INSERT INTO pull_t2 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t3 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t4 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t5 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_x  SELECT g FROM generate_series(1, 10) g;
+ANALYZE pull_t1, pull_t2, pull_t3, pull_t4, pull_t5, pull_x;
+SET LOCAL join_collapse_limit = 2;
+SET LOCAL max_parallel_workers_per_gather = 0;
+-- Pin a single join method so the test asserts the SEMI/ANTI graft
+-- point and not the planner's hash-vs-merge cost crossover, which has
+-- been observed to flap on 32-bit and non-default-blocksize buildfarm
+-- members.  enable_memoize is pinned for the same reason: a future
+-- planner change that prefers Memoize over Materialize on the inner
+-- side of the SEMI would break the expected output.
+SET LOCAL enable_hashjoin = off;
+SET LOCAL enable_mergejoin = off;
+SET LOCAL enable_memoize = off;
+-- The EXISTS only references pull_t1, so the SEMI must end up adjacent
+-- to pull_t1 even though four other joins separate the WHERE from t1.
+-- The Nested Loop with pull_t1 directly underneath the Semi Join is
+-- the load-bearing assertion here.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Nested Loop
+   Join Filter: (pull_t1.a = pull_t5.a)
+   ->  Nested Loop
+         Join Filter: (pull_t1.a = pull_t4.a)
+         ->  Nested Loop
+               Join Filter: (pull_t1.a = pull_t3.a)
+               ->  Nested Loop
+                     Join Filter: (pull_t1.a = pull_t2.a)
+                     ->  Nested Loop Semi Join
+                           Join Filter: (pull_t1.b = pull_x.b)
+                           ->  Seq Scan on pull_t1
+                           ->  Materialize
+                                 ->  Seq Scan on pull_x
+                     ->  Materialize
+                           ->  Seq Scan on pull_t2
+               ->  Materialize
+                     ->  Seq Scan on pull_t3
+         ->  Materialize
+               ->  Seq Scan on pull_t4
+   ->  Materialize
+         ->  Seq Scan on pull_t5
+(21 rows)
+
+-- The IN's testexpr only references pull_t1, same expectation.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE pull_t1.b IN (SELECT b FROM pull_x);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Nested Loop
+   Join Filter: (pull_t1.a = pull_t5.a)
+   ->  Nested Loop
+         Join Filter: (pull_t1.a = pull_t4.a)
+         ->  Nested Loop
+               Join Filter: (pull_t1.a = pull_t3.a)
+               ->  Nested Loop
+                     Join Filter: (pull_t1.a = pull_t2.a)
+                     ->  Nested Loop Semi Join
+                           Join Filter: (pull_t1.b = pull_x.b)
+                           ->  Seq Scan on pull_t1
+                           ->  Materialize
+                                 ->  Seq Scan on pull_x
+                     ->  Materialize
+                           ->  Seq Scan on pull_t2
+               ->  Materialize
+                     ->  Seq Scan on pull_t3
+         ->  Materialize
+               ->  Seq Scan on pull_t4
+   ->  Materialize
+         ->  Seq Scan on pull_t5
+(21 rows)
+
+-- An EXISTS referencing a rel on the nullable side of a LEFT JOIN
+-- must NOT be pushed past the outer join.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t2
+LEFT JOIN pull_t1 ON pull_t1.a = pull_t2.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+                  QUERY PLAN                  
+----------------------------------------------
+ Nested Loop Semi Join
+   Join Filter: (pull_t1.b = pull_x.b)
+   ->  Nested Loop
+         Join Filter: (pull_t2.a = pull_t1.a)
+         ->  Seq Scan on pull_t2
+         ->  Materialize
+               ->  Seq Scan on pull_t1
+   ->  Materialize
+         ->  Seq Scan on pull_x
+(9 rows)
+
+-- An IN-sublink whose pulled-up rarg carries a LATERAL reference to a
+-- second outer rel must not be grafted into a subtree that excludes
+-- that rel, even though only pull_t1 appears in the testexpr.  The
+-- result set must match the unoptimised plan.
+SELECT count(*)
+FROM pull_t1, pull_t2
+WHERE pull_t1.a = pull_t2.a
+  AND pull_t1.b IN (SELECT b FROM pull_x WHERE pull_x.b = pull_t2.a);
+ count 
+-------
+    10
+(1 row)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 1a02c3f86c0..09e9ed747ca 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -1647,3 +1647,79 @@ SELECT * FROM not_null_tab
 WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
 
 ROLLBACK;
+
+--
+-- Pulled-up SEMI/ANTI joins should be grafted next to the rel they
+-- reference, so they are not stranded above join_collapse_limit when
+-- many sibling joins exist.
+--
+BEGIN;
+
+CREATE TEMP TABLE pull_t1 (a int, b int);
+CREATE TEMP TABLE pull_t2 (a int);
+CREATE TEMP TABLE pull_t3 (a int);
+CREATE TEMP TABLE pull_t4 (a int);
+CREATE TEMP TABLE pull_t5 (a int);
+CREATE TEMP TABLE pull_x  (b int);
+
+INSERT INTO pull_t1 SELECT g, g FROM generate_series(1, 10) g;
+INSERT INTO pull_t2 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t3 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t4 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t5 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_x  SELECT g FROM generate_series(1, 10) g;
+ANALYZE pull_t1, pull_t2, pull_t3, pull_t4, pull_t5, pull_x;
+
+SET LOCAL join_collapse_limit = 2;
+SET LOCAL max_parallel_workers_per_gather = 0;
+-- Pin a single join method so the test asserts the SEMI/ANTI graft
+-- point and not the planner's hash-vs-merge cost crossover, which has
+-- been observed to flap on 32-bit and non-default-blocksize buildfarm
+-- members.  enable_memoize is pinned for the same reason: a future
+-- planner change that prefers Memoize over Materialize on the inner
+-- side of the SEMI would break the expected output.
+SET LOCAL enable_hashjoin = off;
+SET LOCAL enable_mergejoin = off;
+SET LOCAL enable_memoize = off;
+
+-- The EXISTS only references pull_t1, so the SEMI must end up adjacent
+-- to pull_t1 even though four other joins separate the WHERE from t1.
+-- The Nested Loop with pull_t1 directly underneath the Semi Join is
+-- the load-bearing assertion here.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+
+-- The IN's testexpr only references pull_t1, same expectation.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE pull_t1.b IN (SELECT b FROM pull_x);
+
+-- An EXISTS referencing a rel on the nullable side of a LEFT JOIN
+-- must NOT be pushed past the outer join.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t2
+LEFT JOIN pull_t1 ON pull_t1.a = pull_t2.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+
+-- An IN-sublink whose pulled-up rarg carries a LATERAL reference to a
+-- second outer rel must not be grafted into a subtree that excludes
+-- that rel, even though only pull_t1 appears in the testexpr.  The
+-- result set must match the unoptimised plan.
+SELECT count(*)
+FROM pull_t1, pull_t2
+WHERE pull_t1.a = pull_t2.a
+  AND pull_t1.b IN (SELECT b FROM pull_x WHERE pull_x.b = pull_t2.a);
+
+ROLLBACK;
-- 
2.54.0