0001-Allow-a-CustomScan-to-receive-a-pushed-down-hashjoin.patch.text

text/plain

Part: 0
Message: Re: hashjoins vs. Bloom filters (yet again)
From ff734511d22bcb93f5c1256fd745a9d21818f7f1 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <andrew@dunslane.net>
Date: Sun, 31 May 2026 07:13:48 -0400
Subject: [PATCH addon 1/3] Allow a CustomScan to receive a pushed-down
 hashjoin bloom filter

Extend the hashjoin bloom-filter pushdown so that a base-relation
CustomScan can be a recipient, gated on a new opt-in path flag
CUSTOMPATH_SUPPORT_BLOOM_FILTERS.  This lets a table AM implemented as a
custom scan provider consume the filter and apply it inside its own
scan loop -- for a column store, at row-group or dictionary granularity,
before decompression -- rather than only rejecting an already-produced
tuple.

find_bloom_filter_recipient() now treats a base-rel CustomScan
(scanrelid > 0) that advertised the flag the same as a SeqScan.  The
probe is not wired into ExecScanExtended() (a CustomScan dispatches to
the provider's ExecCustomScan), so the provider calls ExecBloomFilters()
itself; ExecInitCustomScan() compiles the probe state up front via
ExecInitBloomFilters() so the provider need not touch bloom internals.
set_customscan_references() fixes the pushed key expressions for a
base-relation scan just like the scan qual.

Heap tables, and providers that do not set the flag, are unaffected.
---
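Reviewer note (not part of the commit message): the commit message says
a column-store provider can apply the filter at row-group or dictionary
granularity before decompression.  The following is a minimal,
self-contained sketch of that idea only.  Every Toy*/toy_* name is
hypothetical, the 64-bit filter is a stand-in for whatever the hash join
actually builds, and nothing here uses the real ExecBloomFilters()
interface, whose signature this patch does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit bloom filter: a stand-in for the filter the hash join builds. */
typedef struct ToyBloom
{
	uint64_t	bits;
} ToyBloom;

/* Two cheap hashes, each mapping a key to a bit position in 0..63. */
static unsigned
toy_hash1(uint32_t key)
{
	return (uint32_t) (key * 2654435761u) >> 26;
}

static unsigned
toy_hash2(uint32_t key)
{
	return (uint32_t) ((key ^ 0x9e3779b9u) * 40503u) >> 26;
}

/* Build side: record a join key in the filter. */
static void
toy_bloom_add(ToyBloom *bf, uint32_t key)
{
	assert(bf != NULL);
	bf->bits |= UINT64_C(1) << toy_hash1(key);
	bf->bits |= UINT64_C(1) << toy_hash2(key);
}

/* Probe side: false means "definitely absent", true means "maybe present". */
static bool
toy_bloom_maybe_contains(const ToyBloom *bf, uint32_t key)
{
	uint64_t	mask = (UINT64_C(1) << toy_hash1(key)) |
		(UINT64_C(1) << toy_hash2(key));

	return (bf->bits & mask) == mask;
}

/* Hypothetical row group: the distinct join-key values in its dictionary. */
typedef struct ToyRowGroup
{
	const uint32_t *dict;
	size_t		ndict;
} ToyRowGroup;

/*
 * A row group can be skipped without decompressing it when no dictionary
 * entry passes the filter.  False positives only cost extra work; a key
 * that was added to the filter is never rejected.
 */
static bool
toy_rowgroup_can_skip(const ToyBloom *bf, const ToyRowGroup *rg)
{
	for (size_t i = 0; i < rg->ndict; i++)
	{
		if (toy_bloom_maybe_contains(bf, rg->dict[i]))
			return false;
	}
	return true;
}

/* Count the row groups that still have to be decompressed and scanned. */
static size_t
toy_count_groups_to_scan(const ToyBloom *bf, const ToyRowGroup *groups,
						 size_t ngroups)
{
	size_t		n = 0;

	for (size_t i = 0; i < ngroups; i++)
	{
		if (!toy_rowgroup_can_skip(bf, &groups[i]))
			n++;
	}
	return n;
}
```

In a real provider the per-entry test would presumably go through
ExecBloomFilters() from the ExecCustomScan callback rather than
toy_bloom_maybe_contains(); the point of the sketch is only that the
probe runs once per dictionary entry instead of once per output tuple.
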
 src/backend/executor/nodeCustom.c       | 10 ++++++++++
 src/backend/optimizer/plan/createplan.c | 19 +++++++++++++++++++
 src/backend/optimizer/plan/setrefs.c    | 10 ++++++++++
 src/include/nodes/extensible.h          |  2 ++
 4 files changed, 41 insertions(+)

diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index b7cc890cd20..dfd87e49737 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -101,6 +101,16 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
 	css->ss.ps.qual =
 		ExecInitQual(cscan->scan.plan.qual, (PlanState *) css);
 
+	/*
+	 * Set up any bloom filters a hash join pushed down to this scan (see
+	 * nodeHashjoin.c).  This compiles the probe expressions against the scan
+	 * tuple slot; the provider is responsible for actually probing them with
+	 * ExecBloomFilters() from its ExecCustomScan callback, at whatever
+	 * granularity it supports.  A no-op unless the provider advertised
+	 * CUSTOMPATH_SUPPORT_BLOOM_FILTERS and the planner found a filter to push.
+	 */
+	ExecInitBloomFilters((PlanState *) css, css->ss.ss_ScanTupleSlot);
+
 	/*
 	 * The callback of custom-scan provider applies the final initialization
 	 * of the custom-scan-state node according to its logic.
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 7ecb551aae6..304ce0e3c0d 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4799,6 +4799,25 @@ find_bloom_filter_recipient(Plan *plan, Index target_relid)
 					return plan;
 				return NULL;
 			}
+		case T_CustomScan:
+			{
+				/*
+				 * A CustomScan on a base relation can act as a recipient, but
+				 * only if the provider advertised that it knows how to consume
+				 * a pushed-down bloom filter.  Unlike the stock scans, the
+				 * probe is not performed by ExecScanExtended() (a CustomScan
+				 * dispatches to the provider's own ExecCustomScan); the
+				 * provider is responsible for calling ExecBloomFilters() at
+				 * whatever granularity it likes.  Non-leaf custom nodes have
+				 * scanrelid == 0 and so are rejected by the relid test.
+				 */
+				CustomScan *cscan = (CustomScan *) plan;
+
+				if ((cscan->flags & CUSTOMPATH_SUPPORT_BLOOM_FILTERS) &&
+					cscan->scan.scanrelid == target_relid)
+					return plan;
+				return NULL;
+			}
 		case T_Sort:
 		case T_IncrementalSort:
 		case T_Material:
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 0059acfccbe..74c7a5bf3a5 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1826,6 +1826,16 @@ set_customscan_references(PlannerInfo *root,
 		cscan->custom_exprs =
 			fix_scan_list(root, cscan->custom_exprs,
 						  rtoffset, NUM_EXEC_QUAL((Plan *) cscan));
+
+		/*
+		 * Bloom filters pushed down to a base-relation CustomScan: the key
+		 * expressions are plain Vars of the scanned relation, so they are
+		 * fixed up the same way as the scan qual.  (A CustomScan emitting a
+		 * custom_scan_tlist takes the branch above and would instead need
+		 * fix_upper_expr against the tlist index, like IndexOnlyScan; no
+		 * in-tree provider needs that yet.)
+		 */
+		fix_scan_bloom_filters(root, (Plan *) cscan, rtoffset);
 	}
 
 	/* Adjust child plan-nodes recursively, if needed */
diff --git a/src/include/nodes/extensible.h b/src/include/nodes/extensible.h
index 517db95c4a3..ea2cef4fe3b 100644
--- a/src/include/nodes/extensible.h
+++ b/src/include/nodes/extensible.h
@@ -84,6 +84,8 @@ extern const ExtensibleNodeMethods *GetExtensibleNodeMethods(const char *extnode
 #define CUSTOMPATH_SUPPORT_BACKWARD_SCAN	0x0001
 #define CUSTOMPATH_SUPPORT_MARK_RESTORE		0x0002
 #define CUSTOMPATH_SUPPORT_PROJECTION		0x0004
+/* provider can accept a hashjoin bloom filter pushed down to its scan */
+#define CUSTOMPATH_SUPPORT_BLOOM_FILTERS	0x0008
 
 /*
  * Custom path methods.  Mostly, we just need to know how to convert a
-- 
2.43.0
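
Reviewer note: the recipient test added to find_bloom_filter_recipient()
reduces to a flag check plus a relid comparison.  The sketch below
models just that test outside the server, as an illustration and not as
PostgreSQL code.  ToyCustomScan and toy_custom_scan_is_recipient() are
hypothetical stand-ins; the flag values are copied from the extensible.h
hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as they appear in the extensible.h hunk of this patch. */
#define CUSTOMPATH_SUPPORT_BACKWARD_SCAN	0x0001
#define CUSTOMPATH_SUPPORT_MARK_RESTORE		0x0002
#define CUSTOMPATH_SUPPORT_PROJECTION		0x0004
#define CUSTOMPATH_SUPPORT_BLOOM_FILTERS	0x0008

/* Simplified stand-in for the two CustomScan fields the test reads. */
typedef struct ToyCustomScan
{
	uint32_t	flags;			/* CUSTOMPATH_* bits set by the provider */
	unsigned	scanrelid;		/* 0 for a non-leaf custom node */
} ToyCustomScan;

/*
 * Mirrors the condition in the T_CustomScan case: the provider must have
 * opted in, and the node must scan exactly the target base relation.  A
 * non-leaf node has scanrelid == 0, which never equals a valid relid.
 */
static bool
toy_custom_scan_is_recipient(const ToyCustomScan *cscan,
							 unsigned target_relid)
{
	return (cscan->flags & CUSTOMPATH_SUPPORT_BLOOM_FILTERS) != 0 &&
		cscan->scanrelid == target_relid;
}
```

Because 0x0008 does not overlap the three existing bits, a provider that
sets only the older flags keeps failing this test, which matches the
commit message's statement that such providers are unaffected.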