Thread
-
Re: Function scan FDW pushdown
Alexander Korotkov <aekorotkov@gmail.com> — 2026-05-11T12:02:24Z
On Mon, May 11, 2026 at 9:01 AM solaimurugan vellaipandiyan < drsolaimurugan.v@gmail.com> wrote: > Thanks for the detailed explanation and the example queries. That > helped me better understand the costing behavior behind the pushdown > decisions. > > I tested the new v4 patch on current master using a postgres_fdw > loopback setup with local table t1 and foreign table ft1. > > Here’s what I observed: > - Before ANALYZE, the planner chose a local Hash Join with separate > Foreign Scan and Function Scan nodes. > - After running ANALYZE on the empty table, it still preferred local > execution, which makes sense based on the costing explanation. > - I then inserted 1000 rows into t1, ran ANALYZE again, and repeated the test. > > Even after that, I still got a local Hash Join plan like: > Hash Join > - Foreign Scan on ft1 > - Function Scan on generate_series > I wasn’t able to observe the pushed-down Foreign Scan plan shown in > the example from the thread. > > The patch itself applied and built successfully on my side, so this > may just be due to planner cost differences or environment-specific > behavior on current master. This also comes from the cost model. Check this example. # CREATE TABLE t1 (id int); # INSERT INTO t1 SELECT g FROM generate_series(1, 1000) g; # CREATE FOREIGN TABLE ft1 (id int) SERVER loopback OPTIONS (table_name 't1'); # ANALYZE t1; # ANALYZE ft1; By default the local join is selected. # EXPLAIN (VERBOSE, COSTS ON) SELECT * FROM ft1, generate_series(1, 100) AS g(id) WHERE ft1.id = g.id; QUERY PLAN ----------------------------------------------------------------------------------------------- Hash Join (cost=102.25..332.00 rows=100 width=8) Output: ft1.id, g.id Hash Cond: (ft1.id = g.id) -> Foreign Scan on public.ft1 (cost=100.00..325.00 rows=1000 width=4) Output: ft1.id Remote SQL: SELECT id FROM public.t1 -> Hash (cost=1.00..1.00 rows=100 width=4) Output: g.id -> Function Scan on pg_catalog.generate_series g (cost=0.00..1.00 rows=100 width=4) Output: g.id Function Call: generate_series(1, 100) (11 rows) However, we can force remote join using enable_* options. You can see it has higher cost. This is because estimate_path_cost_size() expects join operator to be applied to the whole cross-product. # SET enable_hashjoin = off; # SET enable_mergejoin = off; # SET enable_nestloop = off; # EXPLAIN (VERBOSE, COSTS ON) SELECT * FROM ft1, generate_series(1, 100) AS g(id) WHERE ft1.id = g.id; QUERY PLAN ----------------------------------------------------------------------------------------------- Hash Join (cost=102.25..332.00 rows=100 width=8) Output: ft1.id, g.id Hash Cond: (ft1.id = g.id) -> Foreign Scan on public.ft1 (cost=100.00..325.00 rows=1000 width=4) Output: ft1.id Remote SQL: SELECT id FROM public.t1 -> Hash (cost=1.00..1.00 rows=100 width=4) Output: g.id -> Function Scan on pg_catalog.generate_series g (cost=0.00..1.00 rows=100 width=4) Output: g.id Function Call: generate_series(1, 100) (11 rows) # RESET enable_hashjoin; # RESET enable_mergejoin; # RESET enable_nestloop; Also, this can be fixed using remote estimate. You also can check that v3 regression tests by Pyhalov use this approach as well. # ALTER FOREIGN TABLE ft1 OPTIONS (ADD use_remote_estimate 'true'); # EXPLAIN (VERBOSE, COSTS ON) SELECT * FROM ft1, generate_series(1, 100) AS g(id) WHERE ft1.id = g.id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Foreign Scan (cost=102.25..143.00 rows=100 width=8) Output: ft1.id, g.id Relations: (public.ft1) INNER JOIN (Function g) Remote SQL: SELECT r1.id, f2.c1 FROM (public.t1 r1 INNER JOIN generate_series(1, 100) f2(c1) ON (((r1.id = f2.c1)))) (4 rows) Thus, I don't see it to be a problem of this specific patch. I think this is general inaccuracy of postgres_fdw cost model. ------ Regards, Alexander Korotkov Supabase