Thread
-
Re: BUG #19106: Potential regression with CTE materialization planning in Postgres 18
Richard Guo <guofenglinux@gmail.com> — 2025-11-12T07:04:05Z
On Wed, Nov 12, 2025 at 5:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thanks for doing that research. It's not at all surprising if back > in the early 90's nobody had tried to make it work for sub-selects > (or at least had not succeeded), so they just wrote the spec to not > require it. I played with this patch, but I couldn't quite wrap my head around the expected behavior of using subqueries as arguments to aggregate functions. The outputs of the following queries are confusing to me. create table t (a int); insert into t values (1), (2); Query 1: select (select sum((select a from t t1 limit 1))) from t t2; sum ----- 1 1 (2 rows) As I understand it, a query of the form: SELECT <scalar_expression> FROM table; ... produces one output row for each row in the table, with the value of <scalar_expression> evaluated for that row. Thus, the output of Query 1 makes sense to me. Query 2: select (select sum((select a from t t1 where a = t2.a or true limit 1))) from t t2; sum ----- 2 (1 row) I don't quite understand the output of Query 2. The subquery is now correlated with the outer table t2, but I believe it's still in the same form as Query 1, so I would expect it to also produce one output row per table row. Moreover, IIUC, the "or true" clause should make the two queries semantically equivalent. Query 3: with t as (select a from (values (1), (2)) as v(a)) select (select sum((select a from t t1 where a = t2.a or true limit 1))) from t t2; sum ----- 1 1 (2 rows) Query 3 replaces the physical table with a CTE that produces the same logical table content, so I would expect the query's output to remain unchanged. So the differing outputs of Query 2 and Query 3 are also confusing to me. - Richard