Re: BUG #19106: Potential regression with CTE materialization planning in Postgres 18
Tom Lane <tgl@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Vik Fearing <vik@postgresfriends.org>
Cc: Kamil Monicz <kamil@monicz.dev>, pgsql-bugs@lists.postgresql.org,
Peter Eisentraut <peter@eisentraut.org>
Date: 2025-11-11T20:27:59Z
Lists: pgsql-bugs
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Don't allow CTEs to determine semantic levels of aggregates.
- b85397481ce3 14.21 landed
- 9f5a58aacf65 15.16 landed
- 35b5c62c3ad9 19 (unreleased) landed
- 1c8c3206f4e0 16.12 landed
- 12bc3291772e 18.2 landed
- 075a763e2d42 17.8 landed
-
Calculate agglevelsup correctly when Aggref contains a CTE.
- b0cc0a71e0a0 19 (unreleased) cited
Vik Fearing <vik@postgresfriends.org> writes: > I had a rummage through the archives but couldn't easily find the paper > introducing aggregates so I can't see what the justification for that > rule was. This language was not in 1989 but is in 1992. It may just be a > case of "this is what we've implemented so this is what we are specifying." Thanks for doing that research. It's not at all surprising if back in the early 90's nobody had tried to make it work for sub-selects (or at least had not succeeded), so they just wrote the spec to not require it. After sleeping on it I feel that my proposal of "force the aggregate to have agglevelsup = 0" is a reasonably sane solution. I'd originally sought some minimal adjustment to the make-CTE-refs-work-like-Vars approach of b0cc0a71e, like moving the aggregate down one level if we detect that the CTE reference is to a sibling CTE. However, that wouldn't be sufficient to deal with nested WITHs, for example WITH a AS ( SELECT id FROM (VALUES (1), (2)) AS v(id) ), b AS ( WITH b1 AS ( SELECT max((SELECT sum(id) FROM a)) AS agg ) SELECT * FROM b1 ) SELECT agg FROM b; "Move down one level" would assign the max() to the "WITH b1 ... SELECT * FROM b1" level, so we still have the bug. "Move down through all levels of WITH" might do the trick, or it might not. In any case it's feeling arbitrary and rather far away from what we do for Vars. So I feel like the analogy to Vars was fatally flawed to start with. However, the real reason why I'm feeling good about the simplistic solution is: if we invent some more-complex rule, whose life are we making better? I think the submitted problem query is an example of a common SQL programming idiom, which is to use a series of WITH CTEs to chop up a complex computation into small independent black boxes. It's the exact opposite of independence if the positioning of a CTE reference in a later CTE affects the semantics of aggregates in that CTE. Now, I fear we can't solve the problem that b0cc0a71e set out to solve without having some impact of that sort, but we should keep it as minimal and surprise-free as possible. Forcing affected aggregates to have their semantic level equal to their syntactic level seems about as surprise-free as we can get. Especially since that would have been the result we produced before b0cc0a71e in all non-contrived cases. The bug report that induced us to do b0cc0a71e was a pretty contrived case if you ask me: why would you combine an outer Var reference with a select from a CTE? And if you did, why would you expect the surrounding aggregate to be considered non-local? regards, tom lane