Re: Reduce "Var IS [NOT] NULL" quals during constant folding
From: Andrei Lepikhov <lepihov@gmail.com>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Robert Haas <robertmhaas@gmail.com>,
Peter Eisentraut <peter@eisentraut.org>, David Rowley
<dgrowleyml@gmail.com>, Tender Wang <tndrwang@gmail.com>,
Pg Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-07-03T09:08:54Z
Lists: pgsql-hackers
Commits
- Fix misuse of Relids for storing attribute numbers
  2d756ebbe857, 19 (unreleased), landed
- Reduce "Var IS [NOT] NULL" quals during constant folding
  e2debb64380e, 19 (unreleased), landed
- Centralize collection of catalog info needed early in the planner
  904f6a593a06, 19 (unreleased), landed
- Expand virtual generated columns before sublink pull-up
  e0d05295268e, 19 (unreleased), landed
- Expand virtual generated columns in the planner
  1e4351af329f, 18.0, cited

On 3/7/2025 02:30, Richard Guo wrote:
> On Wed, Jul 2, 2025 at 6:44 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> I apologise for the confusion in my previous message. I am not
>> suggesting that we postpone this. Instead, I would like an explanation
>> of why you believe that accessing the table statistics earlier could
>> negatively impact planner performance. As I mentioned before, I have
>> only envisioned rare instances where join eliminations may reduce the
>> number of relations and clause evaluations resulting in a constant.
>
> I wonder how you arrived at the conclusion that these cases are rare.
> If they truly are, then why have we invested so much effort in
> optimizing for them?

There is no direct connection between effort and frequency; it primarily
depends on personal desire. As you might find, much of the effort goes
into convincing the community. These specific cases should be rare from
the Postgres perspective: the planner's code remains simple, based on the
assumption that crafting the appropriate query is the user's
responsibility.

> I also wonder why you think we should collect all catalog information
> at the very early stage of the planner, given that most of it is only
> used much later -- after RelOptInfos have been created. If the goal
> is to avoid redundant catalog retrieval for the same relation in
> get_relation_info(), perhaps adding a caching mechanism within that
> function would be a more targeted solution. I don't see a strong
> reason for moving get_relation_info() to the very beginning of the
> planner.

This indicates that there is still room for further exploration and
discussion. For starters, the 'Redundant NullTest' issue is not the only
concern. Additionally, Postgres applies pull-up transformations blindly,
without considering the cost model. However, each pull-up has its corner
cases, and in practice, we often see new complaints arise after a new
pull-up technique is committed.

One possible solution I envision could be to examine indexes and/or make
raw initial estimations to avoid problematic pull-up cases.

-- 
regards, Andrei Lepikhov