Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Andrei Lepikhov <lepihov@gmail.com>
From: Andrei Lepikhov <lepihov@gmail.com>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Robert Haas <robertmhaas@gmail.com>,
Peter Eisentraut <peter@eisentraut.org>, David Rowley
<dgrowleyml@gmail.com>, Tender Wang <tndrwang@gmail.com>,
Pg Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-07-02T07:32:43Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Fix misuse of Relids for storing attribute numbers
- 2d756ebbe857 19 (unreleased) landed
-
Reduce "Var IS [NOT] NULL" quals during constant folding
- e2debb64380e 19 (unreleased) landed
-
Centralize collection of catalog info needed early in the planner
- 904f6a593a06 19 (unreleased) landed
-
Expand virtual generated columns before sublink pull-up
- e0d05295268e 19 (unreleased) landed
-
Expand virtual generated columns in the planner
- 1e4351af329f 18.0 cited
On 2/7/2025 03:24, Richard Guo wrote: > On Tue, Jul 1, 2025 at 10:57 PM Andrei Lepikhov <lepihov@gmail.com> wrote: >> I like the general idea of this work. But I wonder, why is a new hash >> table designed to store only the notnullattnums field? From the >> discussion, it is not apparent why not to cache all (or most of) the >> data needed for get_relation_info. In cases where multiple subqueries >> reference the same table, it could save some cycles and memory. > > I think this idea was already thoroughly discussed earlier in this > thread when Robert proposed moving get_relation_info() to an earlier > stage. One reason against it is that not every RTE_RELATION relation > will be actively part of the query. Collecting the whole bundle of > catalog information for such relations is wasteful and can negatively > impact performance. I'm trying to understand the phrase "not every relation ...". Could you clarify that? I know that Postgres can eliminate some self-joins and outer joins, and might determine that a WHERE clause is always false, etc. However, these cases seem to be rare, especially when users refine their queries. Additionally, AFAICS, this is not an issue for partition pruning. Generally, I believe these optimisations should have a positive impact. So, I think "not actively participate" might mean something different. I must say that I appreciate Tom's idea and see significant benefits in making the parse tree a read-only structure. In complex queries, it can be frustrating to make copies of the parse tree, leading to complaints from users about insufficient memory allocation. This is why, in our enterprise fork, we support a specific option to avoid copying the parse tree multiple times. Therefore, it would be better to find a way to refactor the `preprocess_relation_rtes` function to gather table statistics lazily into the hash table when they are needed. For example, we could do this at the moment of creating the `RelOptInfo` or before a subquery pull-up, without modifying the RTE at all. -- regards, Andrei Lepikhov