Re: Reduce "Var IS [NOT] NULL" quals during constant folding

BharatDB <bharatdbpg@gmail.com>

From: BharatDB <bharatdbpg@gmail.com>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Junwang Zhao <zhjwpku@gmail.com>, Nathan Bossart <nathandbossart@gmail.com>, Tomas Vondra <tomas@vondra.me>, Tom Lane <tgl@sss.pgh.pa.us>, Robert Haas <robertmhaas@gmail.com>, Peter Eisentraut <peter@eisentraut.org>, David Rowley <dgrowleyml@gmail.com>, Tender Wang <tndrwang@gmail.com>, Pg Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-09-18T12:46:02Z
Lists: pgsql-hackers

Commits

  1. Fix misuse of Relids for storing attribute numbers

  2. Reduce "Var IS [NOT] NULL" quals during constant folding

  3. Centralize collection of catalog info needed early in the planner

  4. Expand virtual generated columns before sublink pull-up

  5. Expand virtual generated columns in the planner

Dear Team,

This is a follow-up to my previous mail
(CAAh00ETEMEXntw1gxp=xP+4sqrz80tK1R4VEhTpqH9CJpxs-wA) regarding the
optimization in PostgreSQL 18 that simplifies query plans by folding away
Var IS [NOT] NULL checks on columns declared NOT NULL. I experimented with
two approaches, but both ran into significant problems:

*1. PlannerInfo-level hash table (HTAB *rel_notnull_info)*

   - The idea was to collect NOT NULL constraint info early and use it for
   constant folding.
   - gen_node_support.pl cannot handle non-serializable HTAB* fields when
   generating node serialization code, leading to compilation errors (“could
   not handle type HTAB*”).
   - Workarounds (e.g., /* nonserialized */ comments) fail due to comment
   stripping, and marking the whole PlannerInfo with
   pg_node_attr(no_copy_equal, no_read_write) risks breaking features like
   parallel query execution or plan caching.
   - Other limitations include potential ABI stability issues from
   modifying node structs, increased memory usage from hash tables in nodes,
   and the preference for per-relation data structures (e.g., in RelOptInfo)
   over global ones.
   - A global hash table is a viable alternative but complicates subquery
   handling.
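
To make the per-relation alternative concrete, here is a minimal standalone sketch. It is a toy model, not PostgreSQL code: ToyRelInfo, toy_mark_notnull() and toy_attr_is_notnull() are hypothetical names, and the 64-column limit is an assumption made only to keep the example small. The idea is to keep the NOT NULL columns as a bitmap on the relation itself, so that no hash table has to live in a node struct:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy limit so the set fits in one 64-bit word (an assumption of this sketch). */
#define TOY_MAX_ATTRS 64

/* Hypothetical stand-in for per-relation planner data; not a PostgreSQL struct. */
typedef struct ToyRelInfo
{
    int      natts;          /* number of user columns, 1..TOY_MAX_ATTRS */
    uint64_t notnull_attrs;  /* bit (attno - 1) set => column declared NOT NULL */
} ToyRelInfo;

/* Record that column attno (1-based) is declared NOT NULL. */
static void
toy_mark_notnull(ToyRelInfo *rel, int attno)
{
    assert(rel->natts <= TOY_MAX_ATTRS);
    assert(attno >= 1 && attno <= rel->natts);
    rel->notnull_attrs |= (UINT64_C(1) << (attno - 1));
}

/* Look up nullability; anything outside the user-column range is treated as nullable. */
static bool
toy_attr_is_notnull(const ToyRelInfo *rel, int attno)
{
    if (attno < 1 || attno > rel->natts)
        return false;
    return (rel->notnull_attrs & (UINT64_C(1) << (attno - 1))) != 0;
}
```

Because the set belongs to the relation, it is freed with the relation and a subquery's relations carry their own copies, which is the lifetime property a global table would have to manage by hand.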

*2. Planner-level relattrinfo_htab for column nullability*

   - This avoids touching node serialization, but still suffers from
   practical issues.
   - It crashes during initdb because catalog state is unavailable in
   bootstrap mode, requires fragile lifecycle management to avoid memory
   leaks or stale entries, and largely duplicates the existing
   var_is_nonnullable() logic.
   - In practice, it yields minimal performance benefit, since constant
   folding and nullability inference are already largely handled in core.
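
For context, the folding that either lookup structure would feed is small. Below is a toy standalone C model with hypothetical names (it is not planner code). It assumes a column counts as provably non-null only when it is declared NOT NULL and is not nulled by an outer join. That second condition is my reading of the existing behavior, not something either experiment established:

```c
#include <assert.h>
#include <stdbool.h>

/* Which null test the qual performs. */
typedef enum ToyNullTestType { TOY_IS_NULL, TOY_IS_NOT_NULL } ToyNullTestType;

/* Outcome of folding: a constant, or "leave the qual in place". */
typedef enum ToyFoldResult { TOY_FOLD_FALSE, TOY_FOLD_TRUE, TOY_FOLD_KEEP } ToyFoldResult;

/* Hypothetical summary of what is known about a column reference. */
typedef struct ToyVar
{
    bool declared_notnull;      /* column has a NOT NULL constraint */
    bool nulled_by_outer_join;  /* an outer join may still null-extend it */
} ToyVar;

/* Fold "var IS NULL" / "var IS NOT NULL" to a constant when that is provably safe. */
static ToyFoldResult
toy_fold_nulltest(const ToyVar *var, ToyNullTestType type)
{
    assert(var);

    /* Without proof of non-nullability the test must be evaluated at run time. */
    if (!var->declared_notnull || var->nulled_by_outer_join)
        return TOY_FOLD_KEEP;

    /* Provably non-null: IS NULL is constant false, IS NOT NULL is constant true. */
    return (type == TOY_IS_NULL) ? TOY_FOLD_FALSE : TOY_FOLD_TRUE;
}
```

In this model, only the declared_notnull input would come from the catalog, so the main thing a new cache could change is where that one bit is looked up.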

I’d appreciate feedback on whether pursuing either direction makes sense,
or whether improvements should instead focus on extending the existing
var_is_nonnullable() framework.

Sincerely,
Soumya

On Fri, Sep 12, 2025 at 7:51 AM Richard Guo <guofenglinux@gmail.com> wrote:

> On Mon, Sep 8, 2025 at 10:08 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
> > On Mon, Sep 8, 2025 at 4:21 PM Richard Guo <guofenglinux@gmail.com> wrote:
> > > Your patch misses one spot: the notnullattnums in
> > > get_relation_notnullatts() should also be fixed.  Otherwise it LGTM.
>
> > True, attached v2 adds that missing spot, thanks for the review.
>
> Pushed.  Thanks for the report and fix.
>
> - Richard