Re: BUG #19353: Error XX000 if referencing expanded array in grouping set: variable not found in subplan target list

From: Tender Wang <tndrwang@gmail.com>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, marian.muller@serli.com, pgsql-bugs@lists.postgresql.org
Date: 2025-12-16T11:30:45Z
Lists: pgsql-bugs
On Mon, Dec 15, 2025 at 22:29, Richard Guo <guofenglinux@gmail.com> wrote:

> On Sat, Dec 13, 2025 at 7:02 PM Tender Wang <tndrwang@gmail.com> wrote:
> > On Sat, Dec 13, 2025 at 00:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> PG Bug reporting form <noreply@postgresql.org> writes:
> >> > After upgrading to Postgres 18 I've come across an error I wasn't getting
> >> > beforehand. Here's a minimal way to reproduce the issue, that used to work
> >> > well in Postgres 12 and 17 at least.
>
> >> Thank you for this well-crafted bug report!  Bisecting shows that
> >> it broke at
> >>
> >> f5050f795aea67dfc40bbc429c8934e9439e22e7 is the first bad commit
> >> commit f5050f795aea67dfc40bbc429c8934e9439e22e7 (HEAD)
> >> Author: Richard Guo <rguo@postgresql.org>
> >> Date:   Tue Sep 10 12:36:48 2024 +0900
> >>
> >>     Mark expressions nullable by grouping sets
>
> > When there is an SRF in the query, the grouping_target changes in
> > grouping_planner when entering the following function:
> > split_pathtarget_at_srfs(root, grouping_target, scanjoin_target,
> >                          &grouping_targets,
> >                          &grouping_targets_contain_srfs);
>
> Yeah, the issue happens here.  In split_pathtarget_at_srfs(), if we
> find a subexpression that matches an expression already computed in
> the previous plan level, we should treat it as a Var and should not
> split it further.  setrefs.c will later replace the expression with a
> Var referencing the subplan output.
>
> However, when processing the grouping target for grouping sets, this
> function can fail to recognize that an expression is already computed
> in the scan/join phase.  This happens because the comparison crosses
> the grouping boundary: expressions in the grouping target may carry
> the grouping nulling bit, while the corresponding expressions in the
> scan/join target do not.
>

Yes, the nulling bits carried by expressions in the grouping target make
equal(var_grouping_target, var_in_scan_join_target) return false.
Those grouping nulling bits should be ignored when the comparison crosses
the grouping boundary.


> This mismatch leads this function to incorrectly assume that the
> expression (e.g., unnest(markets)) needs to be re-evaluated from its
> arguments, which are often unavailable in the subplan.
>
> To fix, I think we should ignore the grouping nulling bit when
> checking if an expression from the grouping target is available in the
> pre-grouping input target.  This is actually what we do in setrefs.c.
>
> Hence, attached patch.
>

The patch works for me, and all regression tests pass with it applied.

>
> (I'm currently on vacation and will take a closer look when I return.
> Anyone who wants to move this forward before then is welcome.)
>

Enjoy your vacation!


-- 
Thanks,
Tender Wang