Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Richard Guo <guofenglinux@gmail.com>
Cc: David Rowley <dgrowleyml@gmail.com>, 邱宇航 <iamqyh@gmail.com>, Bruce Momjian <bruce@momjian.us>, PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2025-10-17T16:18:40Z
Lists: pgsql-hackers

Commits

  1. Fix test case from 40c242830

  2. Fix pushdown of degenerate HAVING clauses

  3. Allow pushdown of HAVING clauses with grouping sets

  4. Mark expressions nullable by grouping sets

Richard Guo <guofenglinux@gmail.com> writes:
> Having heard nothing but crickets and not wanting to leave this until
> the 11th hour before November, I'll plan to push the v1 patch next
> week, unless there are any objections.

I started to look at this again, and now I'm thinking that there is
indeed an issue related to "Query 1".  Recall that the test setup is

	CREATE TABLE t(id int);
	INSERT INTO t SELECT generate_series(1, 3);

If we do

regression=# SELECT id, 'XXX' FROM t GROUP BY ROLLUP (id, 1);

we get this, which seems correct:

 id | ?column? 
----+----------
    | XXX
  3 | XXX
  2 | XXX
  1 | XXX
  3 | XXX
  2 | XXX
  1 | XXX
(7 rows)

But leave out the "id" output, and look what happens:

regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id, 1);
 ?column? 
----------
 
 XXX
 XXX
 XXX
 
 
 
(7 rows)

How can that be correct??  Simplifying further to

regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id);
 ?column? 
----------
 XXX
 XXX
 XXX
 XXX
(4 rows)

restores sanity.  I've not dug into the code, but these two examples
make it look like we think that 'XXX' is dependent on '1', which
surely it is not, most especially since it shouldn't vary depending
on whether "id" is included as an output column.

This behavior is the same as in v17, but that doesn't make it not
broken.
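
For reference, the expectation behind the examples above can be written down as a small model. This is only a sketch of the semantics being assumed here, not of what the planner or executor does: it takes ROLLUP (a, b) to mean the grouping sets (a, b), (a), (), and it assumes the "1" acts as an ordinary constant grouping key (it does not model the case where an integer in GROUP BY is resolved as an output-column position). The names rollup_sets, run_rollup and "one" are hypothetical, made up for illustration.

```python
def rollup_sets(keys):
    """ROLLUP (a, b) is taken to expand to grouping sets (a, b), (a), ()."""
    return [keys[:n] for n in range(len(keys), -1, -1)]


def run_rollup(rows, keys, output):
    """Emit output(group) once per distinct group of each grouping set."""
    result = []
    for gset in rollup_sets(keys):
        seen = []
        for row in rows:
            # Columns outside the current grouping set read as NULL (None).
            key = tuple(row[k] if k in gset else None for k in keys)
            if key not in seen:
                seen.append(key)
        result.extend(output(dict(zip(keys, key))) for key in seen)
    return result


# Test setup from above; "one" stands in for the constant grouping key
# (assumption: it is a plain constant, not an output-column reference).
rows = [{"id": i, "one": 1} for i in range(1, 4)]

# SELECT id, 'XXX' ... versus SELECT 'XXX' ...: the constant output
# expression does not look at the grouping keys at all.
with_id = run_rollup(rows, ["id", "one"], lambda g: (g["id"], "XXX"))
without_id = run_rollup(rows, ["id", "one"], lambda g: ("XXX",))
```

Under those assumptions both variants yield seven rows and every one of them carries 'XXX', since the constant is not computed from any grouping column. The model ignores the empty-input case, which does not matter for this test table.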

			regards, tom lane