Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Fix test case from 40c242830

  2. Fix pushdown of degenerate HAVING clauses

  3. Allow pushdown of HAVING clauses with grouping sets

  4. Mark expressions nullable by grouping sets

  1. Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    邱宇航 <iamqyh@gmail.com> — 2025-09-23T03:48:58Z

    I've noticed that two GROUP BY ROLLUP queries behave differently in v17
    compared to master and REL_18_STABLE. The issue can be reproduced by
    following SQL:
    
    ``` SQL
    CREATE TABLE t(id int);
    
    INSERT INTO t SELECT generate_series(1, 3);
    
    -- Query 1
    SELECT DISTINCT 'XXX'
    FROM t
    GROUP BY ROLLUP (id, 1);
    
    -- Query 2
    SELECT 'XXX'
    FROM t
    GROUP BY ROLLUP(id)
    HAVING NOT (NULL IS NULL);
    ```
    
    After some git bisect work, I traced the root cause:
    - The first issue was introduced by commit f5050f79 (Mark expressions
    nullable by grouping sets).
    - The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
    clauses with grouping sets).
    
    Best regards,
    Yuhang Qiu
    
    
    
    
    
  2. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    David Rowley <dgrowleyml@gmail.com> — 2025-09-23T07:38:14Z

    On Tue, 23 Sept 2025 at 15:49, 邱宇航 <iamqyh@gmail.com> wrote:
    > I've noticed that two GROUP BY ROLLUP queries behave differently in v17
    > compared to master and REL_18_STABLE. The issue can be reproduced by
    > following SQL:
    >
    > After some git bisect work, I traced the root cause:
    > - The first issue was introduced by commit f5050f79 (Mark expressions
    > nullable by grouping sets).
    > - The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
    > clauses with grouping sets).
    
    If you check the release notes and the commit message for f5050f795
    you'll see that it does mention that wrong results could be returned.
    
    What wasn't mentioned was that this wasn't fixed in prior versions.
    The reason being is that the fix required changing the query tree
    representation, which we can't change in the back branches due to
    incompatibility with stored rules in existing databases. So, a change
    in query results for certain queries here is expected.
    
    David
    
    
    
    
  3. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Bruce Momjian <bruce@momjian.us> — 2025-09-23T11:58:56Z

    On Tue, Sep 23, 2025 at 07:38:14PM +1200, David Rowley wrote:
    > On Tue, 23 Sept 2025 at 15:49, 邱宇航 <iamqyh@gmail.com> wrote:
    > > I've noticed that two GROUP BY ROLLUP queries behave differently in v17
    > > compared to master and REL_18_STABLE. The issue can be reproduced by
    > > following SQL:
    > >
    > > After some git bisect work, I traced the root cause:
    > > - The first issue was introduced by commit f5050f79 (Mark expressions
    > > nullable by grouping sets).
    > > - The second issue stems from commit 67a54b9e (Allow pushdown of HAVING
    > > clauses with grouping sets).
    > 
    > If you check the release notes and the commit message for f5050f795
    > you'll see that it does mention that wrong results could be returned.
    > 
    > What wasn't mentioned was that this wasn't fixed in prior versions.
    > The reason being is that the fix required changing the query tree
    > representation, which we can't change in the back branches due to
    > incompatibility with stored rules in existing databases. So, a change
    > in query results for certain queries here is expected.
    
    Uh, by design, items mentioned in the major release notes have _not_
    been fixed in previous minor versions.  Not sure if we can make that
    clearer to users.  I did write a blog about this:
    
    	https://momjian.us/main/blogs/pgblog/2022.html#June_13_2022
    
    -- 
      Bruce Momjian  <bruce@momjian.us>        https://momjian.us
      EDB                                      https://enterprisedb.com
    
      Do not let urgent matters crowd out time for investment in the future.
    
    
    
    
  4. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    邱宇航 <iamqyh@gmail.com> — 2025-09-23T12:26:40Z

    > 2025年9月23日 15:38,David Rowley <dgrowleyml@gmail.com> 写道:
    > 
    > If you check the release notes and the commit message for f5050f795
    > you'll see that it does mention that wrong results could be returned.
    > 
    > What wasn't mentioned was that this wasn't fixed in prior versions.
    > The reason being is that the fix required changing the query tree
    > representation, which we can't change in the back branches due to
    > incompatibility with stored rules in existing databases. So, a change
    > in query results for certain queries here is expected.
    
    Yeah, I got it. Thanks David and Bruce.
    
    And what about the query 2. This is caused by another commit, and
    it's not mentioned in the commit message or the mailing discussion.
    
    Best regards,
    Yuhang Qiu
    
    
    
    
    
  5. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-23T14:39:02Z

    =?utf-8?B?6YKx5a6H6Iiq?= <iamqyh@gmail.com> writes:
    > And what about the query 2. This is caused by another commit, and
    > it's not mentioned in the commit message or the mailing discussion.
    
    That one indeed seems quite broken.  EXPLAIN confirms that it's
    pushing the HAVING below the aggregation, which is simply wrong
    because it fails to filter the all-null row(s) that the aggregation
    node will create out of thin air.
    
    Is there anything we can salvage from 67a54b9e, or should
    we just revert it?
    
    			regards, tom lane
    
    
    
    
  6. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    David Rowley <dgrowleyml@gmail.com> — 2025-09-24T00:10:00Z

    On Wed, 24 Sept 2025 at 02:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Is there anything we can salvage from 67a54b9e, or should
    > we just revert it?
    
    It doesn't seem great that we need to reconsider the safety of that
    optimisation post-release. It's not as if 67a54b9e added several cases
    to test for and got one of them wrong. It's a case of the 1 case it
    did add wasn't fully thought through.
    
    As for fixing it; testing for a Const-false havingClause can't be done
    as that only works for this case which const-folding happens to figure
    out during planning. You could equally have something Var-less like:
    
    create or replace function one() returns int as $$ begin return 1;
    end; $$ stable language plpgsql;
    SELECT 'XXX',count(*) FROM t GROUP BY ROLLUP(id) HAVING NOT one()=1;
    
    which isn't known at plan-time. For me, I can't see how to make this
    safe and I suspect due to your above question that you're in a similar
    situation. Reverting and reconsidering for v19 seems like the safest
    option.
    
    Let's see what Richard thinks.
    
    David
    
    
    
    
  7. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-24T01:23:08Z

    David Rowley <dgrowleyml@gmail.com> writes:
    > As for fixing it; testing for a Const-false havingClause can't be done
    > as that only works for this case which const-folding happens to figure
    > out during planning. You could equally have something Var-less like:
    
    Yeah, I don't think the havingClause being constant-false is the
    key point here.  I've not looked closely at 67a54b9e, but I suspect
    that anything Var-free is potentially an issue.  Or it might even
    fail for something that has Vars but is non-strict.
    
    The core of the problem though is that the aggregation node will
    issue an all-nulls row that did not come from its input, and we
    have to be sure that the HAVING clause is properly evaluated
    for that row.
    
    			regards, tom lane
    
    
    
    
  8. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-09-24T03:10:08Z

    On Wed, Sep 24, 2025 at 10:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Yeah, I don't think the havingClause being constant-false is the
    > key point here.  I've not looked closely at 67a54b9e, but I suspect
    > that anything Var-free is potentially an issue.  Or it might even
    > fail for something that has Vars but is non-strict.
    >
    > The core of the problem though is that the aggregation node will
    > issue an all-nulls row that did not come from its input, and we
    > have to be sure that the HAVING clause is properly evaluated
    > for that row.
    
    I think the issue is that an empty grouping set can produce a row out
    of nowhere, and pushing down HAVING clauses in this case may cause us
    to fail to filter out that row.
    
    Currently, non-variable-free HAVING clauses aren't pushed down when
    there is an empty grouping set, because the empty grouping set would
    nullify the vars referenced in those clauses, and we have logic in
    place to prevent their pushdown.
    
    However, we (I) overlooked variable-free HAVING clauses.  If there are
    only empty grouping sets, this is not a problem since a copy of the
    HAVING clauses is retained in HAVING.  The issue arises when there are
    both nonempty and empty grouping sets.
    
    In summary, the problem occurs when both of the following conditions
    are met: 1) there are both nonempty and empty grouping sets, 2) there
    are variable-free HAVING clauses.
    
    I think the following change fixes this problem.
    
        foreach(l, (List *) parse->havingQual)
        {
            Node       *havingclause = (Node *) lfirst(l);
    +       Relids     having_relids;
    
            if (contain_agg_clause(havingclause) ||
                contain_volatile_functions(havingclause) ||
                contain_subplans(havingclause) ||
                (parse->groupClause && parse->groupingSets &&
    -            bms_is_member(root->group_rtindex, pull_varnos(root,
    havingclause))))
    +            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
    +            bms_is_member(root->group_rtindex, having_relids))))
            {
                /* keep it in HAVING */
                newHaving = lappend(newHaving, havingclause);
    
    This change essentially prevents variable-free HAVING clauses from
    being pushed down when there are any nonempty grouping sets.  Of
    course, this could be done more precisely -- for example, we could
    restrict the prevention only to cases where empty grouping sets are
    also present, but I'm not sure it's worth the effort.
    
    - Richard
    
    
    
    
  9. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-09-24T08:18:35Z

    On Wed, Sep 24, 2025 at 12:10 PM Richard Guo <guofenglinux@gmail.com> wrote:
    > In summary, the problem occurs when both of the following conditions
    > are met: 1) there are both nonempty and empty grouping sets, 2) there
    > are variable-free HAVING clauses.
    >
    > I think the following change fixes this problem.
    >
    >     foreach(l, (List *) parse->havingQual)
    >     {
    >         Node       *havingclause = (Node *) lfirst(l);
    > +       Relids     having_relids;
    >
    >         if (contain_agg_clause(havingclause) ||
    >             contain_volatile_functions(havingclause) ||
    >             contain_subplans(havingclause) ||
    >             (parse->groupClause && parse->groupingSets &&
    > -            bms_is_member(root->group_rtindex, pull_varnos(root,
    > havingclause))))
    > +            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
    > +            bms_is_member(root->group_rtindex, having_relids))))
    >         {
    >             /* keep it in HAVING */
    >             newHaving = lappend(newHaving, havingclause);
    >
    > This change essentially prevents variable-free HAVING clauses from
    > being pushed down when there are any nonempty grouping sets.  Of
    > course, this could be done more precisely -- for example, we could
    > restrict the prevention only to cases where empty grouping sets are
    > also present, but I'm not sure it's worth the effort.
    
    Here is the patch.
    
    - Richard
    
  10. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    wenhui qiu <qiuwenhuifx@gmail.com> — 2025-09-24T09:19:41Z

    Hi Richard
     > I think the following change fixes this problem.
    >
    >     foreach(l, (List *) parse->havingQual)
    >     {
    >         Node       *havingclause = (Node *) lfirst(l);
    > +       Relids     having_relids;
    >
    >         if (contain_agg_clause(havingclause) ||
    >             contain_volatile_functions(havingclause) ||
    >             contain_subplans(havingclause) ||
    >             (parse->groupClause && parse->groupingSets &&
    > -            bms_is_member(root->group_rtindex, pull_varnos(root,
    > havingclause))))
    > +            ((having_relids = pull_varnos(root, havingclause)) == NULL ||
    > +            bms_is_member(root->group_rtindex, having_relids))))
    >         {
    >             /* keep it in HAVING */
    >             newHaving = lappend(newHaving, havingclause);
    >
    > This change essentially prevents variable-free HAVING clauses from
    > being pushed down when there are any nonempty grouping sets.  Of
    > course, this could be done more precisely -- for example, we could
    > restrict the prevention only to cases where empty grouping sets are
    > also present, but I'm not sure it's worth the effort.
    I think it makes sense. However, it is now too close to the release date of
    v18.Awaiting Tom's feedback?
    
    
    Thanks
    
    On Wed, Sep 24, 2025 at 4:18 PM Richard Guo <guofenglinux@gmail.com> wrote:
    
    > On Wed, Sep 24, 2025 at 12:10 PM Richard Guo <guofenglinux@gmail.com>
    > wrote:
    > > In summary, the problem occurs when both of the following conditions
    > > are met: 1) there are both nonempty and empty grouping sets, 2) there
    > > are variable-free HAVING clauses.
    > >
    > > I think the following change fixes this problem.
    > >
    > >     foreach(l, (List *) parse->havingQual)
    > >     {
    > >         Node       *havingclause = (Node *) lfirst(l);
    > > +       Relids     having_relids;
    > >
    > >         if (contain_agg_clause(havingclause) ||
    > >             contain_volatile_functions(havingclause) ||
    > >             contain_subplans(havingclause) ||
    > >             (parse->groupClause && parse->groupingSets &&
    > > -            bms_is_member(root->group_rtindex, pull_varnos(root,
    > > havingclause))))
    > > +            ((having_relids = pull_varnos(root, havingclause)) == NULL
    > ||
    > > +            bms_is_member(root->group_rtindex, having_relids))))
    > >         {
    > >             /* keep it in HAVING */
    > >             newHaving = lappend(newHaving, havingclause);
    > >
    > > This change essentially prevents variable-free HAVING clauses from
    > > being pushed down when there are any nonempty grouping sets.  Of
    > > course, this could be done more precisely -- for example, we could
    > > restrict the prevention only to cases where empty grouping sets are
    > > also present, but I'm not sure it's worth the effort.
    >
    > Here is the patch.
    >
    > - Richard
    >
    
  11. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-09-25T01:01:22Z

    On Wed, Sep 24, 2025 at 5:18 PM Richard Guo <guofenglinux@gmail.com> wrote:
    > Here is the patch.
    
    I plan to push this patch soon, unless there are any objections.
    
    - Richard
    
    
    
    
  12. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    David Rowley <dgrowleyml@gmail.com> — 2025-09-25T02:05:18Z

    On Thu, 25 Sept 2025 at 13:01, Richard Guo <guofenglinux@gmail.com> wrote:
    > I plan to push this patch soon, unless there are any objections.
    
    What's your confidence levels on the logic now being correct? 100%?
    90%? Hopeful?
    
    I've not personally had the time to process the patch and figure out
    that this is now safe. It sounds like you have, but (with respect) I
    expect you also thought that before the initial commit too. I realise
    that you now have more of the picture, but how do we know the picture
    is now complete? I think we all know this stuff is hard and we also
    know that even the most seasoned of planner hackers don't always get
    it right first time.
    
    What I'm now wondering is the risk to reward ratio of fixing this in
    18.1. If it happens that it's still not right for 18.1, then we need
    to wait until 18.2, which is currently due Feb-26. I don't quite have
    the same confidence levels as you do, but maybe I would if I took the
    time to more carefully think about this.  For now, my thoughts are
    that it's safer to revert and try again for v19.  Probably it would
    make more sense to try harder for an 18.1 fix if this optimisation was
    a more critical and people had been waiting on it and there was no
    other way to obtain the benefits of it. But that's not quite the case
    here as, in theory, someone could rewrite their query if it's safe for
    their use case and end up with the same performance benefits.
    
    Just so it's clear, I'm not objecting to fixing for 18.1. I just want
    to ensure your judgment is for the project and not for
    self-preservation.  I think most people will respect the decision if
    you decide that you need more time to consider this and opt to revert
    in v18 and fix only in master. Based on my current understanding of
    the optimisation, I'd certainly be more at ease with that.
    
    On the other hand, if you're 100% confident, I'll step aside.
    
    David
    
    
    
    
  13. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-09-25T03:24:07Z

    David Rowley <dgrowleyml@gmail.com> writes:
    > On Thu, 25 Sept 2025 at 13:01, Richard Guo <guofenglinux@gmail.com> wrote:
    >> I plan to push this patch soon, unless there are any objections.
    
    > What's your confidence levels on the logic now being correct? 100%?
    > 90%? Hopeful?
    
    FWIW, my confidence in it is rather low.  I've not had time to think
    this through carefully, but it seems to me that the test ought to
    involve whether there is an empty grouping set, yet the proposed
    patch does no such thing --- or at least, if it manages to achieve
    that effect, it's not obvious how.
    
    18.1 will not be coming out till November, so I feel no need to
    rush to judgment on what to do here.
    
    			regards, tom lane
    
    
    
    
  14. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-09-25T07:55:57Z

    On Thu, Sep 25, 2025 at 11:05 AM David Rowley <dgrowleyml@gmail.com> wrote:
    > What's your confidence levels on the logic now being correct? 100%?
    > 90%? Hopeful?
    >
    > I've not personally had the time to process the patch and figure out
    > that this is now safe. It sounds like you have, but (with respect) I
    > expect you also thought that before the initial commit too. I realise
    > that you now have more of the picture, but how do we know the picture
    > is now complete? I think we all know this stuff is hard and we also
    > know that even the most seasoned of planner hackers don't always get
    > it right first time.
    >
    > What I'm now wondering is the risk to reward ratio of fixing this in
    > 18.1. If it happens that it's still not right for 18.1, then we need
    > to wait until 18.2, which is currently due Feb-26. I don't quite have
    > the same confidence levels as you do, but maybe I would if I took the
    > time to more carefully think about this.  For now, my thoughts are
    > that it's safer to revert and try again for v19.  Probably it would
    > make more sense to try harder for an 18.1 fix if this optimisation was
    > a more critical and people had been waiting on it and there was no
    > other way to obtain the benefits of it. But that's not quite the case
    > here as, in theory, someone could rewrite their query if it's safe for
    > their use case and end up with the same performance benefits.
    >
    > Just so it's clear, I'm not objecting to fixing for 18.1. I just want
    > to ensure your judgment is for the project and not for
    > self-preservation.  I think most people will respect the decision if
    > you decide that you need more time to consider this and opt to revert
    > in v18 and fix only in master. Based on my current understanding of
    > the optimisation, I'd certainly be more at ease with that.
    >
    > On the other hand, if you're 100% confident, I'll step aside.
    
    Thanks for the input; that's a reasonable concern.  Although I've
    reviewed my analysis again and didn't find any new issues, I don't
    think I would claim to be 100% confident that the current logic is
    bug-free.  Realistically, I doubt anyone would make such a claim.
    
    I'm completely open to reverting this and revisiting it for v19.
    However, I'd really appreciate any reviews that point out specific
    issues in the current logic.  I've shared my analysis in my initial
    email as well as in the commit message of the proposed patch.  If
    anyone can highlight parts that don't make sense, we can discuss them
    and update the patch accordingly.  Without that kind of feedback, I'm
    concerned that we may just end up repeating the same code in v19.
    
    - Richard
    
    
    
    
  15. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-09-25T08:27:57Z

    On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > FWIW, my confidence in it is rather low.  I've not had time to think
    > this through carefully, but it seems to me that the test ought to
    > involve whether there is an empty grouping set, yet the proposed
    > patch does no such thing --- or at least, if it manages to achieve
    > that effect, it's not obvious how.
    
    I explained the test for empty grouping sets in my initial email as
    well as in the commit message of the proposed patch.  Apparently, I
    failed to make myself clear, so here's another attempt.
    
    To be clear, we are only discussing HAVING clauses that do not contain
    aggregates, since clauses with aggregates wouldn't be pushed down
    anyway.
    
    First, let's consider the presence of empty grouping sets.  There are
    two cases: either the query contains only empty grouping sets, or it
    contains both non-empty and empty grouping sets.
    
    In the former case (only empty grouping sets), the HAVING clauses must
    be degenerate (variable-free).  These are placed in the WHERE clause
    and also retained in HAVING.  This follows the same logic as prior to
    commit 67a54b9e8.
    
    In the latter case (both non-empty and empty grouping sets), for
    non-degenerate HAVING clauses, the empty grouping sets would nullify
    the vars referenced in the clauses, so they would not be pushed
    down.  This is ensured by the check bms_is_member(root->group_rtindex,
    having_relids).  For degenerate HAVING clauses, they are also
    not pushed down, guaranteed by the new check added in the proposed
    patch: having_relids == NULL.
    
    Next, suppose there are no empty grouping sets in the query.  For
    non-degenerate HAVING clauses, only those clauses that do not
    reference columns nullable by the grouping sets are pushed down, which
    is guaranteed by the same check: bms_is_member(root->group_rtindex,
    having_relids).  For degenerate HAVING clauses, the check
    having_relids == NULL prevents pushing them down, though it could be
    argued that in this case, they could be pushed down.  This is what I
    meant by (quoting from the commit message):
    
    "
    This could be done more precisely, for example by restricting the
    prevention only to cases where empty grouping sets are also present,
    but the added complexity does not seem justified.
    "
    
    In summary, under the current logic, HAVING clauses in queries with
    grouping sets are pushed down only in the following two cases:
    
    1) There are only empty grouping sets.
    
    In this case, the HAVING clauses are pushed down, but they are also
    retained in HAVING.
    
    2) There are non-empty grouping sets and the HAVING clauses are
    non-degenerate and they do not reference any columns nullable by the
    grouping sets.
    
    I believe both of the above cases are safe for pushing down HAVING
    clauses.  It could be argued that there may be more cases where we can
    push down HAVING clauses with grouping sets, but for now, I'd prefer
    to focusing on ensuring the current logic is safe.
    
    Does this analysis make sense to you?  Am I missing anything?
    
    > 18.1 will not be coming out till November, so I feel no need to
    > rush to judgment on what to do here.
    
    Thanks.  I'll wait for any feedback on the patch itself before
    deciding how to proceed.  I'd appreciate it if a review could be done
    before November.
    
    - Richard
    
    
    
    
  16. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-16T09:13:35Z

    On Thu, Sep 25, 2025 at 5:27 PM Richard Guo <guofenglinux@gmail.com> wrote:
    > On Thu, Sep 25, 2025 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > > 18.1 will not be coming out till November, so I feel no need to
    > > rush to judgment on what to do here.
    
    > Thanks.  I'll wait for any feedback on the patch itself before
    > deciding how to proceed.  I'd appreciate it if a review could be done
    > before November.
    
    Having heard nothing but crickets and not wanting to leave this until
    the 11th hour before November, I'll plan to push the v1 patch next
    week, unless there are any objections.
    
    - Richard
    
    
    
    
  17. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-17T16:18:40Z

    Richard Guo <guofenglinux@gmail.com> writes:
    > Having heard nothing but crickets and not wanting to leave this until
    > the 11th hour before November, I'll plan to push the v1 patch next
    > week, unless there are any objections.
    
    I started to look at this again, and now I'm thinking that there is
    indeed an issue related to "Query 1".  Recall that the test setup is
    
    	CREATE TABLE t(id int);
    	INSERT INTO t SELECT generate_series(1, 3);
    
    If we do
    
    regression=# SELECT id, 'XXX' FROM t GROUP BY ROLLUP (id, 1);
    
    we get this, which seems correct:
    
     id | ?column? 
    ----+----------
        | XXX
      3 | XXX
      2 | XXX
      1 | XXX
      3 | XXX
      2 | XXX
      1 | XXX
    (7 rows)
    
    But leave out the "id" output, and look what happens:
    
    regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id, 1);
     ?column? 
    ----------
     
     XXX
     XXX
     XXX
     
     
     
    (7 rows)
    
    How can that be correct??  Simplifying further to
    
    regression=# SELECT 'XXX' FROM t GROUP BY ROLLUP (id);
     ?column? 
    ----------
     XXX
     XXX
     XXX
     XXX
    (4 rows)
    
    restores sanity.  I've not dug into the code, but these two examples
    make it look like we think that 'XXX' is dependent on '1', which
    surely it is not, most especially since it shouldn't vary depending
    on whether "id" is included as an output column.
    
    This behavior is the same as in v17, but that doesn't make it not
    broken.
    
    			regards, tom lane
    
    
    
    
  18. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-17T16:40:41Z

    I wrote:
    > I started to look at this again, and now I'm thinking that there is
    > indeed an issue related to "Query 1".
    
    Oh, scratch that, I see my mistake: I was thinking of "1" as a
    constant, but actually we must be interpreting it per SQL92 rules
    as a reference to output column 1.  So that's why adding or dropping
    the "id" output column changes the results of the grouping.
    
    -ENOCAFFEINE ... sorry for the noise.  I'll go back to studying
    the HAVING issue.
    
    			regards, tom lane
    
    
    
    
  19. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-17T18:41:06Z

    Richard Guo <guofenglinux@gmail.com> writes:
    > Having heard nothing but crickets and not wanting to leave this until
    > the 11th hour before November, I'll plan to push the v1 patch next
    > week, unless there are any objections.
    
    OK, I finally made some time to think this through, and here's
    my thoughts:
    
    The definition of HAVING is that it excludes grouped rows that don't
    satisfy the HAVING condition.  The idea behind the push-down-HAVING
    optimization is that if we exclude all source rows that don't satisfy
    the condition, then no groups that don't satisfy it will be formed, so
    we don't have to re-check the HAVING condition after forming groups.
    
    Grouping sets break this concept, which is why the pre-67a54b9e
    planner code just punted on the optimization if there were grouping
    sets.  There are two reasons why they cause trouble:
    
    1. If a variable doesn't appear in every grouping set, then it will
    sometimes go to NULL during grouping, meaning that the HAVING clause
    needs to test a different value than what is available at the scan
    level.  We did not have a good way to deal with that pre-v18, but now
    we can check to see whether the clause contains any variables marked
    as nullable by group_rtindex; if not, it should be safe to push down.
    
    2. If there is an empty grouping set, it will give rise to a grouped
    row whether there is any input or not.  In this case we *must* keep
    the HAVING condition at the top level, else it fails to filter that
    row.  (We might still choose to copy it to WHERE, however.)  Note
    that whether the HAVING condition is degenerate (variable-free) or
    not does not enter into this directly; although if it has any Vars
    they are certainly all nullable by group_rtindex, meaning that
    checking for #1 accidentally keeps us out of trouble unless the
    condition is degenerate.
    
    67a54b9e accounted for #1 but not #2, which is why we have a bug.
    
    The proposed patch tries to close the hole by checking whether
    the condition is degenerate, but that feels subtly wrong to me:
    what we ought to check is whether there is any empty grouping set.
    As proposed, I think we miss optimization opportunities for
    degenerate HAVING because we will not try the trick of copying
    it to WHERE.
    
    Also, I suspect that this change in 67a54b9e was wrong in detail:
    
                newHaving = lappend(newHaving, havingclause);
            }
    -       else if (parse->groupClause && !parse->groupingSets)
    +       else if (parse->groupClause)
            {
                Node       *whereclause;
    
    because it allows us to apply "move to WHERE" rather than "copy to
    WHERE" in cases with grouping sets, and that can't always be right.
    I think that if that change hadn't been made, we would not have
    had a bug.  So the fact that the proposed fix doesn't touch this
    bit does not seem right.
    
    			regards, tom lane
    
    
    
    
  20. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-17T21:14:02Z

    I wrote:
    > The proposed patch tries to close the hole by checking whether
    > the condition is degenerate, but that feels subtly wrong to me:
    > what we ought to check is whether there is any empty grouping set.
    > As proposed, I think we miss optimization opportunities for
    > degenerate HAVING because we will not try the trick of copying
    > it to WHERE.
    
    Concretely, I think we could do the attached.  This has the same
    test query as in v1, but the generated plan is better because it
    realizes it can copy the constant-false HAVING clause into WHERE,
    resulting in a dummy scan of the table.
    
    I'm not sure if planner.c is the best place to put
    has_empty_grouping_set().  I couldn't find any existing code doing the
    same thing, but maybe someday we'd want the functionality elsewhere.
    
    			regards, tom lane
    
    
  21. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-20T13:32:26Z

    On Sat, Oct 18, 2025 at 6:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > I wrote:
    > > The proposed patch tries to close the hole by checking whether
    > > the condition is degenerate, but that feels subtly wrong to me:
    > > what we ought to check is whether there is any empty grouping set.
    > > As proposed, I think we miss optimization opportunities for
    > > degenerate HAVING because we will not try the trick of copying
    > > it to WHERE.
    
    > Concretely, I think we could do the attached.  This has the same
    > test query as in v1, but the generated plan is better because it
    > realizes it can copy the constant-false HAVING clause into WHERE,
    > resulting in a dummy scan of the table.
    
    Thanks for the patch.  Yeah, v2 generates better plans for queries
    with degenerate HAVING clauses, as it can move or copy such clauses
    into WHERE.
    
    The idea of 67a54b9e8 is to leverage the presence of group_rtindex in
    the HAVING clause to determine whether it references any columns that
    are not present in all grouping sets.  If it doesn't, then it's safe
    to push the clause down.  This logic was also intended to cope with
    empty grouping sets, since an empty grouping set would nullify any
    Vars referenced in the clause.  However, the loose end is that
    degenerate clauses are not subject to this check, as they lack any
    Vars and therefore cannot be marked with group_rtindex.
    
    v1 patch tries to close this gap by detecting degenerate HAVING
    clauses and preventing them from being pushed down.  While this may
    cause us to miss the optimization for degenerate clauses, I felt it
    was acceptable, because degenerate HAVING clauses don't seem to be
    common in practice, and detecting the presence of empty grouping sets
    doesn't seem a trivial task.
    
    I took a look at has_empty_grouping_set() in v2 patch, but I'm afraid
    it's not thorough enough.  I don't think it's correct to return true
    once we find an EMPTY, ROLLUP, or CUBE GroupingSet.  I think we need
    to compute the Cartesian product across different GroupingSets, just
    like what expand_grouping_sets() does.  With the v2 patch, I can see
    some bogus plans; for example:
    
    create table t (a int, b int);
    
    explain (costs off) select count(*) from t group by rollup(a), b having b > 1;
           QUERY PLAN
    -------------------------
     HashAggregate
       Hash Key: b, a
       Hash Key: b
       Filter: (b > 1)
       ->  Seq Scan on t
             Filter: (b > 1)
    (6 rows)
    
    has_empty_grouping_set() thinks there is an empty grouping set in this
    query, so a copy of the HAVING clause is retained in HAVING, which
    does not seem correct.
    
    Rather than extending has_empty_grouping_set() to compute the
    Cartesian product across different GroupingSets, I wonder if we could
    move the call to expand_grouping_sets() from
    preprocess_grouping_sets() to just before the push-down-HAVING
    optimization.  This way, we could leverage the expanded flat list of
    grouping sets to accurately determine whether any empty grouping sets
    exist.
    
    I did a quick search and didn't find any code between the
    push-down-HAVING optimization and preprocess_grouping_sets() that
    relies on parse->groupingSets being a list of GroupingSet's, so this
    should be safe.
    
    Hence, attached v3 patch.
    
    - Richard
    
  22. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-20T14:15:44Z

    Richard Guo <guofenglinux@gmail.com> writes:
    > On Sat, Oct 18, 2025 at 6:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >>> The proposed patch tries to close the hole by checking whether
    >>> the condition is degenerate, but that feels subtly wrong to me:
    >>> what we ought to check is whether there is any empty grouping set.
    
    > v1 patch tries to close this gap by detecting degenerate HAVING
    > clauses and preventing them from being pushed down.  While this may
    > cause us to miss the optimization for degenerate clauses, I felt it
    > was acceptable, because degenerate HAVING clauses don't seem to be
    > common in practice, and detecting the presence of empty grouping sets
    > doesn't seem a trivial task.
    
    I actually agree with you that it's not that critical whether we fully
    optimize degenerate HAVING clauses in every case.  (Although it might
    be bad if we miss any cases that previous versions handled.)  What
    bothered me about v1 was that it seemed to be operating from a
    conceptually wrong model, which perhaps could lead to bugs in future.
    
    > I took a look at has_empty_grouping_set() in v2 patch, but I'm afraid
    > it's not thorough enough.  I don't think it's correct to return true
    > once we find an EMPTY, ROLLUP, or CUBE GroupingSet.  I think we need
    > to compute the Cartesian product across different GroupingSets, just
    > like what expand_grouping_sets() does.
    
    Oh, good point.  We could probably make it handle this with a little
    more complication, or decide that it's okay if we fail to optimize
    some cases --- but if we're going to do expand_grouping_sets() anyway,
    we might as well rely on that.
    
    > Rather than extending has_empty_grouping_set() to compute the
    > Cartesian product across different GroupingSets, I wonder if we could
    > move the call to expand_grouping_sets() from
    > preprocess_grouping_sets() to just before the push-down-HAVING
    > optimization.  This way, we could leverage the expanded flat list of
    > grouping sets to accurately determine whether any empty grouping sets
    > exist.
    
    I like the concept here, but not so much the details.  Pulling
    expand_grouping_sets out of preprocess_grouping_sets feels weird.
    I guess it's all right given that preprocess_grouping_sets doesn't
    manipulate the parse tree otherwise, but you didn't update the comment
    for preprocess_grouping_sets.  Also it might be a good idea to have a
    test case demonstrating why v2 wasn't good enough.
    
    > I did a quick search and didn't find any code between the
    > push-down-HAVING optimization and preprocess_grouping_sets() that
    > relies on parse->groupingSets being a list of GroupingSet's, so this
    > should be safe.
    
    I think it's actually quite awful that we replace parse->groupingSets
    with data of a totally different representation.  It would be better
    to store the result of expand_grouping_sets into some new PlannerInfo
    field, say root->preprocessed_groupingSets.  But we can't make a
    change like that in v18 now, so this is just something for future
    cleanup.
    
    			regards, tom lane
    
    
    
    
  23. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-20T16:27:58Z

    I wrote:
    > I like the concept here, but not so much the details.  Pulling
    > expand_grouping_sets out of preprocess_grouping_sets feels weird.
    > I guess it's all right given that preprocess_grouping_sets doesn't
    > manipulate the parse tree otherwise, but you didn't update the comment
    > for preprocess_grouping_sets.  Also it might be a good idea to have a
    > test case demonstrating why v2 wasn't good enough.
    
    Here's a v4 with some concrete suggestions for comment changes,
    plus the extra test case.  (I did some tiny cosmetic fooling with
    preprocess_grouping_sets too, because the order of its initial
    operations seemed quite weird to me.)
    
    			regards, tom lane
    
    
  24. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-21T03:16:38Z

    On Tue, Oct 21, 2025 at 1:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > I wrote:
    > > I like the concept here, but not so much the details.  Pulling
    > > expand_grouping_sets out of preprocess_grouping_sets feels weird.
    > > I guess it's all right given that preprocess_grouping_sets doesn't
    > > manipulate the parse tree otherwise, but you didn't update the comment
    > > for preprocess_grouping_sets.  Also it might be a good idea to have a
    > > test case demonstrating why v2 wasn't good enough.
    
    > Here's a v4 with some concrete suggestions for comment changes,
    > plus the extra test case.  (I did some tiny cosmetic fooling with
    > preprocess_grouping_sets too, because the order of its initial
    > operations seemed quite weird to me.)
    
    Yeah, I noticed the out-of-date comment for preprocess_grouping_sets
    shortly after sending out the v3 patch, but I was too tired last night
    to update and resend it.  Thanks for the suggestions in v4 -- all the
    changes look good to me.
    
    Regarding the tests, I think we could add another test query to cover
    the case with no empty grouping sets and degenerate HAVING clauses.
    This way, all cases for the HAVING pushdown optimization with grouping
    sets should be covered.
    
    I've added such a test in v5, along with a commit message.  Nothing
    else has changed.  I'll push this patch soon, barring any objections.
    
    - Richard
    
  25. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-21T03:26:08Z

    Richard Guo <guofenglinux@gmail.com> writes:
    > Regarding the tests, I think we could add another test query to cover
    > the case with no empty grouping sets and degenerate HAVING clauses.
    > This way, all cases for the HAVING pushdown optimization with grouping
    > sets should be covered.
    
    > I've added such a test in v5, along with a commit message.  Nothing
    > else has changed.  I'll push this patch soon, barring any objections.
    
    v5 LGTM.
    
    			regards, tom lane
    
    
    
    
  26. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-21T03:58:53Z

    On Tue, Oct 21, 2025 at 12:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Richard Guo <guofenglinux@gmail.com> writes:
    > > Regarding the tests, I think we could add another test query to cover
    > > the case with no empty grouping sets and degenerate HAVING clauses.
    > > This way, all cases for the HAVING pushdown optimization with grouping
    > > sets should be covered.
    >
    > > I've added such a test in v5, along with a commit message.  Nothing
    > > else has changed.  I'll push this patch soon, barring any objections.
    
    > v5 LGTM.
    
    Cool!  I've pushed and back-patched v5.  Thanks for working on this
    patch.
    
    邱宇航, thanks for the report -- it's a good one.  This issue should
    be fixed in 18.1, which is scheduled for release in November (the
    13th, I believe).
    
    - Richard
    
    
    
    
  27. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-21T04:12:30Z

    On Tue, Oct 21, 2025 at 12:58 PM Richard Guo <guofenglinux@gmail.com> wrote:
    > Cool!  I've pushed and back-patched v5.  Thanks for working on this
    > patch.
    
    Oops, I made a mistake in the test case for v18.  Fixing it now…
    
    - Richard
    
    
    
    
  28. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-10-21T04:18:16Z

    Richard Guo <guofenglinux@gmail.com> writes:
    > Oops, I made a mistake in the test case for v18.  Fixing it now…
    
    Bleah.  I pretty much don't ever commit things into back branches
    without running regression tests there as well as in master.
    Postgres is a moving target.
    
    			regards, tom lane
    
    
    
    
  29. Re: Inconsistent Behavior of GROUP BY ROLLUP in v17 vs master

    Richard Guo <guofenglinux@gmail.com> — 2025-10-21T05:26:19Z

    On Tue, Oct 21, 2025 at 1:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Richard Guo <guofenglinux@gmail.com> writes:
    > > Oops, I made a mistake in the test case for v18.  Fixing it now…
    
    > Bleah.  I pretty much don't ever commit things into back branches
    > without running regression tests there as well as in master.
    > Postgres is a moving target.
    
    Indeed.  I should have always stayed alert to the possibility that
    test outputs might differ across branches, even for simple queries.
    But somehow, it slipped through this time.
    
    I just pushed a fix to v18 and am now staring at the buildfarm
    nervously.  Fingers crossed this time.
    
    - Richard