Thread

  1. Wrong results in remove_useless_groupby_columns()

    Richard Guo <guofenglinux@gmail.com> — 2026-05-05T10:12:16Z

    I seem to have fallen into a rabbit-hole around equality-relation
    issues.  I noticed another wrong-result issue caused by collation
    mismatch.  This time it is in remove_useless_groupby_columns().
    
    remove_useless_groupby_columns() uses a relation's unique indexes to
    prove that some GROUP BY columns are functionally dependent on others,
    and so can be dropped from the GROUP BY clause.  The match between
    index columns and GROUP BY columns is done by attno alone.
    
    However, a unique index under one collation does not prove uniqueness
    under another.  For example:
    
    create collation ci (provider = icu, locale = 'und-u-ks-level2',
    deterministic = false);
    
    create table t (a text collate ci not null, b text);
    insert into t values ('foo','X'), ('FOO','Y');
    create unique index on t (a collate "C");
    
    -- wrong results: should be 2 rows
    select a, b from t group by a, b;
      a  | b
    -----+---
     foo | X
    (1 row)
    
    The wrong-result is caused by the planner incorrectly dropping b from
    GROUP BY.
    
    What is more, remove_useless_groupby_columns() does not bother to
    check opfamily either.  A unique index under one opfamily does not
    prove uniqueness under the equality used by GROUP BY when the
    SortGroupClause's eqop comes from a different opfamily.  For example:
    
    create type myrec as (x numeric);
    create table opf_t (a myrec not null, b text);
    create unique index on opf_t (a record_image_ops);
    insert into opf_t values (row(1.0)::myrec, 'X'), (row(1.00)::myrec, 'Y');
    
    -- wrong results: should be 2 rows
    select a, b from opf_t group by a, b;
       a   | b
    -------+---
     (1.0) | X
    (1 row)
    
    ROW(1.0) and ROW(1.00) are bytewise distinct, so record_image_ops
    admits both rows.  But GROUP BY a uses the type's default record_ops,
    under which they are logically equal, so the two rows merge into one
    group.  And again, the planner drops b and a row is silently lost.
    
    Attached patch fixes both issues.
    
    - Richard