Thread

Re: Eager aggregation, take 3

Tender Wang <tndrwang@gmail.com> — 2026-05-31T11:28:37Z
Radim Marek <radim@boringsql.com> 于2026年5月29日周五 23:55写道：
>
> Hey Richard,
>
> I might be out of my depth here - but while testing RegreSQL as correctness/performance harness on PostgreSQL it picked up a problem with the wrong-results case during eager aggregation.
>
> It reproduces on current HEAD (commit 2670cc298f42cd7b1c426bf7ccfb0652d8e0b347 now) with enable_eager_aggregate enabled.
>
> My testing environment
>   - Linux aarch64, gcc 12 (Debian)
>   - macOS arm64, Apple clang 21
>     (PostgreSQL 19devel on aarch64-apple-darwin25.5.0)
>
> == How to reproduce
>
>   CREATE TEMP TABLE c(id int, country text);
>   CREATE TEMP TABLE o(customer_id int);
>   INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE');
>   INSERT INTO o VALUES (1),(3);   -- only customers 1 and 3 have a row in o
>
>   SELECT c.country, count(*) AS n
>   FROM c
>   WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
>   GROUP BY c.country
>   ORDER BY c.country;
>
> Expected results (everywhere except master)
>
>  country | n
> ---------+---
>  DE      | 2
>  US      | 1
> (2 rows)
>
> The actual result with enable_eager_aggregate = on (default)
>
>  country | n
> ---------+---
>  DE      | 0
>  US      | 0
> (2 rows)
>
> With SET enable_eager_aggregate = off, the result is correct (DE=2, US=1), as it is on PG18.
>
> Query Plan
>
>                                                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=108.19..108.69 rows=200 width=40) (actual time=0.195..0.197 rows=2.00 loops=1)
>    Sort Key: c.country
>    Sort Method: quicksort  Memory: 25kB
>    Buffers: local hit=2
>    ->  Finalize HashAggregate  (cost=98.55..100.55 rows=200 width=40) (actual time=0.183..0.186 rows=2.00 loops=1)
>          Group Key: c.country
>          Batches: 1  Memory Usage: 32kB
>          Buffers: local hit=2
>          ->  Hash Anti Join  (cost=52.75..95.37 rows=635 width=40) (actual time=0.177..0.179 rows=3.00 loops=1)
>                Hash Cond: (c.id = o.customer_id)
>                Buffers: local hit=2
>                ->  Seq Scan on c  (cost=0.00..22.70 rows=1270 width=36) (actual time=0.024..0.025 rows=5.00 loops=1)
>                      Buffers: local hit=1
>                ->  Hash  (cost=50.25..50.25 rows=200 width=12) (actual time=0.145..0.146 rows=2.00 loops=1)
>                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
>                      Buffers: local hit=1
>                      ->  Partial HashAggregate  (cost=48.25..50.25 rows=200 width=12) (actual time=0.122..0.123 rows=2.00 loops=1)
>                            Group Key: o.customer_id
>                            Batches: 1  Memory Usage: 32kB
>                            Buffers: local hit=1
>                            ->  Seq Scan on o  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.002..0.003 rows=2.00 loops=1)
>                                  Buffers: local hit=1
>  Planning Time: 0.294 ms
>  Execution Time: 0.255 ms
> (24 rows)
>
> If this is already known or in progress, apologies for the noise.
Thanks for the report. This is a bug.
When we use eager_agg, it can reduce many tuples before doing a join
on the partial agg side.
After partial agg, when we are doing a join,  the matched rows will be
significantly reduced.
This is also the effect we want to achieve from eager_agg.

But we should be careful about anti-join. Because we will ignore the
matched row. The aggregate of unmatched rows seems wrong.
And I can get the wrong results from the semi-join, too.
For example:
postgres=# CREATE TEMP TABLE c(id int, country text);
CREATE TEMP TABLE o(customer_id int);
INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE');
INSERT INTO o VALUES (1),(3);
CREATE TABLE
CREATE TABLE
INSERT 0 5
INSERT 0 2
postgres=# insert into o values (1);
INSERT 0 1
-- correct result
postgres=#  SELECT c.country, count(*) AS n
FROM c
WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
GROUP BY c.country
ORDER BY c.country;
 country | n
---------+---
 DE      | 1
 US      | 1
(2 rows)

I do some hacks that make the cost of the path created in
make_grouped_join_rel() very small.
So we can get a partial agg plan, as follow:

postgres=# explain SELECT c.country, count(*) AS n
FROM c
WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
GROUP BY c.country
ORDER BY c.country;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=31.56..38.32 rows=200 width=40)
   Group Key: c.country
   ->  Sort  (cost=31.56..33.15 rows=635 width=40)
         Sort Key: c.country
         ->  Hash Semi Join  (cost=1.00..2.00 rows=635 width=40)
               Hash Cond: (c.id = o.customer_id)
               ->  Seq Scan on c  (cost=0.00..22.70 rows=1270 width=36)
               ->  Hash  (cost=200.91..200.91 rows=200 width=12)
                     ->  Partial GroupAggregate  (cost=179.78..200.91
rows=200 width=12)
                           Group Key: o.customer_id
                           ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
                                 Sort Key: o.customer_id
                                 ->  Seq Scan on o  (cost=0.00..35.50
rows=2550 width=4)
(13 rows)

postgres=#  SELECT c.country, count(*) AS n
FROM c
WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id)
GROUP BY c.country
ORDER BY c.country;
 country | n
---------+---
 DE      | 1
 US      | 2
(2 rows)

You can see that the count(us) has 2. Because partial agg
pre-aggregates the results for country =1.
However, for the semantics of semi-join, it returns once a match is found.

I haven't thought about it too deeply yet. Maybe we can do something
in the make_grouped_join_rel().
...
if (sjinfo->jointype == JOIN_ANTI || sjinfo->jointype == JOIN_SEMI)
    return;
...
The fixes above can temporarily resolve these issues. But it seems too strict.

-- 
Thanks,
Tender Wang