Thread
-
Re: Eager aggregation, take 3
Tender Wang <tndrwang@gmail.com> — 2026-05-31T11:28:37Z
Radim Marek <radim@boringsql.com> 于2026年5月29日周五 23:55写道: > > Hey Richard, > > I might be out of my depth here - but while testing RegreSQL as correctness/performance harness on PostgreSQL it picked up a problem with the wrong-results case during eager aggregation. > > It reproduces on current HEAD (commit 2670cc298f42cd7b1c426bf7ccfb0652d8e0b347 now) with enable_eager_aggregate enabled. > > My testing environment > - Linux aarch64, gcc 12 (Debian) > - macOS arm64, Apple clang 21 > (PostgreSQL 19devel on aarch64-apple-darwin25.5.0) > > == How to reproduce > > CREATE TEMP TABLE c(id int, country text); > CREATE TEMP TABLE o(customer_id int); > INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE'); > INSERT INTO o VALUES (1),(3); -- only customers 1 and 3 have a row in o > > SELECT c.country, count(*) AS n > FROM c > WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id) > GROUP BY c.country > ORDER BY c.country; > > Expected results (everywhere except master) > > country | n > ---------+--- > DE | 2 > US | 1 > (2 rows) > > The actual result with enable_eager_aggregate = on (default) > > country | n > ---------+--- > DE | 0 > US | 0 > (2 rows) > > With SET enable_eager_aggregate = off, the result is correct (DE=2, US=1), as it is on PG18. > > Query Plan > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------- > Sort (cost=108.19..108.69 rows=200 width=40) (actual time=0.195..0.197 rows=2.00 loops=1) > Sort Key: c.country > Sort Method: quicksort Memory: 25kB > Buffers: local hit=2 > -> Finalize HashAggregate (cost=98.55..100.55 rows=200 width=40) (actual time=0.183..0.186 rows=2.00 loops=1) > Group Key: c.country > Batches: 1 Memory Usage: 32kB > Buffers: local hit=2 > -> Hash Anti Join (cost=52.75..95.37 rows=635 width=40) (actual time=0.177..0.179 rows=3.00 loops=1) > Hash Cond: (c.id = o.customer_id) > Buffers: local hit=2 > -> Seq Scan on c (cost=0.00..22.70 rows=1270 width=36) (actual time=0.024..0.025 rows=5.00 loops=1) > Buffers: local hit=1 > -> Hash (cost=50.25..50.25 rows=200 width=12) (actual time=0.145..0.146 rows=2.00 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 9kB > Buffers: local hit=1 > -> Partial HashAggregate (cost=48.25..50.25 rows=200 width=12) (actual time=0.122..0.123 rows=2.00 loops=1) > Group Key: o.customer_id > Batches: 1 Memory Usage: 32kB > Buffers: local hit=1 > -> Seq Scan on o (cost=0.00..35.50 rows=2550 width=4) (actual time=0.002..0.003 rows=2.00 loops=1) > Buffers: local hit=1 > Planning Time: 0.294 ms > Execution Time: 0.255 ms > (24 rows) > > If this is already known or in progress, apologies for the noise. Thanks for the report. This is a bug. When we use eager_agg, it can reduce many tuples before doing a join on the partial agg side. After partial agg, when we are doing a join, the matched rows will be significantly reduced. This is also the effect we want to achieve from eager_agg. But we should be careful about anti-join. Because we will ignore the matched row. The aggregate of unmatched rows seems wrong. And I can get the wrong results from the semi-join, too. For example: postgres=# CREATE TEMP TABLE c(id int, country text); CREATE TEMP TABLE o(customer_id int); INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE'); INSERT INTO o VALUES (1),(3); CREATE TABLE CREATE TABLE INSERT 0 5 INSERT 0 2 postgres=# insert into o values (1); INSERT 0 1 -- correct result postgres=# SELECT c.country, count(*) AS n FROM c WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id) GROUP BY c.country ORDER BY c.country; country | n ---------+--- DE | 1 US | 1 (2 rows) I do some hacks that make the cost of the path created in make_grouped_join_rel() very small. So we can get a partial agg plan, as follow: postgres=# explain SELECT c.country, count(*) AS n FROM c WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id) GROUP BY c.country ORDER BY c.country; QUERY PLAN ----------------------------------------------------------------------------------------- Finalize GroupAggregate (cost=31.56..38.32 rows=200 width=40) Group Key: c.country -> Sort (cost=31.56..33.15 rows=635 width=40) Sort Key: c.country -> Hash Semi Join (cost=1.00..2.00 rows=635 width=40) Hash Cond: (c.id = o.customer_id) -> Seq Scan on c (cost=0.00..22.70 rows=1270 width=36) -> Hash (cost=200.91..200.91 rows=200 width=12) -> Partial GroupAggregate (cost=179.78..200.91 rows=200 width=12) Group Key: o.customer_id -> Sort (cost=179.78..186.16 rows=2550 width=4) Sort Key: o.customer_id -> Seq Scan on o (cost=0.00..35.50 rows=2550 width=4) (13 rows) postgres=# SELECT c.country, count(*) AS n FROM c WHERE EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id) GROUP BY c.country ORDER BY c.country; country | n ---------+--- DE | 1 US | 2 (2 rows) You can see that the count(us) has 2. Because partial agg pre-aggregates the results for country =1. However, for the semantics of semi-join, it returns once a match is found. I haven't thought about it too deeply yet. Maybe we can do something in the make_grouped_join_rel(). ... if (sjinfo->jointype == JOIN_ANTI || sjinfo->jointype == JOIN_SEMI) return; ... The fixes above can temporarily resolve these issues. But it seems too strict. -- Thanks, Tender Wang