Re: Pathify RHS unique-ification for semijoin planning
Richard Guo <guofenglinux@gmail.com>
From: Richard Guo <guofenglinux@gmail.com>
To: Álvaro Herrera <alvherre@kurilemu.de>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>,
Tom Lane <tgl@sss.pgh.pa.us>, Andy Fan <zhihuifan1213@163.com>, wenhui qiu <qiuwenhuifx@gmail.com>
Date: 2025-09-02T10:10:23Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Simplify relation_has_unique_index_for()
- bf9ee294e567 19 (unreleased) landed
-
Pathify RHS unique-ification for semijoin planning
- 24225ad9aafc 19 (unreleased) landed
-
Convert varatt.h access macros to static inline functions.
- e035863c9a04 19 (unreleased) cited
-
Re-export a few of createplan.c's make_xxx() functions.
- 570be1f73f38 9.6.0 cited
On Wed, Jul 23, 2025 at 5:11 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > As a very trivial test on this patch, I ran the query in your opening > email, both with and without the patch, scaling up the size of the table > a little bit. > This is a really nice improvement. I think we could find queries that > are arbitrarily faster, by feeding enough tuples to the unnecessary Sort > nodes. FWIW, I'm looking for a query to better showcase the performance improvement from this patch. Here is one I found. create table t (a int, b int); insert into t select i%10, i%10 from generate_series(1,50000) i; create index on t (a, b); analyze t; explain (analyze, costs on) select * from t t1, t t2 where (t1.a, t2.b) in (select a, b from t t3) order by t1.a, t2.b; Here are the planning and execution time on my snail-paced machine (best of 3), without and with this patch. -- without this patch Planning Time: 0.850 ms Execution Time: 108149.907 ms -- with this patch Planning Time: 0.728 ms Execution Time: 29229.748 ms So this specific case runs about 3.7 times faster, which is really nice. - Richard