Re: Memoize ANTI and SEMI JOIN inner
David Rowley <dgrowleyml@gmail.com>
From: David Rowley <dgrowleyml@gmail.com>
To: Andrei Lepikhov <lepihov@gmail.com>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2025-03-20T06:02:16Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Enable use of Memoize for ANTI joins
- 0da29e4cb161 19 (unreleased) landed
Attachments
- memoize_semi_and_anti_joins_experiment.patch (application/octet-stream) patch
On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov <lepihov@gmail.com> wrote: > How can we be sure that semi or anti-join needs only one tuple? I think > it would be enough to control the absence of join clauses and filters in > the join. Unfortunately, we only have such a guarantee in the plan > creation stage (maybe even setrefs.c). So, it seems we need to invent an > approach like AlternativeSubplan. I suggest looking at what 9e215378d did. You might be able to also allow semi and anti-joins providing the cache keys cover the entire join condition. I think this might be safe as Nested Loop will only ask its inner subnode for the first match before skipping to the next outer row and with anti-join, there's no reason to look for additional rows after the first. Semi-join and unique joins do the same thing in nodeNestloop.c. To save doing additional checks at run-time, the code does: nlstate->js.single_match = (node->join.inner_unique || node->join.jointype == JOIN_SEMI); For making this work, I think the attached should be about the guts of the code changes. I didn't look at the comments. Right now I can't think of any reason why this can't be done, but some experimentation might reveal some reason that it can't. David