Re: Memoize ANTI and SEMI JOIN inner
Alena Rybakina <a.rybakina@postgrespro.ru>
From: Alena Rybakina <a.rybakina@postgrespro.ru>
To: Andrei Lepikhov <lepihov@gmail.com>
Cc: David Rowley <dgrowleyml@gmail.com>,
PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2025-03-31T02:50:07Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Enable use of Memoize for ANTI joins
- 0da29e4cb161 19 (unreleased) landed
Attachments
- memoize.diff.no-cfbot (text/plain)
- v1-0001-Memoise-the-inner-of-SEMI-and-ANTI-join.patch (text/x-patch) patch v1-0001
I realized that I uploaded my diff file with a small mistake - sorry about that. I've corrected it with this message so your tests can pass in the CI. On 31.03.2025 05:33, Alena Rybakina wrote: > > Hi! > > On 21.03.2025 18:56, Andrei Lepikhov wrote: >> On 20/3/2025 07:02, David Rowley wrote: >>> On Thu, 20 Mar 2025 at 06:16, Andrei Lepikhov <lepihov@gmail.com> >>> wrote: >>>> How can we be sure that semi or anti-join needs only one tuple? I >>>> think >>>> it would be enough to control the absence of join clauses and >>>> filters in >>>> the join. Unfortunately, we only have such a guarantee in the plan >>>> creation stage (maybe even setrefs.c). So, it seems we need to >>>> invent an >>>> approach like AlternativeSubplan. >>> >>> I suggest looking at what 9e215378d did. You might be able to also >>> allow semi and anti-joins providing the cache keys cover the entire >>> join condition. I think this might be safe as Nested Loop will only >>> ask its inner subnode for the first match before skipping to the next >>> outer row and with anti-join, there's no reason to look for additional >>> rows after the first. Semi-join and unique joins do the same thing in >>> nodeNestloop.c. To save doing additional checks at run-time, the code >>> does: >> Thank you for the clue! I almost took the wrong direction. >> I have attached the new patch, which includes corrected comments for >> better clarification of the changes, as well as some additional tests. >> I now feel much more confident about this version since I have >> resolved that concern. >> > > I reviewed your patch and made a couple of suggestions. > > The first change is related to your comment (and the one before it). I > fixed some grammar issues and simplified the wording to make it > clearer and easier to understand. > > The second change involves adding an Assert when generating the > Memoize path. Based on the existing comment and the surrounding logic > (shown below), > I believe it's worth asserting that both inner_unique and single_mode > are not true at the same time — just as a safety check. > /* > * We may do this for SEMI or ANTI joins when they need only one tuple from > * the inner side to produce the result. Following if condition checks that > * rule. > * > * XXX Currently we don't attempt to mark SEMI/ANTI joins as inner_unique > * = true. Should we? See add_paths_to_joinrel() > */ > if(!extra->inner_unique&& (jointype== JOIN_SEMI|| > jointype== JOIN_ANTI)) > single_mode= true; -- Regards, Alena Rybakina Postgres Professional