Re: Making Vars outer-join aware

From: Richard Guo <guofenglinux@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Pg Hackers <pgsql-hackers@lists.postgresql.org>, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2022-07-12T07:20:37Z
Lists: pgsql-hackers

Commits

  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

On Mon, Jul 11, 2022 at 3:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Here's v2 of this patch series.  It's functionally identical to v1,
> but I've rebased it over the recent auto-node-support-generation
> changes, and also extracted a few separable bits in hopes of making
> the main planner patch smaller.  (It's still pretty durn large,
> unfortunately.)  Unlike the original submission, each step will
> compile on its own, though the intermediate states mostly don't
> pass all regression tests.


I noticed a change in behavior from before regarding PlaceHolderVar.
Take the query below as an example:

select a.i, ss.jj from a left join (select b.i, b.j + 1 as jj from b) ss
on a.i = ss.i;

Previously, the expression 'b.j + 1' would not be wrapped in a
PlaceHolderVar, since it contains a Var of the subquery and does not
contain any non-strict constructs. With the patch, we insert a
PlaceHolderVar for it in order to have a place to store
varnullingrels. So the plan for the above query now becomes:

# explain (verbose, costs off) select a.i, ss.jj from a left join
(select b.i, b.j + 1 as jj from b) ss on a.i = ss.i;
            QUERY PLAN
----------------------------------
 Hash Right Join
   Output: a.i, ((b.j + 1))
   Hash Cond: (b.i = a.i)
   ->  Seq Scan on public.b
         Output: b.i, (b.j + 1)
   ->  Hash
         Output: a.i
         ->  Seq Scan on public.a
               Output: a.i
(9 rows)

Note that the evaluation of expression 'b.j + 1' now occurs below the
outer join. Is this something we need to be concerned about?
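
For anyone who wants to reproduce this, here is a minimal sketch. The
table definitions are an assumption (the tables are not shown above; any
two tables with integer columns i and j should do). The second query is
added only for comparison: coalesce() is non-strict, so I would expect
that expression to be wrapped in a PlaceHolderVar with or without the
patch.

```sql
-- Assumed table definitions; not taken from the message above.
create table a (i int, j int);
create table b (i int, j int);

-- Strict expression over a subquery Var: per the description above,
-- this gets a PlaceHolderVar only with the patch applied.
explain (verbose, costs off)
select a.i, ss.jj
from a left join (select b.i, b.j + 1 as jj from b) ss on a.i = ss.i;

-- Non-strict expression, for comparison: coalesce() can return a
-- non-null result for a null input, so it needs a PlaceHolderVar
-- to go to null correctly above the outer join.
explain (verbose, costs off)
select a.i, ss.jj
from a left join (select b.i, coalesce(b.j, 1) as jj from b) ss on a.i = ss.i;
```

The join method chosen may differ from the plan above depending on table
contents and statistics; the part to compare is where the subquery
expression shows up in the Output lists.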

Thanks
Richard