Re: Making Vars outer-join aware

From: Richard Guo <guofenglinux@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Pg Hackers <pgsql-hackers@lists.postgresql.org>, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2022-12-27T08:27:49Z
Lists: pgsql-hackers

Commits

  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

On Sat, Dec 24, 2022 at 2:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I shoved some preliminary refactoring into the 0001 patch,
> notably splitting deconstruct_jointree into two passes.
> 0002-0009 cover the same ground as they did before, though
> with some differences in detail.  0010-0012 are new work
> mostly aimed at removing kluges we no longer need.


I'm looking at 0010-0012 and I really like the changes and removals
there.  Thanks for the great work!

For 0010, the change seems quite independent.  I think maybe we can
apply it to HEAD directly.

For 0011, I found that some clauses that were outerjoin_delayed, and thus
not treated as equivalences before, might be treated as equivalences now.
For example:

explain (costs off)
select * from a left join b on a.i = b.i where coalesce(b.j, 0) = 0 and
coalesce(b.j, 0) = a.j;
            QUERY PLAN
----------------------------------
 Hash Right Join
   Hash Cond: (b.i = a.i)
   Filter: (COALESCE(b.j, 0) = 0)
   ->  Seq Scan on b
   ->  Hash
         ->  Seq Scan on a
               Filter: (j = 0)
(7 rows)

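This is different behavior from HEAD: the two coalesce() clauses are now
combined into an equivalence, which yields the derived filter j = 0 on
the scan of a.  I think it's an improvement.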

For 0012, I'm still trying to understand JoinDomain.  AFAIU all members
of the same EC should have the same JoinDomain, because for constants we
match EC members only within the same JoinDomain, and for Vars, if they
come from different join domains they will have different nullingrels
and thus will not match.  So I wonder if we can keep the JoinDomain in
EquivalenceClass rather than in each EquivalenceMember.

Thanks
Richard