Re: Making Vars outer-join aware

David G. Johnston <david.g.johnston@gmail.com>

From: "David G. Johnston" <david.g.johnston@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Hans Buschmann <buschmann@nidsa.net>, Richard Guo <guofenglinux@gmail.com>, Pg Hackers <pgsql-hackers@lists.postgresql.org>, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2023-01-24T19:47:53Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

On Tue, Jan 24, 2023 at 12:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > Hans Buschmann <buschmann@nidsa.net> writes:
> >> I just noticed your new efforts in this area.
> >> I wanted to recurr to my old thread [1] considering constant
> propagation of quals.
> >> [1]
> https://www.postgresql.org/message-id/1571413123735.26467@nidsa.net
>
> > Yeah, this patch series is not yet quite up to the point of improving
> > that.  That area is indeed the very next thing I want to work on, and
> > I did spend some effort on it last month, but I ran out of time to get
> > it working.  Maybe we'll have something there for v17.
>
> BTW, to clarify what's going on there: what I want to do is allow
> the regular equivalence-class machinery to handle deductions from
> equality operators appearing in LEFT JOIN ON clauses (maybe full
> joins too, but I'd be satisfied if it works for one-sided outer
> joins).  I'd originally hoped that distinguishing pre-nulled from
> post-nulled variables would be enough to make that safe, but it's
> not.  Here's an example:
>
>         select ... from t1 left join t2 on (t1.x = t2.y and t1.x = 1);
>
> If we turn the generic equivclass.c logic loose on these clauses,
> it will deduce t2.y = 1, which is good, and then apply t2.y = 1 at
> the scan of t2, which is even better (since we might be able to turn
> that into an indexscan qual).  However, it will also try to apply
> t1.x = 1 at the scan of t1, and that's just wrong, because that
> will eliminate t1 rows that should come through with null extension.
>
>
Is there a particular comment or README where that last conclusion is
explained so that it makes sense.  Intuitively, I would expect t1.x = 1 to
be applied during the scan of t1 - it isn't like the output of the join is
allowed to include t1 rows not matching that condition anyway.

IOW, I thought the more verbose but equivalent syntax for that was:

select ... from (select * from t1 as insub where insub.x = 1) as t1 left
join t2 on (t1.x  = t2.y)

Thanks!

David J.