Re: Making Vars outer-join aware

Tom Lane <tgl@sss.pgh.pa.us>

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Justin Pryzby <pryzby@telsasoft.com>, pgsql-hackers@lists.postgresql.org, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2023-02-13T16:50:17Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

Richard Guo <guofenglinux@gmail.com> writes:
> Thanks for the report!  I've looked at it a little bit and traced down
> to function have_unsafe_outer_join_ref().  The comment there says
>  * In practice, this test never finds a problem ...
> This seems not correct as showed by the counterexample.

Right.  I'd noticed that the inner loop of that function was not
reached in our regression tests, and incorrectly concluded that it
was not reachable --- but I failed to consider cases where the
inner rel's parameterization depends on Vars from multiple places.

> The NOT_USED part of code is doing this check.  But I think we need a
> little tweak.  We should check the nullable side of related outer joins
> against the satisfied part, rather than inner_paramrels.  Maybe
> something like attached.

Agreed.

> However, this test seems to cost some cycles after the change.  So I
> wonder if it's worthwhile to perform it, considering that join order
> restrictions should be able to guarantee there is no problem here.

Yeah, I think we should reduce it to an Assert check.  It shouldn't be
worth the cycles to run in production, and that will also make it easier
to notice in sqlsmith testing if anyone happens across another
counterexample.

Pushed that way.  Thanks for the report and the patch!

			regards, tom lane