Re: Making Vars outer-join aware

Richard Guo <guofenglinux@gmail.com>

From: Richard Guo <guofenglinux@gmail.com>
To: Justin Pryzby <pryzby@telsasoft.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers@lists.postgresql.org, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2023-02-13T07:33:15Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

Attachments

On Mon, Feb 13, 2023 at 7:58 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

> The patch broke this query:
>
> select from pg_inherits inner join information_schema.element_types
> right join (select from pg_constraint as sample_2) on true
> on false, lateral (select scope_catalog, inhdetachpending from
> pg_publication_namespace limit 3);
> ERROR:  could not devise a query plan for the given query


Thanks for the report!  I've looked at it a little bit and traced down
to function have_unsafe_outer_join_ref().  The comment there says

 * In practice, this test never finds a problem ...
 * ...
 * It still seems worth checking
 * as a backstop, but we don't go to a lot of trouble: just reject if the
 * unsatisfied part includes any outer-join relids at all.

This seems not correct as showed by the counterexample.  ISTM that we
need to do the check honestly as what the other comment says

 * If the parameterization is only partly satisfied by the outer rel,
 * the unsatisfied part can't include any outer-join relids that could
 * null rels of the satisfied part.

The NOT_USED part of code is doing this check.  But I think we need a
little tweak.  We should check the nullable side of related outer joins
against the satisfied part, rather than inner_paramrels.  Maybe
something like attached.

However, this test seems to cost some cycles after the change.  So I
wonder if it's worthwhile to perform it, considering that join order
restrictions should be able to guarantee there is no problem here.

BTW, here is a simplified query that can trigger this issue on HEAD.

select * from t1 inner join t2 left join (select null as c from t3 left
join t4 on true) as sub on true on true, lateral (select c, t1.a from t5
offset 0 ) ss;

Thanks
Richard