Re: Making Vars outer-join aware

From: Tom Lane <tgl@sss.pgh.pa.us>
To: "David G. Johnston" <david.g.johnston@gmail.com>
Cc: Hans Buschmann <buschmann@nidsa.net>, Richard Guo <guofenglinux@gmail.com>, Pg Hackers <pgsql-hackers@lists.postgresql.org>, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2023-01-24T20:25:42Z
Lists: pgsql-hackers

Commits

  1. Re-allow INDEX_VAR as rt_index in ChangeVarNodes().

  2. Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.

  3. Invent "join domains" to replace the below_outer_join hack.

  4. Do assorted mop-up in the planner.

  5. Make Vars be outer-join-aware.

  6. Invent "multibitmapsets", and use them to speed up antijoin detection.

  7. Add basic regression tests for semi/antijoin recognition.

  8. Improve performance of adjust_appendrel_attrs_multilevel.

  9. Refactor addition of PlaceHolderVars to joinrel targetlists.

  10. Use an explicit state flag to control PlaceHolderInfo creation.

  11. Make PlaceHolderInfo lookup O(1).

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tue, Jan 24, 2023 at 12:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> select ... from t1 left join t2 on (t1.x = t2.y and t1.x = 1);
>> 
>> If we turn the generic equivclass.c logic loose on these clauses,
>> it will deduce t2.y = 1, which is good, and then apply t2.y = 1 at
>> the scan of t2, which is even better (since we might be able to turn
>> that into an indexscan qual).  However, it will also try to apply
>> t1.x = 1 at the scan of t1, and that's just wrong, because that
>> will eliminate t1 rows that should come through with null extension.

> Is there a particular comment or README where that last conclusion is
> explained so that it makes sense?

Hm?  It's a LEFT JOIN, so it must not eliminate any rows from t1.
A row that doesn't have t1.x = 1 will appear in the output with
null columns for t2 ... but it must still appear, so we cannot
filter on t1.x = 1 in the scan of t1.
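
To make that concrete, here is a small sketch of the three cases. It is an illustration only: it runs the queries through SQLite via Python's sqlite3 module rather than through PostgreSQL, the table contents are made up, and the two "pushed down" variants are written by hand as subqueries to mimic applying a qual at the scan of one side. Nothing here reflects what equivclass.c actually does internally.

```python
import sqlite3

# Hypothetical two-row tables, chosen so one t1 row fails t1.x = 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INTEGER);
    CREATE TABLE t2 (y INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (1), (2);
""")

# The query as written: both clauses live in the LEFT JOIN's ON condition.
as_written = conn.execute("""
    SELECT t1.x, t2.y
    FROM t1 LEFT JOIN t2 ON (t1.x = t2.y AND t1.x = 1)
    ORDER BY t1.x
""").fetchall()

# Safe: apply the derived qual t2.y = 1 at the scan of t2 (nullable side).
filter_t2 = conn.execute("""
    SELECT t1.x, t2.y
    FROM t1 LEFT JOIN (SELECT * FROM t2 WHERE y = 1) AS t2 ON (t1.x = t2.y)
    ORDER BY t1.x
""").fetchall()

# Wrong: apply t1.x = 1 at the scan of t1 (preserved side).
filter_t1 = conn.execute("""
    SELECT t1.x, t2.y
    FROM (SELECT * FROM t1 WHERE x = 1) AS t1 LEFT JOIN t2 ON (t1.x = t2.y)
    ORDER BY t1.x
""").fetchall()

print(as_written)  # [(1, 1), (2, None)]
print(filter_t2)   # [(1, 1), (2, None)]
print(filter_t1)   # [(1, 1)]
```

The row with t1.x = 2 comes through null-extended in the first two results and goes missing in the third, which is the row the LEFT JOIN is obliged to keep.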

			regards, tom lane