Re: Making Vars outer-join aware
From: Richard Guo <guofenglinux@gmail.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Pg Hackers <pgsql-hackers@lists.postgresql.org>, "Finnerty, Jim" <jfinnert@amazon.com>
Date: 2022-12-28T08:49:23Z
Lists: pgsql-hackers
Commits
- Re-allow INDEX_VAR as rt_index in ChangeVarNodes(). (fbf80421ead5, landed in 16.0)
- Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check. (f50f029c497d, landed in 16.0)
- Invent "join domains" to replace the below_outer_join hack. (3bef56e11650, landed in 16.0)
- Do assorted mop-up in the planner. (b448f1c8d83f, landed in 16.0)
- Make Vars be outer-join-aware. (2489d76c4906, landed in 16.0)
- Invent "multibitmapsets", and use them to speed up antijoin detection. (e9e26b5e7166, landed in 16.0)
- Add basic regression tests for semi/antijoin recognition. (0043aa6b8597, landed in 16.0)
- Improve performance of adjust_appendrel_attrs_multilevel. (2f17b57017e5, landed in 16.0)
- Refactor addition of PlaceHolderVars to joinrel targetlists. (afa0ec30bfd1, landed in 16.0)
- Use an explicit state flag to control PlaceHolderInfo creation. (b3ff6c742f6c, landed in 16.0)
- Make PlaceHolderInfo lookup O(1). (6569ca43973b, landed in 16.0)

On Tue, Dec 27, 2022 at 11:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The thing that I couldn't get around before is that if you have,
> say, a mergejoinable equality clause in an outer join:
>
> select ... from a left join b on a.x = b.y;
>
> that equality clause can only be associated with the join domain
> for B, because it certainly can't be enforced against A. However,
> you'd still wish to be able to do a mergejoin using indexes on
> a.x and b.y, and this means that we have to understand the ordering
> induced by a PathKey based on this EC as applicable to A, even
> though that relation is not in the same join domain. So there are
> situations where sort orderings apply across domain boundaries even
> though equalities don't. We might have to split the notion of
> EquivalenceClass into two sorts of objects, and somewhere right
> about here is where I realized that this wasn't getting finished
> for v16 :-(.

I think I see where the problem is. Currently, get_eclass_for_sort_expr
always uses the top JoinDomain. So although the equality clause
'a.x = b.y' belongs to JoinDomain {B}, we set up ECs for 'a.x' and 'b.y'
that belong to the top JoinDomain {A, B, A/B}.

But doing so would lead to a situation where the "same" Vars from
different join domains might have the same varnullingrels and thus would
match by equal(). As an example, consider

    select ... from a left join b on a.x = b.y where a.x = 1;

As noted above, we would set up the EC for 'b.y' as belonging to the top
JoinDomain. Then when reconsider_outer_join_clause generates the equality
clause 'b.y = 1', we figure out that the new clause belongs to JoinDomain
{B}. Note that the two 'b.y' here belong to different join domains but
have the same varnullingrels (empty varnullingrels, actually). As a
result, the equality 'b.y = 1' would be merged into the existing EC for
'b.y', because the two 'b.y' match by equal() and we do not check
JoinDomain for non-const EC members. So we would end up with an EC
containing members from different join domains.

And it seems this would make the following statement in the README no
longer hold:

    We don't have to worry about this for Vars (or expressions
    containing Vars), because references to the "same" column from
    different join domains will have different varnullingrels and thus
    won't be equal() anyway.

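The scenario can be sketched with a small toy model. This is not planner code: the classes and the functions match() and process_equality() are hypothetical names invented for illustration, and the matching rule (constants must agree on join domain, other members match by equality alone) is only a paraphrase of the behavior described above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Var:
    rel: str
    col: str
    nullingrels: frozenset = frozenset()  # stand-in for varnullingrels


@dataclass(frozen=True)
class Const:
    value: int


@dataclass
class Member:
    expr: object
    domain: str  # join domain the member was created under


@dataclass
class EC:
    members: list = field(default_factory=list)


def match(ecs, expr, domain):
    """Return the EC holding expr; only constants also compare the domain."""
    for ec in ecs:
        for m in ec.members:
            if m.expr == expr and (not isinstance(expr, Const) or m.domain == domain):
                return ec
    return None


def process_equality(ecs, left, right, domain):
    """Toy merge step for a clause 'left = right' found in the given domain."""
    ec1, ec2 = match(ecs, left, domain), match(ecs, right, domain)
    if ec1 is None and ec2 is None:
        ecs.append(EC([Member(left, domain), Member(right, domain)]))
    elif ec2 is None:
        ec1.members.append(Member(right, domain))
    elif ec1 is None:
        ec2.members.append(Member(left, domain))
    elif ec1 is not ec2:
        ec1.members.extend(ec2.members)
        ecs[:] = [e for e in ecs if e is not ec2]


TOP, B = "{A, B, A/B}", "{B}"

# Sort-expression ECs for a.x and b.y, both created under the top domain.
ecs = [EC([Member(Var("a", "x"), TOP)]), EC([Member(Var("b", "y"), TOP)])]

# Derived clause 'b.y = 1' arrives in domain {B}; b.y has empty nullingrels.
process_equality(ecs, Var("b", "y"), Const(1), B)

# ECs whose members come from more than one join domain.
mixed = [ec for ec in ecs if len({m.domain for m in ec.members}) > 1]
print(len(mixed), sorted({m.domain for m in mixed[0].members}))
```

In this model the 'b.y' created under the top domain absorbs the constant from domain {B}, so one EC mixes the two domains. If the two 'b.y' carried different nullingrels, match() would return nothing for the Var and a separate EC would be created instead, which is the case the README statement relies on.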
Thanks
Richard