Re: Making Vars outer-join aware
Tom Lane <tgl@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Richard Guo <guofenglinux@gmail.com>
Cc: Pg Hackers <pgsql-hackers@lists.postgresql.org>,
"Finnerty,
Jim" <jfinnert@amazon.com>
Date: 2022-11-16T20:46:44Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Re-allow INDEX_VAR as rt_index in ChangeVarNodes().
- fbf80421ead5 16.0 landed
-
Fix thinkos in have_unsafe_outer_join_ref; reduce to Assert check.
- f50f029c497d 16.0 landed
-
Invent "join domains" to replace the below_outer_join hack.
- 3bef56e11650 16.0 landed
-
Do assorted mop-up in the planner.
- b448f1c8d83f 16.0 landed
-
Make Vars be outer-join-aware.
- 2489d76c4906 16.0 landed
-
Invent "multibitmapsets", and use them to speed up antijoin detection.
- e9e26b5e7166 16.0 landed
-
Add basic regression tests for semi/antijoin recognition.
- 0043aa6b8597 16.0 landed
-
Improve performance of adjust_appendrel_attrs_multilevel.
- 2f17b57017e5 16.0 landed
-
Refactor addition of PlaceHolderVars to joinrel targetlists.
- afa0ec30bfd1 16.0 landed
-
Use an explicit state flag to control PlaceHolderInfo creation.
- b3ff6c742f6c 16.0 landed
-
Make PlaceHolderInfo lookup O(1).
- 6569ca43973b 16.0 landed
Richard Guo <guofenglinux@gmail.com> writes:
> I'm reviewing the part about multiple version clauses, and I find a case
> that may not work as expected. I tried with some query as below
> (A leftjoin (B leftjoin C on (Pbc)) on (Pab)) left join D on (Pcd)
> Assume Pbc is strict for B and Pcd is strict for C.
> According to identity 3, we know one of its equivalent form is
> ((A leftjoin B on (Pab)) leftjoin C on (Pbc)) left join D on (Pcd)
> For outer join clause Pcd, we would generate two versions from the first
> form
> Version 1: C Vars with nullingrels as {A/B}
> Version 2: C Vars with nullingrels as {B/C, A/B}
> I understand version 2 is reasonable as the nullingrels from parser
> would be set as that. But it seems version 1 is not applicable in
> either form.
Hmm. Looking at the data structures generated for the first form,
we have
B/C join:
{SPECIALJOININFO
:min_lefthand (b 2)
:min_righthand (b 3)
:syn_lefthand (b 2)
:syn_righthand (b 3)
:jointype 1
:ojrelid 4
:commute_above_l (b 7)
:commute_above_r (b 5)
:commute_below (b)
A/B join:
{SPECIALJOININFO
:min_lefthand (b 1)
:min_righthand (b 2)
:syn_lefthand (b 1)
:syn_righthand (b 2 3 4)
:jointype 1
:ojrelid 5
:commute_above_l (b)
:commute_above_r (b)
:commute_below (b 4)
everything-to-D join:
{SPECIALJOININFO
:min_lefthand (b 1 2 3 4 5)
:min_righthand (b 6)
:syn_lefthand (b 1 2 3 4 5)
:syn_righthand (b 6)
:jointype 1
:ojrelid 7
:commute_above_l (b)
:commute_above_r (b)
:commute_below (b 4)
So we've marked the 4 and 7 joins as possibly commuting, but they
cannot commute according to 7's min_lefthand set. I don't think
the extra clone condition is terribly harmful --- it's useless
but shouldn't cause any problems. However, if these joins should be
able to commute then the min_lefthand marking is preventing us
from considering legal join orders (and has been doing so all along,
that's not new in this patch). It looks to me like they should be
able to commute (giving your third form), so this is a pre-existing
planning deficiency.
Without having looked too closely, I suspect this is coming from
the delay_upper_joins/check_outerjoin_delay stuff in initsplan.c.
That's a chunk of logic that I'd like to nuke altogether, and maybe
we will be able to do so once this patchset is a bit further along.
But I've not had time to look at it yet.
I'm not entirely clear on whether the strange selection of clone
clauses for this example is a bug in process_postponed_left_join_quals
or if that function is just getting misled by the bogus min_lefthand
value.
> Looking at the two forms again, it seems the expected two versions for
> Pcd should be
> Version 1: C Vars with nullingrels as {B/C}
> Version 2: C Vars with nullingrels as {B/C, A/B}
> With this we may have another problem that the two versions are both
> applicable at the C/D join according to clause_is_computable_at(), in
> both forms.
At least when I tried it just now, clause_is_computable_at correctly
rejected the first version, because we've already computed A/B when
we are trying to form the C/D join so we expect it to be listed in
varnullingrels.
regards, tom lane