Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
Tom Lane <tgl@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Jian Guo <gjian@vmware.com>
Cc: Tomas Vondra <tomas.vondra@enterprisedb.com>,
Hans Buschmann <buschmann@nidsa.net>,
"pgsql-hackers@lists.postgresql.org"
<pgsql-hackers@lists.postgresql.org>,
Zhenghua Lyu <zlyu@vmware.com>
Date: 2023-11-08T22:44:55Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Allow examine_simple_variable() to work on INSERT RETURNING Vars.
- 89b69db82adf 17.0 landed
-
Extract column statistics from CTE references, if possible.
- f7816aec23ee 17.0 landed
-
Remove SQL regression tests for GUCs related to NO_SHOW_ALL
- dbe8a1726cfd 15.3 cited
Attachments
- v2-0001-Examine-simple-variable-for-Var-in-CTE.patch (text/x-diff) patch v2-0001
Jian Guo <gjian@vmware.com> writes: > I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch here. The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong. This patch examines variables for Vars inside CTE, which avoided the bad assumption, so the results could be much more accurate. You have the right general idea, but there is nothing about this patch that's right in detail. The outer Var doesn't refer to any particular RTE within the subquery; it refers to a targetlist entry. You have to drill down to that, see if it's a Var, and if so you can recurse into the subroot with that Var. As this stands, it might accidentally get the right answer for "SELECT * FROM foo" subqueries, but it will get the wrong answer or even crash for anything that's not that. The existing RTE_SUBQUERY stanza has most of what we need for this, so I experimented with extending that to also handle RTE_CTE. It seems to work, though I soon found out that it needed tweaking for the case where the CTE is INSERT/UPDATE/DELETE RETURNING. Interestingly, this does not change any existing regression test results. I'd supposed there might be at least one place with a visible plan change, but nope. Running a coverage test does show that the new code paths are exercised, but I wonder if we ought to try to devise a regression test that proves it more directly. regards, tom lane PS: please, please, please do not quote the entire damn thread when replying. Trim it to just a minimum amount of relevant text. You think people want to read all that again?