Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

Tom Lane <tgl@sss.pgh.pa.us>

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Jian Guo <gjian@vmware.com>
Cc: Tomas Vondra <tomas.vondra@enterprisedb.com>, Hans Buschmann <buschmann@nidsa.net>, "pgsql-hackers@lists.postgresql.org" <pgsql-hackers@lists.postgresql.org>, Zhenghua Lyu <zlyu@vmware.com>
Date: 2023-11-08T22:44:55Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Allow examine_simple_variable() to work on INSERT RETURNING Vars.

  2. Extract column statistics from CTE references, if possible.

  3. Remove SQL regression tests for GUCs related to NO_SHOW_ALL

Attachments

Jian Guo <gjian@vmware.com> writes:
> I found a new approach to fix this issue, which seems better, so I would like to post another version of the patch here. The origin patch made the assumption of the values of Vars from CTE must be unique, which could be very wrong. This patch examines variables for Vars inside CTE, which avoided the bad assumption, so the results could be much more accurate.

You have the right general idea, but there is nothing about this patch
that's right in detail.  The outer Var doesn't refer to any particular
RTE within the subquery; it refers to a targetlist entry.  You have to
drill down to that, see if it's a Var, and if so you can recurse into
the subroot with that Var.  As this stands, it might accidentally get
the right answer for "SELECT * FROM foo" subqueries, but it will get
the wrong answer or even crash for anything that's not that.

The existing RTE_SUBQUERY stanza has most of what we need for this,
so I experimented with extending that to also handle RTE_CTE.  It
seems to work, though I soon found out that it needed tweaking for
the case where the CTE is INSERT/UPDATE/DELETE RETURNING.

Interestingly, this does not change any existing regression test
results.  I'd supposed there might be at least one place with a
visible plan change, but nope.  Running a coverage test does show
that the new code paths are exercised, but I wonder if we ought
to try to devise a regression test that proves it more directly.

			regards, tom lane

PS: please, please, please do not quote the entire damn thread
when replying.  Trim it to just a minimum amount of relevant
text.  You think people want to read all that again?