Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

Laurenz Albe <laurenz.albe@cybertec.at>

From: Laurenz Albe <laurenz.albe@cybertec.at>
To: Tom Lane <tgl@sss.pgh.pa.us>, Richard Guo <guofenglinux@gmail.com>
Cc: Jian Guo <gjian@vmware.com>, Tomas Vondra <tomas.vondra@enterprisedb.com>, Hans Buschmann <buschmann@nidsa.net>, "pgsql-hackers@lists.postgresql.org" <pgsql-hackers@lists.postgresql.org>, Zhenghua Lyu <zlyu@vmware.com>
Date: 2023-11-17T03:53:31Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Allow examine_simple_variable() to work on INSERT RETURNING Vars.

  2. Extract column statistics from CTE references, if possible.

  3. Remove SQL regression tests for GUCs related to NO_SHOW_ALL

On Thu, 2023-11-16 at 22:38 -0500, Tom Lane wrote:
> That line of argument also leads to the conclusion that it'd be
> okay to expose info about the ordering of the CTE result to the
> upper planner.  [...]  The fence is sort of one-way
> in this line of thinking: information can propagate up to the outer
> planner level, but not down into the CTE plan.
> 
> Thoughts?

That agrees with my intuition about MATERIALIZED CTEs.
I think of them as "first calculate the CTE, then calculate the
rest of the query" or an ad-hoc temporary table for the duration
of a query.  I would expect the upper planner to know estimates
and other data about the result of the CTE.

Yours,
Laurenz Albe