Re: Extended Statistics set/restore/clear functions.

From: Michael Paquier <michael@paquier.xyz>
To: Corey Huinker <corey.huinker@gmail.com>
Cc: jian he <jian.universality@gmail.com>, Tomas Vondra <tomas@vondra.me>, pgsql-hackers@lists.postgresql.org, tgl@sss.pgh.pa.us
Date: 2025-11-14T06:25:27Z
Lists: pgsql-hackers

Commits

  1. Add test doing some cloning of extended statistics data

  2. Add test for pg_restore_extended_stats() with multiranges

  3. Add support for "mcv" in pg_restore_extended_stats()

  4. Include extended statistics data in pg_dump

  5. Add support for "dependencies" in pg_restore_extended_stats()

  6. Add test for MAINTAIN permission with pg_restore_extended_stats()

  7. Add pg_restore_extended_stats()

  8. Add routine to free MCVList

  9. Improve pg_clear_extended_stats() with incorrect relation/stats combination

  10. Add pg_clear_extended_stats()

  11. Introduce routines to validate and free MVNDistinct and MVDependencies

  12. Fix typo in stat_utils.c

  13. Move attribute statistics functions to stat_utils.c

  14. Improve error messages of input functions for pg_dependencies and pg_ndistinct

  15. Improve test output of extended statistics for ndistinct and dependencies

  16. Fix some compiler warnings

  17. Add input function for data type pg_dependencies

  18. Add input function for data type pg_ndistinct

  19. Rework output format of pg_dependencies

  20. Rework output format of pg_ndistinct

  21. Fix comments of output routines for pg_ndistinct and pg_dependencies

  22. Move code specific to pg_dependencies to new file

  23. Move code specific to pg_ndistinct to new file

  24. Document some structures in attribute_stats.c

  25. Fix FATAL message for invalid recovery timeline at beginning of recovery

On Fri, Nov 14, 2025 at 12:49:23AM -0500, Corey Huinker wrote:
> Negative numbers represent the Nth expression defined in the extended
> statistics object. So if you have extended statistics on a, b, length(a),
> length(b) then you can legally have -1 and -2 in the attributes, but
> nothing lower than that.
>
> See functions pg_ndistinct_validate_items() and
> pg_dependencies_validate_deps() as these check the attributes in the value
> against the definition of the extended stats object.

Exactly.  Extended stats on system columns are rejected because they
don't really make sense.  We want to track correlations between the
attributes defined, and system columns only reflect internal states:
create table poo (a int, b int);
create statistics poos (ndistinct ) ON cmax, a from poo;
ERROR:  0A000: statistics creation on system columns is not supported

Note that the expressions are also stored in pg_stats_ext_exprs.
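
To make the attribute numbering rule quoted above concrete, here is a
minimal sketch of the check in C.  This is not the code of
pg_ndistinct_validate_items(); the function name stat_attnum_is_valid()
and its parameters are hypothetical, and it only assumes what is
described in this thread: positive numbers must match a column of the
statistics object, negative numbers -1..-N refer to its N expressions,
and anything else is invalid.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not PostgreSQL code).
 *
 * attnum       attribute number found in the input value
 * stat_columns column attribute numbers of the statistics object
 * ncolumns     number of entries in stat_columns
 * nexprs       number of expressions of the statistics object
 */
static bool
stat_attnum_is_valid(int attnum, const int *stat_columns,
                     size_t ncolumns, int nexprs)
{
    /* Zero is neither a column nor an expression. */
    if (attnum == 0)
        return false;

    /* Negative numbers refer to the Nth expression: -1 .. -nexprs. */
    if (attnum < 0)
        return attnum >= -nexprs;

    /* Positive numbers must be one of the columns in the definition. */
    for (size_t i = 0; i < ncolumns; i++)
    {
        if (stat_columns[i] == attnum)
            return true;
    }

    return false;
}
```

With statistics on a, b, length(a), length(b), and assuming a and b are
attributes 1 and 2, the accepted values would be 1, 2, -1 and -2, while
0, 3 or -3 would be rejected.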

> I'm trying to implement those test cases, but I may have missed some.

I've found a lot of them with coverage-html during a previous lookup.
I'd like to think that we should aim for something close to 100%
coverage for the two input functions.

> Implemented many, but not all of these suggestions.

Thanks for the new versions, I'll also look at all of these over the
next couple of days.  Probably not at 0005~ for now.
--
Michael