Re: Extended Statistics set/restore/clear functions.
Michael Paquier <michael@paquier.xyz>
Commits
-
Add test doing some cloning of extended statistics data
- fc365e4fccc4 19 (unreleased) landed
-
Add test for pg_restore_extended_stats() with multiranges
- 0b7beec42ae2 19 (unreleased) landed
-
Add support for "mcv" in pg_restore_extended_stats()
- efbebb4e8587 19 (unreleased) landed
-
Include extended statistics data in pg_dump
- c32fb29e979d 19 (unreleased) landed
-
Add support for "dependencies" in pg_restore_extended_stats()
- 302879bd68d1 19 (unreleased) landed
-
Add test for MAINTAIN permission with pg_restore_extended_stats()
- d9abd9e1050d 19 (unreleased) landed
-
Add pg_restore_extended_stats()
- 0e80f3f88dea 19 (unreleased) landed
-
Add routine to free MCVList
- 7ebb64c55757 19 (unreleased) landed
-
Improve pg_clear_extended_stats() with incorrect relation/stats combination
- 395b73c045e0 19 (unreleased) landed
-
Add pg_clear_extended_stats()
- d756fa1019ff 19 (unreleased) landed
-
Introduce routines to validate and free MVNDistinct and MVDependencies
- 32e27bd32082 19 (unreleased) landed
-
Fix typo in stat_utils.c
- eee19a30d60d 19 (unreleased) landed
-
Move attribute statistics functions to stat_utils.c
- 213a1b895270 19 (unreleased) landed
-
Improve error messages of input functions for pg_dependencies and pg_ndistinct
- f68597ee777d 19 (unreleased) landed
-
Improve test output of extended statistics for ndistinct and dependencies
- 2f04110225ab 19 (unreleased) landed
-
Fix some compiler warnings
- 7bc88c3d6f3a 19 (unreleased) landed
-
Add input function for data type pg_dependencies
- e1405aa5e3ac 19 (unreleased) landed
-
Add input function for data type pg_ndistinct
- 44eba8f06e55 19 (unreleased) landed
-
Rework output format of pg_dependencies
- e76defbcf09e 19 (unreleased) landed
-
Rework output format of pg_ndistinct
- 1f927cce4498 19 (unreleased) landed
-
Fix comments of output routines for pg_ndistinct and pg_dependencies
- 040a39ed25bf 19 (unreleased) landed
-
Move code specific to pg_dependencies to new file
- 2ddc8d9e9baa 19 (unreleased) landed
-
Move code specific to pg_ndistinct to new file
- a5523123430f 19 (unreleased) landed
-
Document some structures in attribute_stats.c
- d6c132d83bff 19 (unreleased) landed
-
Fix FATAL message for invalid recovery timeline at beginning of recovery
- 71f17823ba01 18.0 cited
On Wed, Oct 22, 2025 at 02:55:31PM +0300, Corey Huinker wrote:
>> Do you have some numbers regarding the increase in size this generates
>> for the catalogs?
>
> Sorry, I don't understand. There shouldn't be any increase inside the
> catalogs as the internal storage of the datatypes hasn't changed, so I can
> only conclude that you're referring to something else.

The new format meant more characters; perhaps I've just missed
something while quickly testing the patch. Anyway, that's OK at this
stage.

> The equivalent structures in attribute_stats.c will need documenting too.

Right. This sounds like a separate patch to me, impacting HEAD.

> Right now we have a situation where the vast majority of databases can
> carry forward all of their stats via pg_upgrade, except for those databases
> that have extended stats. The trouble is, most customers don't know if
> their database uses extended statistics or not, and those that do are in
> for some bad query plans if they haven't run vacuumdb --missing-stats-only.
> Explaining that to customers is complicated, especially when most of them
> do not know what extended stats are, let alone whether they have them. It
> would be a lot simpler to just say "all stats are carried over on upgrade",
> and vacuumdb becomes unnecessary, making upgrades one step simpler as well.

Okay.

> Given that, I think that the admittedly ugly transformation is worth it,
> and sequestering it inside pg_dump is the smallest footprint it can have.
> Earlier in this thread I posted some functions that did the translation
> from the existing formats to the proposed new formats. We could include
> those as new system functions, and that would make the dump code very
> simple. Having said that, I don't know that there would be use for those
> functions except inside pg_dump, hence the decision to do the transforms
> right in the dump query.

I'd prefer the new format. One killer argument in favor of the new
format, which you are making upthread, is that it makes viewing,
editing and injecting these stats much easier. It's the part of the
patch where we would need Tomas' input before deciding anything, I
guess, as primary author of the original facilities. My view of the
problem is just one opinion.

> If the format translation is a barrier to fetching existing extended stats,
> then I'd be more inclined to keep the existing pg_ndistinct and
> pg_dependencies data formats as they are now.

Not necessarily. It is also possible to take that in multiple steps
rather than a single one:
- First do the basics in v19 with the new format.
- Raise the bar to older versions.
--
Michael