Re: Extended Statistics set/restore/clear functions.
Chao Li <li.evan.chao@gmail.com>
Commits
-
Add test doing some cloning of extended statistics data
- fc365e4fccc4 19 (unreleased) landed
-
Add test for pg_restore_extended_stats() with multiranges
- 0b7beec42ae2 19 (unreleased) landed
-
Add support for "mcv" in pg_restore_extended_stats()
- efbebb4e8587 19 (unreleased) landed
-
Include extended statistics data in pg_dump
- c32fb29e979d 19 (unreleased) landed
-
Add support for "dependencies" in pg_restore_extended_stats()
- 302879bd68d1 19 (unreleased) landed
-
Add test for MAINTAIN permission with pg_restore_extended_stats()
- d9abd9e1050d 19 (unreleased) landed
-
Add pg_restore_extended_stats()
- 0e80f3f88dea 19 (unreleased) landed
-
Add routine to free MCVList
- 7ebb64c55757 19 (unreleased) landed
-
Improve pg_clear_extended_stats() with incorrect relation/stats combination
- 395b73c045e0 19 (unreleased) landed
-
Add pg_clear_extended_stats()
- d756fa1019ff 19 (unreleased) landed
-
Introduce routines to validate and free MVNDistinct and MVDependencies
- 32e27bd32082 19 (unreleased) landed
-
Fix typo in stat_utils.c
- eee19a30d60d 19 (unreleased) landed
-
Move attribute statistics functions to stat_utils.c
- 213a1b895270 19 (unreleased) landed
-
Improve error messages of input functions for pg_dependencies and pg_ndistinct
- f68597ee777d 19 (unreleased) landed
-
Improve test output of extended statistics for ndistinct and dependencies
- 2f04110225ab 19 (unreleased) landed
-
Fix some compiler warnings
- 7bc88c3d6f3a 19 (unreleased) landed
-
Add input function for data type pg_dependencies
- e1405aa5e3ac 19 (unreleased) landed
-
Add input function for data type pg_ndistinct
- 44eba8f06e55 19 (unreleased) landed
-
Rework output format of pg_dependencies
- e76defbcf09e 19 (unreleased) landed
-
Rework output format of pg_ndistinct
- 1f927cce4498 19 (unreleased) landed
-
Fix comments of output routines for pg_ndistinct and pg_dependencies
- 040a39ed25bf 19 (unreleased) landed
-
Move code specific to pg_dependencies to new file
- 2ddc8d9e9baa 19 (unreleased) landed
-
Move code specific to pg_ndistinct to new file
- a5523123430f 19 (unreleased) landed
-
Document some structures in attribute_stats.c
- d6c132d83bff 19 (unreleased) landed
-
Fix FATAL message for invalid recovery timeline at beginning of recovery
- 71f17823ba01 18.0 cited
> On Nov 25, 2025, at 15:28, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Nov 22, 2025 at 03:26:19AM -0500, Corey Huinker wrote:
>> I added a comment debating the feasibility of testing for subsets of
>> attribute sets in pg_dependencies. Basically, I think we can't have the
>> test at all, but I haven't removed it just yet pending consensus.
>
> + * Verify that all attnum sets are a proper subset of the first longest
> + * attnum set.
> + *
> + * TODO:
> + *
> + * I'm fairly certain that because statisticsally insignificant dependency
> + * combinations are not stored, there is a chance that the longest dependency
> + * does not exist, and therefore this test cannot be done. I have left the
> + * test in place for the time being until the issue can be definitively
> + * settled.
>
> As you have already quoted upthread, statext_dependencies_build()
> settles the issue on this one, I think. It is entirely possible that
> any group returned by DependencyGenerator generates a degree value
> that would prevent a given group to be stored, and this could as well
> be the largest possible group there could be in the set. So we cannot
> do any of that for dependencies, unfortunately. We can always rely on
> the list of attributes when assigning the json blob to the stats
> object, at least, cross-checking that each attribute list matches with
> the numbers of the stats object. At least we can check for
> duplicates, which is better than nothing at all.
>
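
To make the two checks mentioned above concrete (no duplicate attributes within an item, and every attribute belonging to the stats object), here is a minimal standalone sketch. It is not code from the patch: the helper name attnums_valid() and its plain int arrays are hypothetical, chosen only for illustration, and the real input function works on its own structures.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: return true if every
 * attribute number in attrs[] appears in stat_keys[] (the attributes of
 * the stats object) and no attribute number is listed twice.
 */
static bool
attnums_valid(const int *attrs, size_t nattrs,
              const int *stat_keys, size_t nkeys)
{
    for (size_t i = 0; i < nattrs; i++)
    {
        bool        found = false;

        /* Reject duplicates within the item's attribute list. */
        for (size_t j = 0; j < i; j++)
        {
            if (attrs[j] == attrs[i])
                return false;
        }

        /* Each attribute must be one of the stats object's keys. */
        for (size_t k = 0; k < nkeys; k++)
        {
            if (stat_keys[k] == attrs[i])
            {
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }
    return true;
}
```

The quadratic scans are deliberate in this sketch: the attribute lists of a stats object are short, so a simple nested loop is easier to read than a sort or a bitmap.
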
> Regarding the suggested check where we'd want to enforce all the
> groups of attributes to be listed depending on the longest set we have
> found, at the end estimate_multivariate_ndistinct() checks the items
> listed one-by-one, giving up if we cannot find something in the list
> of items. I think that I am going to be content with the patch as it
> is, without this piece. Let's add an extra SQL test to treat that as
> valid input, though. So I am feeling OK with the input for ndistinct
> at this stage. I have noticed a couple of issues in passing,
> adjusting them. We are reaching more than 90% of coverage with the
> tests, and I am not sure that we can actually reach the rest except if
> one of the previous steps failed.
>
> So that's one. Now on to the second patch for the input of the
> dependencies.
>
> +SELECT '[{"attributes" : [2], "dependency" : 4, "degree": "NaN"}]'::pg_dependencies;
> +SELECT '[{"attributes" : [2], "dependency" : 4, "degree": "-inf"}]'::pg_dependencies;
> +SELECT '[{"attributes" : [2], "dependency" : 4, "degree": "inf"}]'::pg_dependencies;
> +SELECT '[{"attributes" : [2], "dependency" : 4, "degree": "-inf"}]'::pg_dependencies::text::pg_dependencies;
>
> Okay, I have to admit that these ones are fun. I doubt that anybody
> would actually do that, and these do not produce valid json objects,
> which is what the last case shows. Hmm, it makes sense to keep these,
> and I'm still siding that we should not care too much about applying
> checks on the values and complicate the input function more than that,
> so fine by me.
>
> There were a couple of things in the tests, missing quite a few soft
> errors. Many typos, grammar mistakes in the whole. Also, please do
> not split the error strings into multiple lines to make these
> greppable. There is also no need for a break after a return. In some
> cases, a return was used where a break made more sense as the default
> path returned a failure.
>
> The TODO in build_mvdependencies() could be an elog(), but I have left
> it untouched for the errdetail().
>
> We're reaching 91% of coverage here, not bad. The rest does not seem
> reachable, as far as I can see.
>
> With that said, a v18 for the first two patches with the input
> functions. Comments and/or opinions?
> --
> Michael
> <v18-0001-Add-working-input-function-for-pg_ndistinct.patch><v18-0002-Add-working-input-function-for-pg_dependencies.patch>
I don’t see that any of my comments were addressed in v18.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/