Re: Extended Statistics set/restore/clear functions.

Corey Huinker <corey.huinker@gmail.com>

From: Corey Huinker <corey.huinker@gmail.com>
To: jian he <jian.universality@gmail.com>
Cc: Tomas Vondra <tomas@vondra.me>, pgsql-hackers@lists.postgresql.org
Date: 2025-01-29T20:04:40Z
Lists: pgsql-hackers

Commits

  1. Add test doing some cloning of extended statistics data

  2. Add test for pg_restore_extended_stats() with multiranges

  3. Add support for "mcv" in pg_restore_extended_stats()

  4. Include extended statistics data in pg_dump

  5. Add support for "dependencies" in pg_restore_extended_stats()

  6. Add test for MAINTAIN permission with pg_restore_extended_stats()

  7. Add pg_restore_extended_stats()

  8. Add routine to free MCVList

  9. Improve pg_clear_extended_stats() with incorrect relation/stats combination

  10. Add pg_clear_extended_stats()

  11. Introduce routines to validate and free MVNDistinct and MVDependencies

  12. Fix typo in stat_utils.c

  13. Move attribute statistics functions to stat_utils.c

  14. Improve error messages of input functions for pg_dependencies and pg_ndistinct

  15. Improve test output of extended statistics for ndistinct and dependencies

  16. Fix some compiler warnings

  17. Add input function for data type pg_dependencies

  18. Add input function for data type pg_ndistinct

  19. Rework output format of pg_dependencies

  20. Rework output format of pg_ndistinct

  21. Fix comments of output routines for pg_ndistinct and pg_dependencies

  22. Move code specific to pg_dependencies to new file

  23. Move code specific to pg_ndistinct to new file

  24. Document some structures in attribute_stats.c

  25. Fix FATAL message for invalid recovery timeline at beginning of recovery

On Wed, Jan 29, 2025 at 2:50 AM jian he <jian.universality@gmail.com> wrote:

> hi.
>
> select '{"1, 0B100101":"NaN"}'::pg_ndistinct;
>       pg_ndistinct
> ------------------------
>  {"1, 37": -2147483648}
> (1 row)
>

My initial reaction is to simply reject those special values, but
I'll look into the parsing code to see what can be done.
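
As a rough illustration of what "refusing" could look like, here is a standalone sketch, not the patch's code. It assumes the value ends up stored as an integer (as the -2147483648 output above suggests); parse_ndistinct_value is a hypothetical name, and the real input function would report the failure through the soft-error machinery rather than return a bool.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse the value part of a
 * pg_ndistinct item, accepting only finite numbers that fit in an int.
 * Returns true and sets *out on success, false otherwise.
 */
static bool
parse_ndistinct_value(const char *str, int *out)
{
    char       *end;
    double      val;

    assert(str != NULL && out != NULL);

    errno = 0;
    val = strtod(str, &end);

    /* Nothing was parsed, or there is trailing junk after the number. */
    if (end == str || *end != '\0')
        return false;

    /* Overflow or underflow reported by strtod(). */
    if (errno == ERANGE)
        return false;

    /* strtod() accepts "nan", "inf" and "infinity"; refuse them here. */
    if (!isfinite(val))
        return false;

    /* Range-check first: converting an out-of-range double is undefined. */
    if (val < (double) INT_MIN || val > (double) INT_MAX)
        return false;

    *out = (int) val;
    return true;
}
```

With this check, "NaN", "inf" and "-inf" are rejected up front instead of being converted to whatever integer the platform happens to produce.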

> this is not what we expected?
> For the VALUE part of pg_ndistinct, float8 has 3 special values: inf,
> -inf, NaN.
>
> For the key part of pg_ndistinct, see example.
> select '{"1, 16\t":"1"}'::pg_ndistinct;
> here \t is not tab character, ascii 9. it's two characters: backslash
> and character "t".
> so here it should error out?
> (apply this to \n, \r, \b)
>

I don't have a good answer as to what should happen here. Special cases
like this make Tomas' suggestion to change the in/out format more
attractive.
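
For comparison, if the current format were kept, one option would be a strict key parser that accepts only decimal attribute numbers separated by commas. The following is a standalone sketch under that assumption, not the patch's code: parse_attnum_list and ATTNUM_LIMIT are hypothetical names, and allowing a leading minus sign (for expression keys) is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound for an attribute number (fits in a 16-bit integer). */
#define ATTNUM_LIMIT 32767

/*
 * Hypothetical strict parser for a key such as "1, 16": decimal integers
 * separated by a comma and optional spaces. Returns the number of entries
 * stored in attnums[], or -1 if the key contains anything else.
 */
static int
parse_attnum_list(const char *key, int *attnums, int max_attnums)
{
    const char *p = key;
    int         n = 0;

    assert(key != NULL && attnums != NULL);

    for (;;)
    {
        int         sign = 1;
        int         val = 0;
        int         ndigits = 0;

        if (*p == '-')
        {
            sign = -1;
            p++;
        }

        /* Only plain decimal digits; no 0B/0X prefixes, no escapes. */
        while (*p >= '0' && *p <= '9')
        {
            val = val * 10 + (*p - '0');
            if (val > ATTNUM_LIMIT)
                return -1;
            ndigits++;
            p++;
        }

        if (ndigits == 0 || n >= max_attnums)
            return -1;
        attnums[n++] = sign * val;

        if (*p == '\0')
            return n;
        if (*p != ',')
            return -1;          /* e.g. a backslash, 'B', or a trailing space */
        p++;
        while (*p == ' ')
            p++;
    }
}
```

Under these rules both "1, 0B100101" and a key ending in a literal backslash followed by "t" fail, because the first character after the digits is neither a comma nor the end of the string.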

> pg_ndistinct_in(PG_FUNCTION_ARGS)
> ending part should be:
>
>     freeJsonLexContext(lex);
>     if (result == JSON_SUCCESS)
>     {
>         ......
>     }
>     else
>     {
>        ereturn(parse_state.escontext, (Datum) 0,
>                     errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
>                     errmsg("malformed pg_ndistinct: \"%s\"", str),
>                     errdetail("Must be valid JSON."));
>        PG_RETURN_NULL();
>     }
> result should be either JSON_SUCCESS or anything else.
>
>
>
> all these functions:
> ndistinct_object_start, ndistinct_array_start,
> ndistinct_object_field_start, ndistinct_array_element_start
> have
> ndistinctParseState *parse = state;
>
> do we need to change it to
> ndistinctParseState *parse = (ndistinctParseState *)state;
> ?
>

The compiler isn't complaining so far, but I see no harm in it.
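
For reference, a small standalone sketch of why the compiler is quiet: in C, a void * converts implicitly to any object pointer type, so both spellings compile to the same thing. The struct here is a stand-in with a made-up field, not the patch's ndistinctParseState definition, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the patch's parse state; the real struct has other fields. */
typedef struct ndistinctParseState
{
    int         depth;
} ndistinctParseState;

/* Callback-style function: the state arrives as an untyped pointer. */
static int
object_start_implicit(void *state)
{
    /* Implicit conversion from void *, valid in C (but not in C++). */
    ndistinctParseState *parse = state;

    assert(parse != NULL);
    return ++parse->depth;
}

static int
object_start_explicit(void *state)
{
    /* Same conversion with the cast written out; purely a style choice. */
    ndistinctParseState *parse = (ndistinctParseState *) state;

    assert(parse != NULL);
    return ++parse->depth;
}
```

The explicit cast only documents intent; it would matter if the file were ever compiled as C++.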