Re: Remaining dependency on setlocale()

Thomas Munro <thomas.munro@gmail.com>

From: Thomas Munro <thomas.munro@gmail.com>
To: Peter Eisentraut <peter@eisentraut.org>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Jeff Davis <pgsql@j-davis.com>, pgsql-hackers@postgresql.org
Date: 2024-08-15T21:09:40Z
Lists: pgsql-hackers

Commits

  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

On Fri, Aug 16, 2024 at 1:25 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 15.08.24 00:43, Thomas Munro wrote:
> > "" is a problem however... the special value for "native environment"
> > is returned as a real locale name, which we probably still need in
> > places.  We could change that to newlocale("") + query instead, but
>
> Where do we need that in the server?

Hmm.  Yeah, right, the only way I've found so far to even reach that
code and capture its result is:

create database db2 locale = '';

That puts 'en_NZ.UTF-8' or whatever in pg_database.  In contrast,
create collation will accept '' but just store it verbatim, and the
GUCs for the time, monetary and numeric locales accept it too and keep
it verbatim.  We could simply ban '' in all user commands.  I doubt
verbatim.  We could simply ban '' in all user commands.  I doubt
they're documented as acceptable values, once you get past initdb and
have a running system.  Looking into that...
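
To make the difference concrete, here is a rough standalone sketch of
the two behaviours, not actual PostgreSQL code.  Both function names
are made up for illustration.  The first resolves '' to a concrete
name by calling setlocale() and then restores the previous setting,
which is roughly the "captures that result" case; the second is what
a blanket ban on '' in user commands might look like.  It assumes
locale names fit in 256 bytes.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): resolve the special
 * locale name "" to the concrete name the C library derives from the
 * environment, then restore the previous setting.
 * Returns 0 on success, -1 on failure.
 */
static int
resolve_native_locale(int category, char *buf, size_t buflen)
{
    char        saved[256];     /* assumed large enough for a locale name */
    const char *cur;
    const char *name;
    int         rc = -1;

    assert(buf != NULL && buflen > 0);

    /* Copy the current setting; later setlocale() calls may overwrite it. */
    cur = setlocale(category, NULL);
    if (cur == NULL || strlen(cur) >= sizeof(saved))
        return -1;
    strcpy(saved, cur);

    /* "" asks for the native environment; the result is a concrete name. */
    name = setlocale(category, "");
    if (name != NULL && strlen(name) < buflen)
    {
        strcpy(buf, name);
        rc = 0;
    }

    /* Put the process-global locale back the way we found it. */
    setlocale(category, saved);
    return rc;
}

/* Hypothetical check for user-facing commands: refuse "" outright. */
static int
locale_name_is_acceptable(const char *name)
{
    return name != NULL && name[0] != '\0';
}
```

Note that the first function has to modify and restore process-global
state just to learn a name, which is the kind of setlocale()
dependency this thread is trying to get rid of.  With the ban in
place, only initdb would ever need to do that.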

> It should just be initdb doing that and then initializing the server
> with concrete values based on that.

Right.

> I guess technically some of these GUC settings default to the
> environment?  But I think we could consider getting rid of that.

Yeah.