Re: Remaining dependency on setlocale()

From: Jeff Davis <pgsql@j-davis.com>
To: Thomas Munro <thomas.munro@gmail.com>
Cc: Peter Eisentraut <peter@eisentraut.org>, Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers@postgresql.org
Date: 2025-07-10T18:22:50Z
Lists: pgsql-hackers

Commits

  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

On Thu, 2025-07-10 at 11:53 +1200, Thomas Munro wrote:
> On Thu, Jul 10, 2025 at 10:52 AM Jeff Davis <pgsql@j-davis.com> wrote:
> > The first problem -- how to affect the encoding of strings returned
> > by strerror() on windows -- may be solvable as well. It looks like
> > LC_MESSAGES is not supported at all on windows, so the only thing to
> > be concerned about is the encoding, which is affected by LC_CTYPE.
> > But windows doesn't offer uselocale() or strerror_l(). The only way
> > seems to be to call _configthreadlocale(_ENABLE_PER_THREAD_LOCALE)
> > and then setlocale(LC_CTYPE, datctype) right before strerror(), and
> > switch it back to "C" right afterward. Comments welcome.
> 
> FWIW there is an example of that in src/port/pg_localeconv_r.c.

OK, so it seems we have a path forward here:

1. Have a global_libc_locale that represents all of the categories, and
keep it up to date with GUC changes. On Windows, that requires keeping
the textual locale names handy (e.g. copies of datcollate and
datctype), building the special locale string, and calling
_create_locale(LC_ALL, "LC_ABC=somelocale;LC_XYZ=otherlocale").

2. When there's no _l() variant of a function, like strerror_r(), wrap
the call with uselocale(). On Windows, this means using the trick above
with _configthreadlocale(_ENABLE_PER_THREAD_LOCALE).
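To make the two steps concrete, here is a rough sketch, not a patch. The
helper names build_composite_locale_name() and strerror_with_locale() are
made up for illustration and do not exist in the tree. The first only
formats the kind of string step 1 would hand to _create_locale() on
Windows (two categories shown; whether a partial category list is accepted
there is an assumption I have not verified). The second shows the POSIX
side of step 2, assuming the XSI strerror_r() that returns int. The
Windows branch using _configthreadlocale() is left out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper for step 1: format a composite locale name of the
 * form "LC_COLLATE=...;LC_CTYPE=...".  Returns 0 on success, or -1 if the
 * result does not fit in dst.
 */
static int
build_composite_locale_name(char *dst, size_t dstlen,
                            const char *collate, const char *ctype)
{
    int n = snprintf(dst, dstlen, "LC_COLLATE=%s;LC_CTYPE=%s",
                     collate, ctype);

    return (n < 0 || (size_t) n >= dstlen) ? -1 : 0;
}

/*
 * Hypothetical helper for step 2 (POSIX only): run strerror_r() with the
 * calling thread's locale temporarily switched to loc, then restore the
 * previous thread locale.  Returns 0 on success, nonzero on failure.
 */
static int
strerror_with_locale(int errnum, locale_t loc, char *buf, size_t buflen)
{
    locale_t save;
    int      rc;

    assert(buf != NULL && buflen > 0);

    save = uselocale(loc);          /* returns the previous thread locale */
    if (save == (locale_t) 0)
        return -1;                  /* could not switch locales */

    rc = strerror_r(errnum, buf, buflen);

    uselocale(save);                /* always restore before returning */
    return rc;
}
```

A caller would create loc once with newlocale() (that would be
global_libc_locale in the plan above) and reuse it, so the only per-call
cost is the pair of uselocale() switches.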

I don't have a great Windows development environment, and it appears CI
and the buildfarm don't offer great coverage either. Can I ask for a
volunteer to do the Windows side of this work?

Regards,
	Jeff Davis