Re: Remaining dependency on setlocale()

From: Peter Eisentraut <peter@eisentraut.org>
To: Daniel Verite <daniel@manitou-mail.org>, Jeff Davis <pgsql@j-davis.com>
Cc: Thomas Munro <thomas.munro@gmail.com>, Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers@postgresql.org
Date: 2025-11-12T18:41:58Z
Lists: pgsql-hackers

Commits

  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

On 03.11.25 20:14, Daniel Verite wrote:
> No, I think we should put the database's lc_ctype
> into LC_CTYPE and the database's lc_collate into
> LC_COLLATE, independently of anything else,
> like it was done until commit 5e6e42e.
> I believe that's the purpose of these database
> properties, whether the provider is libc or ICU or builtin.
> 
> Forcing "C" is a disruptive change, that IMO does
> not seem compensated by substantial advantages
> that would justify the disruption.

From my perspective, the difference between LC_COLLATE and LC_CTYPE is
that LC_COLLATE has a quite limited impact area.  Either your code uses
strcoll() (or strxfrm()) or it does not.  And if it does, you can find
all the places and adjust them, and it probably won't be that many
places.  The impact area of LC_CTYPE is much larger and more complicated
and possibly interacts with other settings and third-party libraries in
ways that we don't understand yet and might not be able to change.
That's why I'm more hesitant about it.  But I don't see any reason to
keep LC_COLLATE set going forward.
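
To illustrate the narrow reach of LC_COLLATE, here is a minimal sketch in
plain C. It is not taken from the PostgreSQL tree, and the helper names
(sign, compare_with_c_collate, compare_bytes) are hypothetical. It assumes
a POSIX system, where strcoll() under the "C" locale orders strings by
byte value, so the result should match strcmp(). Only LC_COLLATE is
changed; LC_CTYPE is left alone.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Reduce a comparison result to -1, 0, or 1. */
static int
sign(int x)
{
    return (x > 0) - (x < 0);
}

/*
 * Hypothetical helper: compare two strings with strcoll() while the
 * process-wide LC_COLLATE category is "C".  No other locale category
 * is touched.
 */
static int
compare_with_c_collate(const char *a, const char *b)
{
    const char *res = setlocale(LC_COLLATE, "C");

    /* The "C" locale is always available, so this should never fail. */
    assert(res != NULL);
    (void) res;

    return sign(strcoll(a, b));
}

/* Plain byte-order comparison, independent of any locale setting. */
static int
compare_bytes(const char *a, const char *b)
{
    return sign(strcmp(a, b));
}
```

Under these assumptions, compare_with_c_collate("a", "B") and
compare_bytes("a", "B") both report that "a" sorts after "B", because
0x61 is greater than 0x42. Code that never calls strcoll() or strxfrm()
would not notice the LC_COLLATE setting at all.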