Re: Remaining dependency on setlocale()

From: "Daniel Verite" <daniel@manitou-mail.org>
To: "Jeff Davis" <pgsql@j-davis.com>
Cc: Thomas Munro <thomas.munro@gmail.com>, Peter Eisentraut <peter@eisentraut.org>, Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers@postgresql.org
Date: 2025-11-03T19:14:03Z
Lists: pgsql-hackers

Commits

  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

Jeff Davis wrote:

> > > Extensions often need to be updated for a new major version.
> > 
> > I think forcing the C locale is not comparable to API changes,
> > and the consequences are not even necessarily fixable for extensions.
> 
> Are we in agreement that it's fine for C extensions?

No, I think we should put the database's lc_ctype
into LC_CTYPE and the database's lc_collate into
LC_COLLATE, independently of anything else,
as was done until commit 5e6e42e.
I believe that's the purpose of these database
properties, whether the provider is libc, ICU, or builtin.
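
To make the proposal concrete, here is a minimal sketch of setting the
two categories independently. The function name and its parameters are
hypothetical and this is not PostgreSQL's actual startup code; it only
illustrates that LC_COLLATE and LC_CTYPE are separate setlocale()
categories that can each follow the corresponding database property.

```c
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not PostgreSQL code: apply a database's
 * lc_collate and lc_ctype to the process-wide libc locale, one
 * category at a time. Returns false if either name is rejected.
 */
static bool
apply_database_locale(const char *db_lc_collate, const char *db_lc_ctype)
{
    /* setlocale() returns NULL when the requested locale is unavailable */
    if (setlocale(LC_COLLATE, db_lc_collate) == NULL)
        return false;
    if (setlocale(LC_CTYPE, db_lc_ctype) == NULL)
        return false;
    return true;
}
```

Code that later calls strcoll() or toupper(), whether in a C extension
or inside a PL interpreter, would then see these settings.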

Forcing "C" is a disruptive change that, IMO, is
not offset by advantages substantial enough
to justify the disruption.


> > CREATE FUNCTION lt_test(text,text) RETURNS boolean as $$
> >  use locale; return ($_[0] lt $_[1])?1:0;
> > $$ LANGUAGE plperlu;
> > 
> > select lt_test('a', 'B');
> 
> Are you aware of PL code that does things like that? If the database
> locale is ICU, that would be at least a little bit confusing.

plperl users writing "use locale" should understand that
it refers to the libc locale, just as when this code is run outside Postgres.
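
As a rough C analogue of the lt_test() example above, the sketch below
compares two strings with strcoll() under a named LC_COLLATE setting.
The helper name is hypothetical and this is not PostgreSQL or Perl
code. It only shows that the answer depends on the process's libc
locale: with LC_COLLATE forced to "C", strcoll() orders by byte value,
so 'a' does not sort before 'B'.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a sorts before b under the named
 * LC_COLLATE locale, 0 if it does not, and -1 if the locale cannot be
 * set or memory is exhausted. The previous setting is restored.
 */
static int
lt_under_locale(const char *locale_name, const char *a, const char *b)
{
    const char *cur = setlocale(LC_COLLATE, NULL);
    char       *saved;
    int         result;

    if (cur == NULL)
        return -1;

    /* copy the current name; later setlocale() calls may overwrite it */
    saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
    {
        free(saved);
        return -1;
    }

    result = strcoll(a, b) < 0 ? 1 : 0;

    /* restore whatever was in effect before the call */
    setlocale(LC_COLLATE, saved);
    free(saved);
    return result;
}
```

Calling lt_under_locale("C", "a", "B") yields 0. With a
language-specific locale the result depends on that locale's collation
rules and on whether the locale is installed at all.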


Best regards,
-- 
Daniel Vérité 
https://postgresql.verite.pro/