Re: Remaining dependency on setlocale()

Peter Eisentraut <peter@eisentraut.org>

From: Peter Eisentraut <peter@eisentraut.org>
To: Jeff Davis <pgsql@j-davis.com>, Thomas Munro <thomas.munro@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, pgsql-hackers@postgresql.org
Date: 2024-12-19T16:23:11Z
Lists: pgsql-hackers

Commits

  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

On 17.12.24 19:10, Jeff Davis wrote:
> On Tue, 2024-12-17 at 13:14 +0100, Peter Eisentraut wrote:
>> I think we will need to keep the global LC_CTYPE setting set to
>> something useful, for example so that system error messages come out
>> in the right encoding.
> 
> Do we need to rely on the global LC_CTYPE setting? We already use
> bind_textdomain_codeset().

I don't think that would cover messages from the C library (strerror, 
dlerror, etc.).
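
To illustrate the distinction, here is a minimal sketch (not PostgreSQL code). The helper name strerror_in_locale() is hypothetical. It assumes, as on glibc, that strerror() takes its language and encoding from the process-wide locale, while bind_textdomain_codeset() only affects the gettext domain it is called for.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the C library's message for errnum, as
 * rendered under the global locale named by locale_name, into buf.
 * Returns 0 on success, -1 if the locale cannot be saved or selected.
 * The previous global locale is restored before returning.
 *
 * Not thread-safe: setlocale() changes process-wide state.
 */
static int
strerror_in_locale(const char *locale_name, int errnum,
                   char *buf, size_t buflen)
{
    char        saved[1024];
    const char *cur = setlocale(LC_ALL, NULL);
    int         n;

    if (cur == NULL)
        return -1;

    /* setlocale() may overwrite its result buffer, so copy it first. */
    n = snprintf(saved, sizeof(saved), "%s", cur);
    if (n < 0 || (size_t) n >= sizeof(saved))
        return -1;

    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;              /* locale not available; nothing changed */

    /* The message text here follows the global locale just selected. */
    snprintf(buf, buflen, "%s", strerror(errnum));

    setlocale(LC_ALL, saved);
    return 0;
}
```

Calling strerror_in_locale("C", ENOENT, buf, sizeof(buf)) and then again with a translated locale installed on the system (if any) would show the same errno rendered differently, without any gettext domain of the caller being involved.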

>> But I'm concerned about the Perl_setlocale() dance in plperl.c.
>> Perl apparently does a setlocale(LC_ALL, "") during startup, and that
>> code is a workaround to reset everything back afterwards.  We need to
>> be careful not to break that.
>>
>> (Perl has fixed that in 5.19, but the fix requires that you set
>> another environment variable before launching Perl, which you can't
>> do in a threaded system, so we'd probably need another fix
>> eventually.  See <https://github.com/Perl/perl5/issues/8274>.)
> 
> I don't fully understand that issue, but I would think the direction we
> are going (keeping the global LC_CTYPE more consistent and relying on
> it less) would make the problem better.

Yes, I think it's the right direction, but we need to figure this issue 
out eventually.
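
For readers unfamiliar with the "dance" mentioned above, the general pattern is roughly the following. This is a simplified sketch, not the actual plperl.c code: call_preserving_locale() and hypothetical_interpreter_startup() are made-up names, and the startup function merely stands in for an embedded interpreter that calls setlocale(LC_ALL, "") while initializing.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

#define NCATS 5

/* Categories saved and restored individually. */
static const int cats[NCATS] = {
    LC_COLLATE, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME
};

typedef void (*init_fn) (void);

/*
 * Hypothetical helper: run fn(), then put the global locale categories
 * back the way they were.  Returns 0 on success, -1 if a category could
 * not be saved or restored.  Like any setlocale()-based code, this is
 * only safe while no other thread depends on the global locale.
 */
static int
call_preserving_locale(init_fn fn)
{
    char        saved[NCATS][256];
    int         rc = 0;
    int         i;

    for (i = 0; i < NCATS; i++)
    {
        const char *cur = setlocale(cats[i], NULL);
        int         n;

        if (cur == NULL)
            return -1;
        /* Copy: later setlocale() calls may clobber the returned string. */
        n = snprintf(saved[i], sizeof(saved[i]), "%s", cur);
        if (n < 0 || (size_t) n >= sizeof(saved[i]))
            return -1;
    }

    fn();                       /* may change the global locale */

    for (i = 0; i < NCATS; i++)
    {
        if (setlocale(cats[i], saved[i]) == NULL)
            rc = -1;
    }
    return rc;
}

/* Stand-in for third-party startup code that adopts the environment's locale. */
static void
hypothetical_interpreter_startup(void)
{
    setlocale(LC_ALL, "");
}
```

The weakness of this approach is visible in the sketch itself: between fn() and the restore loop the whole process runs under the wrong locale, which is harmless in a single-threaded backend but not in a threaded one.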