Re: Remaining dependency on setlocale()

Joe Conway <mail@joeconway.com>

From: Joe Conway <mail@joeconway.com>
To: Jeff Davis <pgsql@j-davis.com>, Thomas Munro <thomas.munro@gmail.com>, Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers@postgresql.org
Date: 2024-08-07T17:28:45Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. fuzzystrmatch: use pg_ascii_toupper().

  2. Avoid global LC_CTYPE dependency in pg_locale_icu.c.

  3. downcase_identifier(): use method table from locale provider.

  4. ltree: fix case-insensitive matching.

  5. Fix multibyte issue in ltree_strncasecmp().

  6. Use multibyte-aware extraction of pattern prefixes.

  7. Add pg_iswcased().

  8. Remove char_tolower() API.

  9. Make regex "max_chr" depend on encoding, not provider.

  10. Change some callers to use pg_ascii_toupper().

  11. Allow pg_locale_t APIs to work when ctype_is_c.

  12. Add #define for UNICODE_CASEMAP_BUFSZ.

  13. Inline pg_ascii_tolower() and pg_ascii_toupper().

  14. Avoid global LC_CTYPE dependency in pg_locale_libc.c.

  15. Force LC_COLLATE to C in postmaster.

  16. Change wchar2char() and char2wchar() to accept a locale_t.

  17. Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.

  18. inet_net_pton.c: use pg_ascii_tolower() rather than tolower().

  19. isn.c: use pg_ascii_toupper() instead of toupper().

  20. contrib/spi/refint.c: use pg_ascii_tolower() instead.

  21. copyfromparse.c: use pg_ascii_tolower() rather than tolower().

  22. Revert "Tidy up locale thread safety in ECPG library."

  23. Tidy up locale thread safety in ECPG library.

  24. All supported systems have locale_t.

On 8/7/24 13:16, Jeff Davis wrote:
> On Wed, 2024-08-07 at 19:07 +1200, Thomas Munro wrote:
>> How far can we get by using more _l() functions?
> 
> There are a ton of calls to, for example, isspace(), used mostly for
> parsing.
> 
> I wouldn't expect a lot of differences in behavior from locale to
> locale, like might be the case with iswspace(), but behavior can be
> different at least in theory.
> 
> So I guess we're stuck with setlocale()/uselocale() for a while, unless
> we're able to move most of those call sites over to an ascii-only
> variant.

FWIW I see all of these in glibc:

isalnum_l, isalpha_l, isascii_l, isblank_l, iscntrl_l, isdigit_l, 
isgraph_l,  islower_l, isprint_l, ispunct_l, isspace_l, isupper_l, 
isxdigit_l


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com