Re: Remaining dependency on setlocale()
Thomas Munro <thomas.munro@gmail.com>
Commits
- fuzzystrmatch: use pg_ascii_toupper().
  - b96a9fd76f32 19 (unreleased) landed
- Avoid global LC_CTYPE dependency in pg_locale_icu.c.
  - 0a90df58cf38 19 (unreleased) landed
- downcase_identifier(): use method table from locale provider.
  - 87b2968df0f8 19 (unreleased) landed
- ltree: fix case-insensitive matching.
  - 806555e3000d 18.2 landed
  - 7f007e4a044a 19 (unreleased) landed
- Fix multibyte issue in ltree_strncasecmp().
  - 898991966bc9 14.21 landed
  - 335b2f30b468 15.16 landed
  - b80227c0a54c 16.12 landed
  - b8cfe9dc2e7f 17.8 landed
  - f79e239e0bc6 18.2 landed
  - 84d5efa7e3eb 19 (unreleased) landed
- Use multibyte-aware extraction of pattern prefixes.
  - 9c8de1596912 19 (unreleased) landed
- Add pg_iswcased().
  - 630706ced04e 19 (unreleased) landed
- Remove char_tolower() API.
  - 1e493158d3d2 19 (unreleased) landed
- Make regex "max_chr" depend on encoding, not provider.
  - 19b966243c38 19 (unreleased) landed
- Change some callers to use pg_ascii_toupper().
  - 99cd8890beca 19 (unreleased) landed
- Allow pg_locale_t APIs to work when ctype_is_c.
  - 147602822597 19 (unreleased) landed
- Add #define for UNICODE_CASEMAP_BUFSZ.
  - 8d299052fe58 19 (unreleased) landed
- Inline pg_ascii_tolower() and pg_ascii_toupper().
  - ec4997a9d733 19 (unreleased) landed
- Avoid global LC_CTYPE dependency in pg_locale_libc.c.
  - f81bf78ce12b 19 (unreleased) landed
- Force LC_COLLATE to C in postmaster.
  - 5e6e42e44fe1 19 (unreleased) landed
- Change wchar2char() and char2wchar() to accept a locale_t.
  - 53cd0b71ee2e 19 (unreleased) landed
- Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.
  - d81dcc8d6243 19 (unreleased) landed
- inet_net_pton.c: use pg_ascii_tolower() rather than tolower().
  - 8898082a5d3e 18.0 landed
- isn.c: use pg_ascii_toupper() instead of toupper().
  - 7a6880fadc17 18.0 landed
- contrib/spi/refint.c: use pg_ascii_tolower() instead.
  - 78bd364ee39c 18.0 landed
- copyfromparse.c: use pg_ascii_tolower() rather than tolower().
  - 4c787a24e7e2 18.0 landed
- Revert "Tidy up locale thread safety in ECPG library."
  - 3c8e463b0d88 18.0 cited
- Tidy up locale thread safety in ECPG library.
  - 8e993bff5326 18.0 cited
- All supported systems have locale_t.
  - 8d9a9f034e92 17.0 cited
On Mon, Aug 12, 2024 at 3:24 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> 1. The nl_langinfo() call in pg_get_encoding_from_locale(), can
> probably be changed to nl_langinfo_l() (it is everywhere we currently
> care about except Windows, which has a different already-thread-safe
> alternative ...

... though if we wanted to replace all use of localeconv() and struct lconv with nl_langinfo_l() calls, it's not totally obvious how to do that on Windows. Its closest thing is GetLocaleInfoEx(), but that has complications: it takes wchar_t locale names, which we don't even have and can't access when we only have a locale_t, it must look them up in some data structure every time, and it copies data out to the caller as wchar_t, so now you have two conversion problems and a storage problem.

If I understand correctly, the whole point of nl_langinfo_l(item, loc) is that it is supposed to be fast: it's really just an array lookup, item is just an index, and the result is supposed to be stable as long as loc hasn't been freed (and the thread hasn't exited). So you can use it without putting your own caching in front of it.

One idea I came up with, which I haven't tried and which might turn out to be terrible, is that we could change our definition of locale_t on Windows. Currently it's a typedef to Windows' _locale_t, and we use it with a bunch of _XXX functions that we access by macro to remove the underscore. Instead, we could make locale_t a pointer to a struct of our own design in WIN32 builds, holding the native _locale_t and also an array full of all the values that nl_langinfo_l() can return. We'd provide the standard enums, indexes into that array, in a fake POSIX-oid header <langinfo.h>. Then nl_langinfo_l(item, loc) could be implemented as loc->private_langinfo[item], and strcoll_l(..., loc) could be a static inline function that does _strcoll_l(..., loc->private_windows_locale_t).

These structs would be allocated and freed with standard-looking newlocale() and freelocale(), so we could finally stop using #ifdef WIN32-wrapped _create_locale() directly. Then everything would look more POSIX-y, nl_langinfo_l() could be used directly wherever we need fast access to that info, and we could, I think, banish the awkward localeconv(), right?

I don't know if this all makes total sense and haven't tried it, just spitballing here...
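
To make the shape of that idea a bit more concrete, here is a rough, untested-on-Windows sketch. Everything prefixed with sketch_ or SKETCH_ is a hypothetical name invented for illustration so that it doesn't clash with a platform's real locale_t or nl_langinfo_l(); only the two member names come from the description above. The langinfo values are hard-coded placeholders, since how to fill them in on Windows is exactly the open question, and the non-Windows branch exists only so the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <locale.h>
typedef _locale_t sketch_native_locale;   /* the real Windows handle */
#else
typedef void *sketch_native_locale;       /* placeholder so this builds elsewhere */
#endif

/* Hypothetical stand-ins for the items a fake <langinfo.h> would define. */
enum sketch_nl_item
{
    SKETCH_CODESET,
    SKETCH_RADIXCHAR,
    SKETCH_THOUSEP,
    SKETCH_CRNCYSTR,
    SKETCH_LANGINFO_COUNT
};

/* What locale_t might point to in WIN32 builds under this idea. */
typedef struct sketch_locale
{
    sketch_native_locale private_windows_locale_t;
    const char *private_langinfo[SKETCH_LANGINFO_COUNT];
} *sketch_locale_t;

/* newlocale() look-alike: allocate the wrapper and fill the array once. */
static inline sketch_locale_t
sketch_newlocale(const char *name)
{
    sketch_locale_t loc = calloc(1, sizeof(*loc));

    if (loc == NULL)
        return NULL;
#ifdef _WIN32
    loc->private_windows_locale_t = _create_locale(LC_ALL, name);
    if (loc->private_windows_locale_t == NULL)
    {
        free(loc);
        return NULL;
    }
#else
    (void) name;
#endif
    /* Placeholder values only; a real version would query the OS here. */
    loc->private_langinfo[SKETCH_CODESET] = "US-ASCII";
    loc->private_langinfo[SKETCH_RADIXCHAR] = ".";
    loc->private_langinfo[SKETCH_THOUSEP] = "";
    loc->private_langinfo[SKETCH_CRNCYSTR] = "";
    return loc;
}

/* freelocale() look-alike: release the native handle, then the wrapper. */
static inline void
sketch_freelocale(sketch_locale_t loc)
{
#ifdef _WIN32
    _free_locale(loc->private_windows_locale_t);
#endif
    free(loc);
}

/* nl_langinfo_l() look-alike: a plain array lookup, valid until freed. */
static inline const char *
sketch_nl_langinfo_l(enum sketch_nl_item item, sketch_locale_t loc)
{
    assert(item >= 0 && item < SKETCH_LANGINFO_COUNT);
    return loc->private_langinfo[item];
}

/* strcoll_l() look-alike: forward to the native function on Windows. */
static inline int
sketch_strcoll_l(const char *a, const char *b, sketch_locale_t loc)
{
#ifdef _WIN32
    return _strcoll_l(a, b, loc->private_windows_locale_t);
#else
    (void) loc;
    return strcmp(a, b);    /* placeholder: plain byte-order comparison */
#endif
}
```

A caller would do sketch_newlocale("C"), read sketch_nl_langinfo_l(SKETCH_RADIXCHAR, loc) as often as it likes without copying or caching the result, and finish with sketch_freelocale(loc). The cost of the wchar_t lookups and conversions would be paid once at allocation time rather than on every access.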