Re: Remaining dependency on setlocale()
Chao Li <li.evan.chao@gmail.com>
Commits
- fuzzystrmatch: use pg_ascii_toupper().
  - b96a9fd76f32 19 (unreleased) landed
- Avoid global LC_CTYPE dependency in pg_locale_icu.c.
  - 0a90df58cf38 19 (unreleased) landed
- downcase_identifier(): use method table from locale provider.
  - 87b2968df0f8 19 (unreleased) landed
- ltree: fix case-insensitive matching.
  - 806555e3000d 18.2 landed
  - 7f007e4a044a 19 (unreleased) landed
- Fix multibyte issue in ltree_strncasecmp().
  - 898991966bc9 14.21 landed
  - 335b2f30b468 15.16 landed
  - b80227c0a54c 16.12 landed
  - b8cfe9dc2e7f 17.8 landed
  - f79e239e0bc6 18.2 landed
  - 84d5efa7e3eb 19 (unreleased) landed
- Use multibyte-aware extraction of pattern prefixes.
  - 9c8de1596912 19 (unreleased) landed
- Add pg_iswcased().
  - 630706ced04e 19 (unreleased) landed
- Remove char_tolower() API.
  - 1e493158d3d2 19 (unreleased) landed
- Make regex "max_chr" depend on encoding, not provider.
  - 19b966243c38 19 (unreleased) landed
- Change some callers to use pg_ascii_toupper().
  - 99cd8890beca 19 (unreleased) landed
- Allow pg_locale_t APIs to work when ctype_is_c.
  - 147602822597 19 (unreleased) landed
- Add #define for UNICODE_CASEMAP_BUFSZ.
  - 8d299052fe58 19 (unreleased) landed
- Inline pg_ascii_tolower() and pg_ascii_toupper().
  - ec4997a9d733 19 (unreleased) landed
- Avoid global LC_CTYPE dependency in pg_locale_libc.c.
  - f81bf78ce12b 19 (unreleased) landed
- Force LC_COLLATE to C in postmaster.
  - 5e6e42e44fe1 19 (unreleased) landed
- Change wchar2char() and char2wchar() to accept a locale_t.
  - 53cd0b71ee2e 19 (unreleased) landed
- Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.
  - d81dcc8d6243 19 (unreleased) landed
- inet_net_pton.c: use pg_ascii_tolower() rather than tolower().
  - 8898082a5d3e 18.0 landed
- isn.c: use pg_ascii_toupper() instead of toupper().
  - 7a6880fadc17 18.0 landed
- contrib/spi/refint.c: use pg_ascii_tolower() instead.
  - 78bd364ee39c 18.0 landed
- copyfromparse.c: use pg_ascii_tolower() rather than tolower().
  - 4c787a24e7e2 18.0 landed
- Revert "Tidy up locale thread safety in ECPG library."
  - 3c8e463b0d88 18.0 cited
- Tidy up locale thread safety in ECPG library.
  - 8e993bff5326 18.0 cited
- All supported systems have locale_t.
  - 8d9a9f034e92 17.0 cited
> On Nov 26, 2025, at 09:50, Chao Li <li.evan.chao@gmail.com> wrote:
>
> I will review the rest 3 commits tomorrow.
10 - 0009
```
{
if (isalpha((unsigned char) c))
{
- c = toupper((unsigned char) c);
+ c = pg_ascii_toupper((unsigned char) c);
```
Just curious: isalpha() and toupper() both come from the same header, ctype.h. If we replace toupper() with pg_ascii_toupper(), does isalpha() also need to be handled?
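To make the question concrete: isalpha(), like toupper(), consults the process-wide LC_CTYPE setting, so pairing it with an ASCII-only case conversion leaves the test locale-dependent while the conversion is not. Below is a minimal sketch of a fully ASCII-only pairing. The names ascii_isalpha(), ascii_toupper() and fold_letter() are hypothetical and defined here only for illustration; ascii_toupper() is a local stand-in for what pg_ascii_toupper() is understood to do, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true only for ASCII letters, regardless of LC_CTYPE. */
static inline bool
ascii_isalpha(unsigned char ch)
{
	return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

/*
 * Local stand-in for pg_ascii_toupper() (an assumption about its behavior):
 * maps 'a'..'z' to 'A'..'Z' and leaves every other byte unchanged.
 */
static inline unsigned char
ascii_toupper(unsigned char ch)
{
	if (ch >= 'a' && ch <= 'z')
		ch += 'A' - 'a';
	return ch;
}

/* Upper-case a byte only when it is an ASCII letter. */
static unsigned char
fold_letter(unsigned char c)
{
	if (ascii_isalpha(c))
	{
		c = ascii_toupper(c);
		/* After folding, an ASCII letter must be in 'A'..'Z'. */
		assert(c >= 'A' && c <= 'Z');
	}
	return c;
}
```

With isalpha() in a single-byte locale, a high-bit byte could pass the letter test and then be left unchanged by the ASCII-only conversion. Whether that mismatch matters depends on what the surrounding code does with such bytes.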
11 - 0010
```
- for (i = 0; i < len; i++)
- {
- unsigned char ch = (unsigned char) ident[i];
+ dstsize = len + 1;
+ result = palloc(dstsize);
- if (ch >= 'A' && ch <= 'Z')
- ch += 'a' - 'A';
- else if (enc_is_single_byte && IS_HIGHBIT_SET(ch) && isupper(ch))
- ch = tolower(ch);
- result[i] = (char) ch;
- }
- result[i] = '\0';
+ needed = pg_strfold_ident(result, dstsize, ident, len);
+ Assert(needed + 1 == dstsize);
+ Assert(needed == len);
```
I think asserting against both dstsize and len is redundant: dstsize = len + 1, and neither value changes in between, so one of the two asserts is enough.
12 - 0010
```
+/*
+ * Fold an identifier using the database default locale.
+ *
+ * For historical reasons, does not use ordinary locale behavior. Should only
+ * be used for identifier folding. XXX: can we make this equivalent to
+ * pg_strfold(..., default_locale)?
+ */
+size_t
+pg_strfold_ident(char *dest, size_t destsize, const char *src, ssize_t srclen)
+{
+ if (default_locale == NULL || default_locale->ctype == NULL)
+ {
+ int i;
+
+ for (i = 0; i < srclen && i < destsize; i++)
+ {
+ unsigned char ch = (unsigned char) src[i];
+
+ if (ch >= 'A' && ch <= 'Z')
+ ch += 'a' - 'A';
+ dest[i] = (char) ch;
+ }
+
+ if (i < destsize)
+ dest[i] = '\0';
+
+ return srclen;
+ }
+ return default_locale->ctype->strfold_ident(dest, destsize, src, srclen,
+ default_locale);
+}
```
Given that default_locale can be NULL only at certain specific moments, can we do something like:
    locale = default_locale;
    if (locale == NULL || locale->ctype == NULL)
        locale = libc or other fallback;
    return locale->ctype->strfold_ident(dest, destsize, src, srclen, locale);
This would avoid the duplicated code.
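A self-contained sketch of that idea, under stated assumptions: all types and names here (locale_sketch, ctype_methods_sketch, ascii_fallback_locale, default_locale_sketch, strfold_ident_sketch) are hypothetical stand-ins rather than the real pg_locale_t definitions, srclen is simplified to size_t, and the fallback is an ASCII-only method table that mirrors the loop in the patch rather than an actual libc provider.

```c
#include <assert.h>
#include <stddef.h>

typedef struct locale_sketch locale_sketch;

/* Hypothetical method table, loosely modeled on the ctype methods in the patch. */
typedef struct ctype_methods_sketch
{
	size_t		(*strfold_ident) (char *dest, size_t destsize,
								  const char *src, size_t srclen,
								  const locale_sketch *locale);
} ctype_methods_sketch;

struct locale_sketch
{
	const ctype_methods_sketch *ctype;
};

/* ASCII-only identifier folding: the logic the patch currently inlines. */
static size_t
ascii_strfold_ident(char *dest, size_t destsize,
					const char *src, size_t srclen,
					const locale_sketch *locale)
{
	size_t		i;

	(void) locale;				/* unused by the ASCII fallback */

	for (i = 0; i < srclen && i < destsize; i++)
	{
		unsigned char ch = (unsigned char) src[i];

		if (ch >= 'A' && ch <= 'Z')
			ch += 'a' - 'A';
		dest[i] = (char) ch;
	}
	if (i < destsize)
		dest[i] = '\0';

	return srclen;
}

/* Static fallback used while no default locale is available. */
static const ctype_methods_sketch ascii_ctype_methods = {ascii_strfold_ident};
static const locale_sketch ascii_fallback_locale = {&ascii_ctype_methods};

/* Stand-in for the global default locale; NULL until initialized. */
static const locale_sketch *default_locale_sketch = NULL;

/* Single dispatch path: choose the locale first, then always call the method. */
static size_t
strfold_ident_sketch(char *dest, size_t destsize,
					 const char *src, size_t srclen)
{
	const locale_sketch *locale = default_locale_sketch;

	if (locale == NULL || locale->ctype == NULL)
		locale = &ascii_fallback_locale;

	assert(locale->ctype->strfold_ident != NULL);
	return locale->ctype->strfold_ident(dest, destsize, src, srclen, locale);
}
```

The folding loop then lives in one place, as a method of the fallback table, and the entry point only selects which locale to dispatch through.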
13 - 0011
```
+{ name => 'lc_collate', type => 'string', context => 'PGC_SUSET', group => 'CLIENT_CONN_LOCALE',
+ short_desc => 'Sets the locale for text ordering in extensions.',
```
I feel the GUC name is very misleading. Without carefully reading the docs, users may easily mistake lc_collate for the system’s locale. If it only affects extensions, then let’s name it accordingly, for example “extension_lc_collate” or “legacy_lc_collate”.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/