Re: Remaining dependency on setlocale()
Jeff Davis <pgsql@j-davis.com> — 2025-11-21T00:58:16Z
On Wed, 2025-11-12 at 19:59 +0100, Peter Eisentraut wrote:
> Many of these issues are pre-existing, but I just figured it has
> reached a point where we need to do something about it.

I tried to simplify things in this patch series, assuming that we have some tolerance for small behavior changes.

0001: No behavior change here, same patch as before. Uncontroversial simplification, so I plan to commit this soon.

0002: change fuzzystrmatch to use ASCII semantics. As far as I can tell, this only affects the results of soundex(). Before the patch, in en_US.iso885915, soundex('réd') was 'RÉ30'; after the patch it's 'Ré30'. I'm not sure whether the current behavior is intentional or not. Other functions (daitch_mokotoff, levenshtein, and metaphone) are unaffected as far as I can tell.

0003+0005: change ltree to use case folding instead of tolower(). I believe this is a bug fix, because the current code is inconsistent between ltree_strncasecmp() and ltree_crc32_sz().

0006-0007: Remove char_tolower() API. This also removes the optimization for single-byte encodings with the libc provider and a non-C locale, but simplifies the code (the optimization is retained for the C locale).

It's possible to make the lazy-folding optimization work for all locales without the char_tolower() API by doing something similar to what 0004 does for ltree. But making this work efficiently for Generic_Text_IC_like() would be a bit more complex: we'd need to adjust MatchText() to be able to fold the arguments lazily, and perhaps introduce some kind of casemapping iterator. That's already a pretty complex function, so I'm hesitant to do that work unless the optimization is important.

These patches don't get us quite to the point of eliminating the LC_CTYPE dependency (there's still downcase_identifier() and pg_strcasecmp() to worry about, and some assorted isxyz() calls to examine), but they simplify things enough that the path forward will be easier.

Regards,
	Jeff Davis
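
For illustration only, here is a minimal C sketch of what "ASCII semantics" means for the soundex() example in 0002. It is not code from the patch series: the helper names ascii_toupper() and ascii_upper_inplace() are hypothetical, and it assumes a single-byte encoding such as ISO 8859-15, where 'é' is the byte 0xE9. Unlike toupper(), which consults LC_CTYPE, these helpers map only a-z and pass every other byte through unchanged, which is why 'é' would no longer become 'É'.

```c
/*
 * Hypothetical helper (not from the patch): upper-case only the ASCII
 * letters a-z, independent of the process's LC_CTYPE setting.
 */
static unsigned char
ascii_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return (unsigned char) (ch - 'a' + 'A');
    return ch;              /* non-ASCII bytes such as 0xE9 pass through */
}

/*
 * Hypothetical helper: upper-case a NUL-terminated byte string in place
 * using ASCII semantics only.
 */
static void
ascii_upper_inplace(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) ascii_toupper((unsigned char) *s);
}
```

Applied to the bytes of 'réd' in ISO 8859-15, ascii_upper_inplace() upper-cases 'r' and 'd' but leaves the 0xE9 byte as it is, regardless of which locale setlocale() has installed.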