Thread

Re: Remaining dependency on setlocale()

Jeff Davis <pgsql@j-davis.com> — 2025-10-29T00:19:50Z
On Wed, 2025-07-23 at 19:11 -0700, Jeff Davis wrote:
> On Fri, 2025-07-11 at 11:48 +1200, Thomas Munro wrote:
> > On Fri, Jul 11, 2025 at 6:22 AM Jeff Davis <pgsql@j-davis.com>
> > wrote:
> > > I don't have a great windows development environment, and it
> > > appears CI
> > > and the buildfarm don't offer great coverage either. Can I ask
> > > for
> > > a
> > > volunteer to do the windows side of this work?
> > 
> > Me neither but I'm willing to help with that, and have done lots of
> > closely related things through trial-by-CI...

Attached a new patch series, v6.

Rather than creating new global locale_t objects, this series (along
with a separate patch for NLS[1]) removes the dependency on the global
LC_CTYPE entirely. It's a bunch of small patches that replace direct
calls to tolower()/toupper() with calls into the provider.

An assumption of these patches is that, in the UTF-8 encoding, the
logic in pg_tolower()/pg_toupper() is equivalent to
pg_ascii_tolower()/pg_ascii_toupper().

Generally these preserve existing behavior, but there are a couple
differences:

  * If using the builtin C locale (not C.UTF-8) along with a datctype
that's a non-C locale with single-byte encoding, it could affect the
results of downcase_identifier(), ltree, and fuzzystrmatch on
characters > 127. For ICU, I went to a bit of extra effort to preserve
the existing behavior here, because it's more likely to be used for
single-byte encodings.

  * When using ICU or builtin C.UTF-8, along with a datctype of
"tr_TR.UTF-8", then it will affect ltree's and fuzzystrmatch's
treatment of i/I.

If these are a concern we can fix them with some hacks, but those
behaviors seem fairly obscure to me.

Regards,
	Jeff Davis

[1]
https://www.postgresql.org/message-id/90f176c5b85b9da26a3265b2630ece3552068566.camel@j-davis.com