Thread

  1. Re: Remaining dependency on setlocale()

    Jeff Davis <pgsql@j-davis.com> — 2025-10-30T17:17:03Z

    On Tue, 2025-10-28 at 17:19 -0700, Jeff Davis wrote:
    > Attached a new patch series, v6.
    
    I'm eager to start committing this series so that we have plenty of
    time to sort out any problems. I welcome feedback before or after
    commit, and I can revert if necessary.
    
    The goal here is to do a permanent:
    
       setlocale(LC_CTYPE, "C")
    
    in the postmaster, and instead use _l() variants where necessary.
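
As a rough illustration of that pattern (not code from the patch series), the sketch below pins the process-wide LC_CTYPE to "C" and does case mapping only through an explicit locale_t. It assumes a POSIX.1-2008 libc such as glibc; the helper names pin_global_ctype_to_c() and upper_with_locale() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: force the global LC_CTYPE to "C" once at startup.
 * Returns 0 on success, -1 if setlocale() rejects the request.
 */
static int
pin_global_ctype_to_c(void)
{
    return setlocale(LC_CTYPE, "C") != NULL ? 0 : -1;
}

/*
 * Hypothetical helper: upper-case one byte using an explicit locale object
 * rather than whatever the global locale happens to be.
 */
static int
upper_with_locale(int ch, locale_t loc)
{
    assert(loc != (locale_t) 0);
    return toupper_l((unsigned char) ch, loc);
}
```

A caller would create the locale_t with newlocale(LC_CTYPE_MASK, name, (locale_t) 0) and release it with freelocale(). The point of the design is that no call site depends on the global setting.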
    
    Forcing the global LC_CTYPE to C will avoid platform-specific nuances
    spread throughout the code, and prevent new code from accidentally
    depending on platform-specific libc behavior. Instead, libc ctype
    behavior will only happen through a pg_locale_t object.
    
    It also takes us a step closer to thread safety.
    
    LC_COLLATE was already permanently set to "C" (5e6e42e4), and most of
    LC_CTYPE behavior already uses a pg_locale_t object. This series is
    about removing the last few places that rely on raw calls to
    tolower()/toupper() (sometimes through pg_tolower()). Where there isn't
    a pg_locale_t immediately available it uses the database default locale
    (which might or might not be libc).
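
The fallback described above might look roughly like the following. This is a simplified model under stated assumptions, not PostgreSQL's actual structures: demo_locale, demo_default_locale and demo_tolower() are hypothetical stand-ins, and the non-libc branch folds ASCII only to keep the sketch short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical stand-in for a pg_locale_t-like object. */
typedef struct demo_locale
{
    int      is_libc;   /* nonzero: delegate to libc through lt */
    locale_t lt;        /* only meaningful when is_libc is nonzero */
} demo_locale;

/* Hypothetical database default locale, set once at backend startup. */
static demo_locale *demo_default_locale = NULL;

/*
 * Lower-case one byte. A NULL locale means "none immediately available",
 * in which case the database default is used instead of the global
 * libc locale.
 */
static int
demo_tolower(int ch, const demo_locale *loc)
{
    if (loc == NULL)
        loc = demo_default_locale;
    assert(loc != NULL);        /* a default must have been installed */

    if (loc->is_libc)
        return tolower_l((unsigned char) ch, loc->lt);

    /* Non-libc provider in this sketch: ASCII-only folding. */
    if (ch >= 'A' && ch <= 'Z')
        return ch + ('a' - 'A');
    return ch;
}
```

Either way, the call never consults the global LC_CTYPE, so the result does not change when that is pinned to "C".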
    
    There's another thread for what to do about strerror_r[1], which
    depends on LC_CTYPE for the encoding:
    
    [1] https://www.postgresql.org/message-id/90f176c5b85b9da26a3265b2630ece3552068566.camel@j-davis.com
    
    pg_localeconv_r() does depend on LC_CTYPE for the encoding, but it
    already sets it from lc_monetary and lc_numeric, without using datctype
    or the global setting. Then PGLC_localeconv() converts to the database
    encoding, if necessary. So it's an exception to the rule that all ctype
    behavior goes through a pg_locale_t, but it's not a problem. (Aside: we
    could consider this approach as a narrower fix for strerror_r(), as
    well.)
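
To make the idea concrete, here is a minimal sketch of reading a locale's decimal point with LC_CTYPE taken from the same locale name rather than from the global setting. It is not the pg_localeconv_r() implementation: demo_decimal_point() is a hypothetical helper, it assumes a POSIX.1-2008 libc where localeconv() honors the thread locale installed by uselocale(), and it leaves out the later conversion to the database encoding.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the decimal point of locale "name" into buf.
 * LC_CTYPE is set from the same name, so the returned bytes are in that
 * locale's own encoding regardless of the global LC_CTYPE.
 * Returns 0 on success, -1 on failure.
 */
static int
demo_decimal_point(const char *name, char *buf, size_t buflen)
{
    locale_t      loc;
    locale_t      old;
    struct lconv *lc;
    int           rc = -1;

    assert(buf != NULL && buflen > 0);

    loc = newlocale(LC_NUMERIC_MASK | LC_CTYPE_MASK, name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    old = uselocale(loc);       /* affects only the calling thread */
    lc = localeconv();
    if (strlen(lc->decimal_point) < buflen)
    {
        strcpy(buf, lc->decimal_point);
        rc = 0;
    }
    uselocale(old);             /* restore the previous thread locale */
    freelocale(loc);
    return rc;
}
```

The global locale is never modified here, which is why this style of lookup can coexist with a permanent setlocale(LC_CTYPE, "C").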
    
    There may also be a loose end around plperl, but I'm not sure whether
    this series makes it any worse.
    
    Some other LC_* settings still rely on setlocale(), which can be
    considered separately unless there's some interaction that I missed.
    
    Note that the datcollate and datctype fields are already mostly
    irrelevant for non-libc providers. We could set those to NULL, but for
    now I don't intend to do that.
    
    Regards,
    	Jeff Davis