[PATCH] Fix severe performance regression with gettext 0.20+ on Windows
Bryan Green <dbryan.green@gmail.com> — 2025-12-10T00:45:52Z
Hello hackers,

I've been investigating a performance issue on Windows with recent gettext versions (0.20.1 and later) that causes exception-heavy workloads to run significantly slower than with gettext 0.19.8.

Starting with gettext 0.20.1, the library changed its Windows locale handling in a way that conflicts with how PostgreSQL sets LC_MESSAGES. The performance regression manifests when raising many exceptions:

- gettext 0.19.8: ~32 seconds for 1M exceptions
- gettext 0.20.1+: ~180 seconds for 1M exceptions
- gettext 0.2x.y+: ~39 seconds for 1M exceptions

The root cause is a combination of three issues:

1. Locale format mismatch

gettext 0.20.1+ introduced a get_lcid() function that expects Windows locale format ("English_United States.1252") rather than POSIX format ("en_US"). This function enumerates all Windows locales (~259) until a match is found, then uses the resulting LCID to determine the catalog path.

PostgreSQL, however, has always used IsoLocaleName() to convert Windows locales to POSIX format before setting LC_MESSAGES. This means we're passing "en_US" to a function expecting "English_United States.1252". The enumeration doesn't find "en_US" among Windows locale names, returns 0, and gettext falls back to its internal locale resolution (which still works correctly - translations are not broken, just slow).

2. Missing cache on failure

The get_lcid() function has a cache, but it only updates the cache when found_lcid > 0 (successful lookup). Failed lookups don't update the cache, causing the 259-locale enumeration to repeat on every gettext() call.

This is the actual performance bug in gettext - even if we passed a valid Windows locale format, setting lc_messages to 'C' or 'POSIX' (common in scripts and automation) would trigger the same issue since these aren't Windows locale names. Please see the bug I opened with the gettext project [1].

3. Empty string bug in early 0.2x.y

gettext 0.20.1 introduced a setlocale_null() wrapper that returns "" instead of NULL when setlocale() fails. This causes get_lcid("") to be called, triggering the enumeration bug even when LC_MESSAGES is unset.

The attached patch takes a pragmatic approach: for gettext 0.20.1+, we avoid triggering the bug by using Windows locale format instead of calling IsoLocaleName(). This works because gettext 0.20.1+ internally converts the Windows format back to POSIX for catalog lookups, whereas 0.19.8 and earlier need POSIX format directly.

The patch uses LIBINTL_VERSION to detect the gettext version at compile time and adjusts behavior accordingly. When locale is NULL, empty, or set to 'C'/'POSIX', we fall back to using the LC_CTYPE value (which is already in Windows format and always set).

For gettext 0.19.8 and earlier, the existing IsoLocaleName() path is retained to maintain compatibility.

I don't have automated tests for this since we'd need to test against multiple versions of a third-party library. I'm open to suggestions if folks think we should add something to the buildfarm or CI.
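
As a rough illustration of the version-dependent selection described above (this is not the code from the attached patch), the decision could be sketched as below. The names choose_lc_messages(), needs_ctype_fallback() and GETTEXT_0_20_1 are hypothetical, and the version is passed in as a parameter so the logic can be exercised without any particular libintl installed. The only assumption about gettext itself is that LIBINTL_VERSION is encoded as 0xMMmmpp, so 0.20.1 corresponds to 0x001401.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed encoding of LIBINTL_VERSION: (major << 16) + (minor << 8) + subminor. */
#define GETTEXT_0_20_1 0x001401

/*
 * Hypothetical helper: returns 1 when the requested locale is NULL, empty,
 * "C" or "POSIX", i.e. a value that is not a Windows locale name.
 */
static int
needs_ctype_fallback(const char *locale)
{
    return locale == NULL || locale[0] == '\0' ||
           strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
}

/*
 * Hypothetical helper: pick the string to use for LC_MESSAGES.
 *
 *   libintl_version  value of LIBINTL_VERSION the build was compiled against
 *   win_locale       requested locale in Windows format (may be NULL)
 *   posix_locale     result of an IsoLocaleName()-style conversion
 *   lc_ctype         current LC_CTYPE, assumed set and in Windows format
 */
static const char *
choose_lc_messages(unsigned int libintl_version,
                   const char *win_locale,
                   const char *posix_locale,
                   const char *lc_ctype)
{
    assert(lc_ctype != NULL);

    /* gettext 0.19.8 and earlier: keep the existing POSIX-format path. */
    if (libintl_version < GETTEXT_0_20_1)
        return posix_locale;

    /* gettext 0.20.1+: avoid names that cannot match a Windows locale. */
    if (needs_ctype_fallback(win_locale))
        return lc_ctype;

    /* gettext 0.20.1+: pass the Windows-format name through unchanged. */
    return win_locale;
}
```

In an actual build the first argument would come from LIBINTL_VERSION in <libintl.h>, and the comparison could equally be done with #if at compile time, which is what the description of the patch suggests.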
Manual testing can be done with this test case:

-- Create test table
CREATE TABLE sampletest (
    a VARCHAR,
    b VARCHAR
);

-- Insert 1 million rows with random data
INSERT INTO sampletest (a, b)
SELECT substr(md5(random()::text), 0, 15),
       (100000000 * random())::integer::varchar
FROM generate_series(1, 1000000);

-- Create function that converts string to float with exception handling
CREATE OR REPLACE FUNCTION toFloat(str VARCHAR, val REAL)
RETURNS REAL AS $$
BEGIN
    RETURN CASE WHEN str IS NULL THEN val ELSE str::REAL END;
EXCEPTION WHEN OTHERS THEN
    RETURN val;
END;
$$ LANGUAGE plpgsql COST 1 IMMUTABLE;

-- Test query to trigger 1M exceptions
-- (all conversions will fail since we inserted random MD5 strings)
\timing on
SELECT MAX(toFloat(a, NULL)) FROM sampletest;

The ~8 second difference is due to the initial enumeration and other coding changes that were made by gettext. Keep in mind that for 1M exceptions we are probably calling gettext 2-3 million times.

--
Bryan Green
EDB: https://www.enterprisedb.com

[1] https://savannah.gnu.org/bugs/?67781