Re: Speed up ICU case conversion by using ucasemap_utf8To*()

Jeff Davis <pgsql@j-davis.com>

From: Jeff Davis <pgsql@j-davis.com>
To: Andreas Karlsson <andreas@proxel.se>, pgsql-hackers <pgsql-hackers@postgresql.org>
Date: 2024-12-20T19:24:04Z
Lists: pgsql-hackers
On Fri, 2024-12-20 at 06:20 +0100, Andreas Karlsson wrote:
> SELECT count(upper) FROM (SELECT upper(('Kålhuvud ' || i) COLLATE 
> "sv-SE-x-icu") FROM generate_series(1, 1000000) i);
> 
> master:  ~540 ms
> Patched: ~460 ms
> glibc:   ~410 ms

It looks like you are opening and closing the UCaseMap object each
time. Why not save it in pg_locale_t? That should speed it up even more
and hopefully beat libc.


Also, to support older ICU versions consistently, we need to fix up the
locale name to support "und"; cf. pg_ucol_open(). Perhaps factor out
that logic?

Regards,
	Jeff Davis