Re: Add CASEFOLD() function.

Robert Treat <rob@xzilla.net>

From: Robert Treat <rob@xzilla.net>
To: Thom Brown <thom@linux.com>
Cc: Peter Eisentraut <peter@eisentraut.org>, Jeff Davis <pgsql@j-davis.com>, Vik Fearing <vik@postgresfriends.org>, Joe Conway <mail@joeconway.com>, Ian Lawrence Barwick <barwick@gmail.com>, PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2025-06-19T16:15:48Z
Lists: pgsql-hackers
On Thu, Jun 19, 2025 at 11:37 AM Thom Brown <thom@linux.com> wrote:
> On Thu, 19 Jun 2025 at 15:51, Peter Eisentraut <peter@eisentraut.org> wrote:
> > On 19.06.25 06:03, Thom Brown wrote:
> > > Late to the party, but is there an argument for porting this to the
> > > citext type? Or supplementing the extension with an additional type
> > > ("cftext"? *shrug*). It currently uses lower(), so our current
> > > recommendation for dealing with all unicode characters is to use
> > > nondeterministic collations.
> >
> > What is the motivation for wanting a citext variant instead of using
> > nondeterministic collations?
>
> Ease of use, perhaps. It seems easier to use:
>
> column_name cftext
>
> rather than:
>
> CREATE COLLATION case_insensitive_collation (
>     PROVIDER = icu,
>     LOCALE = 'und-u-ks-level2',
>     DETERMINISTIC = FALSE
> );
>
> column_name text COLLATE case_insensitive_collation
>
> But I see the arguments against it. It creates an unnecessary
> dependency on an extension, and if someone wants to ignore both case
> and accents, they may resort to using 2 extensions (citext + unaccent)
> when none are needed. I guess I don't feel strongly about it either
> way.

Don't forget, if you have defined case-insensitive / normalized
collations, you also enable on-the-fly collation-based matching, a la
"SELECT 'Å' = 'A' COLLATE ignore_accent_case;", regardless of the
provided collations (which I think is much more common, certainly in
other databases).
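
A minimal sketch of what that looks like end to end, assuming a build
with ICU support. The collation name ignore_accent_case is just the
illustrative name from the query above, and the 'und-u-ks-level1'
locale (primary strength, so both case and accent differences are
ignored) is my assumption about how such a collation would be defined:

```sql
-- Hypothetical collation: ICU root locale at primary strength, so
-- comparisons ignore both case and accents. It must be declared
-- nondeterministic for such strings to compare as equal.
CREATE COLLATION ignore_accent_case (
    PROVIDER = icu,
    LOCALE = 'und-u-ks-level1',
    DETERMINISTIC = FALSE
);

-- Apply the collation per expression, without changing any column
-- definition or installing an extension.
SELECT 'Å' = 'A' COLLATE ignore_accent_case;
```

The same COLLATE clause can be attached to a comparison against an
ordinary text column, so the matching rule is chosen per query rather
than baked into the column's type.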

Robert Treat
https://xzilla.net