Thread

  1. Re: Add CASEFOLD() function.

    Robert Treat <rob@xzilla.net> — 2025-06-19T16:51:05Z

    On Thu, Jun 19, 2025 at 12:33 PM Jeff Davis <pgsql@j-davis.com> wrote:
    >
    > On Thu, 2025-06-19 at 16:36 +0100, Thom Brown wrote:
    > > Ease of use, perhaps. It seems easier to use:
    > >
    > > column_name cftext
    > >
    > > rather than:
    > >
    > > CREATE COLLATION case_insensitive_collation (
    > >     PROVIDER = icu,
    > >     LOCALE = 'und-u-ks-level2',
    > >     DETERMINISTIC = FALSE
    > > );
    >
    > We could auto-create such a collation at initdb time for ICU-enabled
    > builds.
    >
    
    Providing a generic insensitive/non-deterministic collation by default
    would solve a number of different use cases, so +1 on the idea from
    me. And TBH I usually build --without-icu but this would likely cause
    me to change that.
    
    > > But I see the arguments against it. It creates an unnecessary
    > > dependency on an extension, and if someone wants to ignore both case
    > > and accents, they may resort to using 2 extensions (citext +
    > > unaccent) when none are needed.
    >
    > There are at least three ways to do case insensitivity (or other kinds
    > of equivalence):
    >
    > * Explicit function calls in queries, as well as index and constraint
    > definitions. E.g. expression index on LOWER(), queries that explicitly
    > do "LOWER(x) = ..."
    >
    > * Wrap those function calls up in a separate data type, like citext.
    >
    > * Non-deterministic collations.
    >
    > Given that we have collations, which are a way of organizing alternate
    > behaviors for existing data types, I'm not sure I see the need for
    > creating an entirely separate data type.
    >
    > > I guess I don't feel strongly about it either way.
    >
    > Are you a user of citext? I'm genuinely interested in the use cases,
    > and whether the separate-data-type approach has merits that are missing
    > in the other approaches.
    >
    
    Yeah, I'd be interested to hear if there is some missing bit that
    existing users have concerns over; as a former user of citext, it was
    a great workaround at the time, but there are "better ways" to handle
    those things now (imho).
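
    For anyone comparing the three approaches listed above, here is a
    minimal sketch of each. The table, index, and column names are made
    up for illustration; the collation definition is the one quoted
    earlier in the thread, and it assumes an ICU-enabled build on
    PostgreSQL 12 or later.

    ```sql
    -- Approach 1: explicit function calls in the index and in every query.
    CREATE TABLE users_lower (email text);
    CREATE UNIQUE INDEX users_lower_email_idx ON users_lower (LOWER(email));
    SELECT * FROM users_lower WHERE LOWER(email) = LOWER('Alice@Example.com');

    -- Approach 2: a separate data type (citext extension).
    CREATE EXTENSION IF NOT EXISTS citext;
    CREATE TABLE users_citext (email citext UNIQUE);
    SELECT * FROM users_citext WHERE email = 'Alice@Example.com';

    -- Approach 3: a non-deterministic collation on a plain text column.
    CREATE COLLATION case_insensitive_collation (
        PROVIDER = icu,
        LOCALE = 'und-u-ks-level2',
        DETERMINISTIC = FALSE
    );
    CREATE TABLE users_coll (
        email text COLLATE case_insensitive_collation UNIQUE
    );
    SELECT * FROM users_coll WHERE email = 'Alice@Example.com';
    ```

    In the second and third variants the comparison rule travels with the
    column definition, so queries do not need to repeat a function call.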
    
    
    Robert Treat
    https://xzilla.net