Re: Add CASEFOLD() function.
From: Jeff Davis <pgsql@j-davis.com>
To: Peter Eisentraut <peter@eisentraut.org>, Joe Conway
<mail@joeconway.com>, Ian Lawrence Barwick <barwick@gmail.com>
Cc: pgsql-hackers@postgresql.org
Date: 2025-01-18T00:34:43Z
Lists: pgsql-hackers
Commits
- Fix PDF doc build. (d2ca16bb509c, 18.0, landed)
- Add SQL function CASEFOLD(). (bfc5992069cf, 18.0, landed)
- Add support for Unicode case folding. (4e7f62bc386a, 18.0, landed)
Attachments
- v5-0001-Add-support-for-Unicode-case-folding.patch (text/x-patch) patch v5-0001
- v5-0002-Add-SQL-function-CASEFOLD.patch (text/x-patch) patch v5-0002
On Fri, 2025-01-10 at 16:27 -0800, Jeff Davis wrote:
> New patch series attached.

v5 attached. This version is rebased over the Full Case Mapping support, and supports Default Case Folding when using the PG_UNICODE_FAST collation. That means that "ẞ", "ß", "SS", "Ss", and "ss" all fold to "ss"; and "Σ", "σ", and "ς" all fold to "σ".

CASEFOLD() is better (according to Unicode, anyway) than LOWER() for caseless matching, or in an expression index to enforce case-insensitive uniqueness without relying on ICU.

Additionally, the infrastructure in this patch (as well as 286a365b9c) can be used in the future for better case-insensitive pattern matching, or for casefolding identifiers in the parser without relying on libc.

I feel this is about ready for commit. The main point of discussion was whether CASEFOLD() would do normalization, and if so, what the SQL API would look like. I concluded upthread that normalization is not needed to meet the Unicode Default Case Folding behavior, and that we should just leave normalization as a separate process. If someone disagrees with this reasoning, please let me know.

Regards,
	Jeff Davis

[1] https://www.postgresql.org/message-id/610a56de2bd958e96c149ca60420db30e7d51588.camel%40j-davis.com
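
For readers who want to see the folding results described above outside of PostgreSQL, here is a rough sketch in Python. It assumes only that Python's str.casefold() applies Unicode full case folding; it is not the patch's implementation, and the helper name caseless_equal is made up for illustration. It also shows why normalization is treated as a separate step: case folding alone does not make composed and decomposed forms compare equal.

```python
import unicodedata

# Illustration only: str.casefold() is Python's Unicode full case folding.
# It is NOT the PostgreSQL code from the patch; caseless_equal is a
# hypothetical helper used just for this example.


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after case folding, with no normalization."""
    return a.casefold() == b.casefold()


eszett_variants = ["ẞ", "ß", "SS", "Ss", "ss"]
sigma_variants = ["Σ", "σ", "ς"]

# Case folding collapses every variant to a single value.
folded_eszett = {s.casefold() for s in eszett_variants}
folded_sigma = {s.casefold() for s in sigma_variants}

# Lowercasing leaves "ß" and "ss" distinct, so it is weaker for matching.
lowered_eszett = {s.lower() for s in eszett_variants}

# Folding does not normalize: precomposed "é" and "e" + combining acute
# still differ until they are normalized separately.
composed = "\u00e9"
decomposed = "e\u0301"
equal_without_nfc = caseless_equal(composed, decomposed)
equal_with_nfc = caseless_equal(
    unicodedata.normalize("NFC", composed),
    unicodedata.normalize("NFC", decomposed),
)

print(folded_eszett, folded_sigma, lowered_eszett)
print(equal_without_nfc, equal_with_nfc)
```

The same idea applies to a uniqueness check: keying on the folded value treats all of the "ss" variants as one entry, while keying on the lowercased value does not.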