Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
Heikki Linnakangas <hlinnaka@iki.fi> — 2025-12-02T16:36:06Z
On 02/12/2025 18:24, Laurenz Albe wrote:
> On Tue, 2025-12-02 at 10:03 +0000, PG Bug reporting form wrote:
>> PostgreSQL version: 18.1
>>
>> When using a nondeterministic ICU collation, the replace() function fails to
>> replace a substring when that substring appears at the end of the input
>> string.
>>
>> Occurrences of the same substring earlier in the string are replaced
>> normally.
>>
>> Specific collation used:
>> create collation test_nondeterministic (
>>     provider = icu,
>>     locale = 'und-u-ks-level2',
>>     deterministic = false
>> )
>>
>> -- Replace final character under nondeterministic collation
>> SELECT replace(
>>     'testx' COLLATE "test_nondeterministic",
>>     'x' COLLATE "test_nondeterministic",
>>     'y') AS res1;
>
> I can reproduce the problem, and the attached patch fixes it for me.

+1, looks good to me. Let's also add a regression test for this.

> I am not certain if it is safe to apply pg_mblen() to "haystack_end", though.

It doesn't do that though, does it? There are two pg_mblen() calls in the
vicinity:

> for (const char *test_end = hptr; test_end <= haystack_end; test_end += pg_mblen(test_end))
> {
>     if (pg_strncoll(hptr, (test_end - hptr), needle, needle_len, state->locale) == 0)
>     {
>         state->last_match_len_tmp = (test_end - hptr);
>         result_hptr = hptr;
>         if (!state->greedy)
>             break;
>     }
> }
> if (result_hptr)
>     break;
>
> hptr += pg_mblen(hptr);

Neither of those will get called with 'haystack_end' as far as I can see.

- Heikki
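
For illustration, the regression test suggested above might look roughly like the sketch below. It reuses the collation definition and the query from the bug report. The DROP COLLATION cleanup and the expected result noted in the comment are assumptions, not taken from the thread or from any committed test.

```sql
-- Sketch only: assumes a server built with ICU support.
CREATE COLLATION test_nondeterministic (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

-- Needle matches at the very end of the haystack (the case from the report).
-- With the fix applied, this is expected to return 'testy'.
SELECT replace(
    'testx' COLLATE "test_nondeterministic",
    'x' COLLATE "test_nondeterministic",
    'y') AS res1;

-- Cleanup (assumed; not part of the report).
DROP COLLATION test_nondeterministic;
```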