Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation

From: Laurenz Albe <laurenz.albe@cybertec.at>
To: Heikki Linnakangas <hlinnaka@iki.fi>, adam.warland@infor.com, pgsql-bugs@lists.postgresql.org
Date: 2025-12-02T17:18:11Z
Lists: pgsql-bugs

On Tue, 2025-12-02 at 18:36 +0200, Heikki Linnakangas wrote:
> +1, looks good to me. Let's also add a regression test for this.

Right, done in the attached.

> > I am not certain if it is safe to apply pg_mblen() to "haystack_end", though.
> 
> It doesn't do that though, does it? There are two pg_mblen() calls in 
> the vicinity:
> 
> > 			for (const char *test_end = hptr; test_end <= haystack_end; test_end += pg_mblen(test_end))
> > 			{
> > 				if (pg_strncoll(hptr, (test_end - hptr), needle, needle_len, state->locale) == 0)
> > 				{
> > 					state->last_match_len_tmp = (test_end - hptr);
> > 					result_hptr = hptr;
> > 					if (!state->greedy)
> > 						break;
> > 				}
> > 			}
> > 			if (result_hptr)
> > 				break;
> > 
> > 			hptr += pg_mblen(hptr);
> 
> Neither of those will get called with 'haystack_end' as far as I can see.

During the last iteration of the loop, "test_end" will be equal to "haystack_end",
and the loop increment will call "pg_mblen(test_end)".
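
To make the control flow concrete, here is a minimal standalone sketch of the same loop shape. It is not the PostgreSQL code: "sketch_mblen" is a hypothetical stand-in for pg_mblen() that assumes valid UTF-8, "count_mblen_calls_at_end" is a made-up name, and the haystack is a NUL-terminated C string, so reading the byte at "haystack_end" is harmless here. That last point is an assumption of the sketch and says nothing about the buffer in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of a UTF-8 character,
 * judged from its first byte (assumes valid UTF-8 input).  It also counts
 * how often it is called with p == end.
 */
static int
sketch_mblen(const char *p, const char *end, int *calls_at_end)
{
	unsigned char c = (unsigned char) *p;	/* reads the byte at p */

	if (p == end)
		(*calls_at_end)++;
	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	return 4;
}

/*
 * Walk the candidate end positions with the same loop header shape as the
 * quoted code and return how often the length function was called with
 * haystack_end.
 */
int
count_mblen_calls_at_end(const char *haystack)
{
	const char *haystack_end;
	int			calls_at_end = 0;

	assert(haystack != NULL);
	haystack_end = haystack + strlen(haystack);

	for (const char *test_end = haystack;
		 test_end <= haystack_end;
		 test_end += sketch_mblen(test_end, haystack_end, &calls_at_end))
	{
		/* the pg_strncoll() comparison would go here */
	}

	return calls_at_end;
}
```

Because the condition is "<=", the body runs once with "test_end" equal to "haystack_end", and the increment expression is evaluated after that iteration, before the condition is checked again. So the function returns 1 for any valid UTF-8 input, including the empty string, provided the loop is not left early with a "break".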

Yours,
Laurenz Albe