Thread

  1. Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation

    Laurenz Albe <laurenz.albe@cybertec.at> — 2025-12-03T07:51:22Z

    On Tue, 2025-12-02 at 15:53 -0500, Tom Lane wrote:
    > > The attached patch v3 turns it into a while loop to avoid
    > > the problem.
    > 
    > Looking at the code overall, I wonder if the outer loop doesn't have
    > the same issue.  The comments claim that we should be able to handle
    > zero-length matches, but if the overall haystack is of length zero,
    > we will fail to check for such a match.
    
    If you can find zero-length matches at all, you could find a
    zero-length match in a non-empty haystack.  Perhaps the function is
    never called with an empty haystack...
    
    > Also, since we have haystack <= haystack_end as a starting condition,
    > I think both loops could omit the initial test.  I'd be inclined
    > to code them like
    > 
    > 	test_ptr = start point;
    > 	for (;;)
    > 	{
    > 		...
    > 		if (test_ptr >= haystack_end)
    > 			break;
    > 		test_ptr += pg_mblen(test_ptr);
    > 	}
    
    True.  The attached v4 patch does it like that.
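
    For illustration only (this is not taken from the v4 patch), the loop
    shape quoted above can be sketched as follows.  char_len() is a
    hypothetical stand-in for pg_mblen() that assumes valid UTF-8, and
    count_test_positions() is an invented helper that merely counts how
    often the loop body runs.  The point is that the body also runs for
    test_ptr == haystack_end, so an empty haystack is still examined once.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8 character
 * starting at p, derived from the lead byte only.  Assumes valid input.
 */
static int
char_len(const char *p)
{
	unsigned char c = (unsigned char) *p;

	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	return 4;
}

/*
 * Count the positions the loop examines, from haystack up to and including
 * haystack_end.  A real caller would test for a match where n is incremented.
 */
static size_t
count_test_positions(const char *haystack, const char *haystack_end)
{
	const char *test_ptr = haystack;
	size_t		n = 0;

	/* starting condition mentioned in the thread */
	assert(haystack <= haystack_end);

	for (;;)
	{
		n++;					/* placeholder for the match test at test_ptr */
		if (test_ptr >= haystack_end)
			break;
		test_ptr += char_len(test_ptr);
	}
	return n;
}
```

    With this shape, a three-byte ASCII haystack is examined at four
    positions (including the end), and an empty haystack at exactly one.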
    
    > On the other hand ... is that comment really right about zero-length
    > match being possible?  If it is, the API for this function is in
    > need of redesign, because callers that try to find "the next match"
    > would go into an infinite loop re-finding the same zero-length
    > match over and over.
    
    Right.  I'll see if I can trigger such a case.
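
    A minimal sketch of the infinite-loop concern, under stated
    assumptions: find_match() and count_matches_naive() are hypothetical
    and use plain byte comparison, not the collation-aware code under
    discussion.  An empty needle serves only as a simple stand-in for
    whatever input might yield a zero-length match.  The iteration cap
    exists solely so the example terminates.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical matcher: find the first occurrence of needle at or after
 * offset start.  An empty needle matches immediately with length 0.
 */
static int
find_match(const char *hay, size_t hay_len,
		   const char *needle, size_t needle_len,
		   size_t start, size_t *match_pos, size_t *match_len)
{
	for (size_t i = start; i + needle_len <= hay_len; i++)
	{
		if (memcmp(hay + i, needle, needle_len) == 0)
		{
			*match_pos = i;
			*match_len = needle_len;
			return 1;
		}
	}
	return 0;
}

/*
 * Naive "find the next match" caller: resume the search at the end of the
 * previous match.  Stops after max_iter matches so that it cannot hang.
 */
static size_t
count_matches_naive(const char *hay, const char *needle, size_t max_iter)
{
	size_t		hay_len = strlen(hay);
	size_t		needle_len = strlen(needle);
	size_t		start = 0;
	size_t		pos;
	size_t		len;
	size_t		n = 0;

	while (n < max_iter &&
		   find_match(hay, hay_len, needle, needle_len, start, &pos, &len))
	{
		n++;
		start = pos + len;		/* makes no progress when len == 0 */
	}
	return n;
}
```

    With a non-empty needle the search position advances past each match
    and the loop ends on its own.  With a zero-length match the position
    never moves, so only the cap stops the loop.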
    
    Yours,
    Laurenz Albe