Thread
-
Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-03T15:12:07Z
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2025-12-02 at 15:53 -0500, Tom Lane wrote:
>> Looking at the code overall, I wonder if the outer loop doesn't have
>> the same issue. The comments claim that we should be able to handle
>> zero-length matches, but if the overall haystack is of length zero,
>> we will fail to check for such a match.

> If you can find zero-length matches at all, you could find a
> zero-length match in a non-empty haystack. Perhaps the function is
> never called with an empty haystack...

After further thought, it seems to me that this comment is an
unjustified extrapolation from what Peter actually said, which was
that the match substring could be physically shorter than the needle.
Which is certainly true, for instance case-folding or accent-stripping
might shorten the string. But it doesn't follow that a nonempty needle
could ever match an empty substring; and that does not seem like it
could be sane behavior to me. We're considering string comparison
here, not regexes.

We do require callers to eliminate the empty-needle case, and given
that I think we could assume that match substrings must be at least
1 byte long. That assumption is what justifies the current API for
these functions, and perhaps we can also simplify this loop by using
it.

			regards, tom lane
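
For illustration only, the assumption discussed above can be sketched as a toy search in C. This is not PostgreSQL's code: `toy_equal` and `toy_search` are hypothetical names, and the comparison rule (ASCII case-insensitive, with any run of spaces treated as a single space) is an invented stand-in for a nondeterministic collation. It shows a match that is physically shorter than the needle, a candidate range that extends to the final byte of the haystack, and a loop that only ever considers match substrings of at least 1 byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical comparator standing in for a nondeterministic collation:
 * ASCII case-insensitive, and a run of spaces equals a single space.
 * Under this rule a nonempty string never equals the empty string.
 */
bool
toy_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t      i = 0;
    size_t      j = 0;

    while (i < alen && j < blen)
    {
        if (tolower((unsigned char) a[i]) != tolower((unsigned char) b[j]))
            return false;
        if (a[i] == ' ')
        {
            /* Both sides are at a space: skip the whole run on each side. */
            while (i < alen && a[i] == ' ')
                i++;
            while (j < blen && b[j] == ' ')
                j++;
        }
        else
        {
            i++;
            j++;
        }
    }
    /* Equal only if both inputs were consumed completely. */
    return i == alen && j == blen;
}

/*
 * Find the first substring of hay that compares equal to needle.
 * The caller must rule out the empty needle, so every candidate
 * substring is at least 1 byte long (end starts at start + 1).
 * Note that end may reach hlen, i.e. the match may include the
 * final byte of the haystack.
 */
bool
toy_search(const char *hay, size_t hlen,
           const char *needle, size_t nlen,
           size_t *match_start, size_t *match_len)
{
    assert(nlen > 0);           /* empty needle is the caller's problem */

    for (size_t start = 0; start < hlen; start++)
    {
        for (size_t end = start + 1; end <= hlen; end++)
        {
            if (toy_equal(hay + start, end - start, needle, nlen))
            {
                *match_start = start;
                *match_len = end - start;
                return true;
            }
        }
    }
    return false;               /* also covers the empty haystack */
}
```

With the haystack "x A b" and the needle "a   b" (three spaces), the sketch reports a 3-byte match for a 5-byte needle, and an empty haystack simply yields no match because the outer loop never runs.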