Thread

  1. Re: Remaining dependency on setlocale()

    Jeff Davis <pgsql@j-davis.com> — 2025-12-23T20:09:08Z

    On Wed, 2025-12-17 at 11:39 +0100, Peter Eisentraut wrote:
    > For Metaphone, I found the reference implementation linked from its
    > Wikipedia page, and it looks like our implementation is pretty closely
    > aligned to that.  That reference implementation also contains the
    > C-with-cedilla case explicitly.  The correct fix here would probably be
    > to change the implementation to work on wide characters.  But I think
    > for the moment you could try a shortcut like, use pg_ascii_toupper(),
    > but if the encoding is LATIN1 (or LATIN9 or whichever other encodings
    > also contain C-with-cedilla at that code point), then explicitly
    > uppercase that one as well.  This would preserve the existing
    > behavior.
    
    Done, attached new patches.
    
    Interestingly, WIN1256 encodes only the SMALL LETTER C WITH CEDILLA. I
    think, for the purposes here, we can still consider it to "uppercase"
    to \xc7, so that it can still be treated as the same sound. Technically
    I think that would be an improvement over the current code in this edge
    case, and suggests that case folding would be a better approach than
    uppercasing.
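    
    To illustrate the shortcut, here is a minimal standalone sketch. It is
    not the attached patch: the enum, sketch_ascii_toupper() and
    sketch_metaphone_toupper() are hypothetical names, and the helper only
    stands in for pg_ascii_toupper(). It assumes a single-byte encoding
    where the small c-with-cedilla is at byte 0xE7, as in LATIN1.

    ```c
    /* Hypothetical encoding tags for this sketch; not PostgreSQL's pg_enc. */
    typedef enum
    {
        SKETCH_OTHER,
        SKETCH_LATIN1,
        SKETCH_LATIN9,
        SKETCH_WIN1256
    } sketch_encoding;

    /* Locale-independent uppercasing of a-z only (stand-in helper). */
    static unsigned char
    sketch_ascii_toupper(unsigned char ch)
    {
        if (ch >= 'a' && ch <= 'z')
            ch = (unsigned char) (ch - 'a' + 'A');
        return ch;
    }

    /*
     * Uppercase for Metaphone purposes: ASCII rules, plus one explicit
     * mapping of small c-with-cedilla (0xE7) to 0xC7 in the encodings
     * that have it at that byte.  For WIN1256 the result 0xC7 is only a
     * token for "the same sound", not a real uppercase letter there.
     */
    static unsigned char
    sketch_metaphone_toupper(unsigned char ch, sketch_encoding enc)
    {
        if (ch == 0xE7 &&
            (enc == SKETCH_LATIN1 ||
             enc == SKETCH_LATIN9 ||
             enc == SKETCH_WIN1256))
            return 0xC7;

        return sketch_ascii_toupper(ch);
    }
    ```

    Other bytes above 0x7F pass through unchanged, so nothing here depends
    on setlocale().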
    
    Regards,
    	Jeff Davis