Thread

  1. Re: Refactor: allow pg_strncoll(), etc., to accept -1 length for NUL-terminated cstrings.

    Andres Freund <andres@anarazel.de> — 2026-05-01T16:40:54Z

    Hi,
    
    On 2024-08-22 11:00:54 -0700, Jeff Davis wrote:
    > Like ICU, allow -1 length to mean that the input string is NUL-
    > terminated for pg_strncoll(), pg_strnxfrm(), and pg_strnxfrm_prefix().
    > 
    > This simplifies the API and code a bit.
    
    I don't really like this.  I was hacking on a patch that uses compiler
    annotations to tell the compiler what range of memory a function access. The
    compiler then can use that knowledge to give you both compile-time warnings
    and, more importantly, it makes ubsan much more accurate. It'll e.g. often be
    able to warn you if a function accesses more memory than its annotation would
    suggest, even if the memory is part of a larger memory allocation (something
    asan, valgrind etc can't warn about, yet are often the most security critical
    issues).  I found a bunch of issues that way already.
    
    But the annotations can't work if the access size is sometimes is -1.
    
    I also don't find this very convincing code-wise. You end up with lots of
    branches for -1. You have to support cases where one of the arguments is
    specifies as -1 and the other one with a real length, even though that's
    presumably a non-existing case.
    
    It seems reasonable to want the more efficient path for zero terminated
    strings with libc, but it seems like if we want that, we should add add a
    collate_method->strcoll, rather than have a strncoll that's not actually
    strncoll but strcoll.
    
    Greetings,
    
    Andres Freund