Thread

  1. Question about hashed ScalarArrayOpExpr equality semantics

    Ayush Tiwari <ayushtiwari.slg01@gmail.com> — 2026-05-11T04:34:38Z

    Hi,
    
    While looking at the hashed ScalarArrayOpExpr path, I noticed that the
    linear
    and hashed evaluation paths can give different answers if the comparison
    function returns NULL for non-NULL inputs.
    
    The part I am trying to understand is what assumption the hashed path is
    allowed to make about equality operators used for hashing.  Does declaring
    an
    equality operator as HASHES / putting it in a hash opclass imply that, for
    non-NULL inputs, the equality result is a definite true or false value?
    
    Reproduced on master.
    
    Setup:
    
    CREATE FUNCTION weird_strict_int_eq(int, int)
    RETURNS bool
    LANGUAGE sql IMMUTABLE STRICT
    AS $$
      SELECT CASE
               WHEN $1 = 42 AND $2 = 42 THEN NULL
               ELSE $1 = $2
             END
    $$;
    
    CREATE FUNCTION weird_strict_int_ne(int, int)
    RETURNS bool
    LANGUAGE sql IMMUTABLE STRICT
    AS $$
      SELECT NOT weird_strict_int_eq($1, $2)
    $$;
    
    CREATE OPERATOR === (
      LEFTARG = int,
      RIGHTARG = int,
      PROCEDURE = weird_strict_int_eq,
      COMMUTATOR = ===,
      NEGATOR = !==,
      HASHES
    );
    
    CREATE OPERATOR !== (
      LEFTARG = int,
      RIGHTARG = int,
      PROCEDURE = weird_strict_int_ne,
      COMMUTATOR = !==,
      NEGATOR = ===
    );
    
    CREATE OPERATOR CLASS weird_strict_int_hash_ops
    FOR TYPE int USING hash AS
      OPERATOR 1 ===,
      FUNCTION 1 hashint4(int);
    
    The hashed ScalarArrayOpExpr path activates once the constant array has at
    least MIN_ARRAY_SIZE_FOR_HASHED_SAOP (currently 9) elements, so the
    8-element
    variant uses the linear evaluation and the 9-element variant uses the hash
    table. I wrap the left-hand side in a non-immutable function call so the
    expression is not constant-folded at plan time.
    
    CREATE FUNCTION return_int_input(int) RETURNS int
    LANGUAGE sql STABLE AS $$ SELECT $1 $$;
    
    SELECT
      return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,42])   AS linear_any,
      return_int_input(42) === ANY (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_any,
      return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,42])   AS linear_not_in,
      return_int_input(42) !== ALL (ARRAY[1,2,3,4,5,6,7,8,42]) AS hashed_not_in;
    
    The linear path preserves the NULL (UNKNOWN) result from the comparison.
    The
    hashed path seems to treat a NULL from the comparison as a non-match,
    producing
    a different result.
    
    I realize this is a strange equality operator, and that may be the whole
    point.
    For a hash table lookup, it seems reasonable to need a definite answer to
    "does
    this stored key match the lookup key?".  On the other hand,
    ScalarArrayOpExpr
    itself has SQL three-valued semantics, and the linear path does preserve the
    comparison's NULL result.
    
    So my question is: is the hashed path allowed to assume that hashable
    equality
    operators never return NULL for non-NULL inputs, or should it preserve the
    same
    UNKNOWN result that the linear ScalarArrayOpExpr evaluation would produce?
    
    Regards,
    Ayush