Thread

  1. Re: Expanding HOT updates for expression and partial indexes

    Matthias van de Meent <boekewurm+postgres@gmail.com> — 2025-11-24T18:59:10Z

    On Sat, 22 Nov 2025, 22:30 Greg Burd, <greg@burd.me> wrote:
    > Thanks for pointing out the oversight for index-oriented scans (IOS),
    > you're right that the code in v22 doesn't handle that correctly.  I'll
    > fix that.  I still think that indexes that don't support IOS can and
    > should use the type-specific equality checks.  This opens the door to
    > HOT with custom types that have unusual equality rules (see BSON).
    
    Do you have specific examples why it would be safe to default to
    "unusual equality rules" for generally any index's data ingestion
    needs? Why e.g. BSON must always be compared with their special
    equality test (and not datumIsEqual), and why IOS-less indexes in
    general are never going to distinguish between binary distinct but
    btree-equal values, and why exact equality is the special case here?
    
    I understand that you want to maximize optimization for specific
    workloads that you have in mind, but lacking evidence to the contrary
    I am really not convinced that your workloads are sufficiently
    generalizable that they can (and should) be the baseline for these new
    HOT rules: I have not yet seen good arguments why we could relax
    "datum equality" to "type equality" without potentially breaking
    existing indexes.
    
    HOT was implemented quite conservatively to make sure that there are
    no issues where changed values are not reflected in indexes: each
    indexed TID represents a specific and unchanging set of indexed
    values, in both the key and non-key attributes of indexes. If a value
    changes however so slightly, that may be a cause for indexes to treat
    it differently, and thus HOT must not be used.
    
    Aside: The amsummarizing optimization gets around that check by
    realizing the TID itself isn't really indexed, so the rules can be
    relaxed around that, but it still needs to go through the effort to
    update the summarizing indexes if the relevant attributes were ever so
    slightly updated.
    
    This patch right now wants to change these rules and behaviour of HOT
    in two ways:
    
    1.) Instead of only testing attributes mentioned by indexed
    expressions for changes, it wants to test the output of the indexed
    expressions.
    I would consider this to be generally safe, as long as the expressions
    comply with the rules we have for indexed expressions. [Which, if not
    held, would break IOS and various other things, too, so relying on
    these rules isn't new or special].
    
    2.) Instead of datumIsEqual, it (by default) wants to do equality
    checks as provided by the type's default btree opclass' = operator.
    I have not seen evidence that this is safe. I have even explained with
    an example that IOS will return distinctly wrong results if only
    btree's = operator is used to determine if HOT can be applied, and
    that doesn't even begin to cover the issues related to indexes that
    may handle data differently from Btree.
    I also don't want indexes that at some point in the future invent
    support for IOS to return subtly incorrect results due to HOT checks
    that depended on the output of a previous version's amcanreturn
    output.
    
    So, IMV, tts_attr_equal is just a rather expensive version of
    datumIsEqual: the fast path cases would've been handled by
    datumIsEqual at least as fast (without a switch() statement with 18
    specific cases and a default branch); if there is no btree operator
    it'll still default to btree compare, and if not then if the slow path
    uses correctly implemented compare operators (for HOT, and potentially
    all other possible indexes), then these would have an output that is
    indistinguishable from datumIsEqual, with the only difference the
    address and performance of the called function and a lot of added
    catalog lookups.
    
    All together, I think it's best to remove the second component of the
    changes to the HOT rules (changing the type of matching done for
    indexed values with tts_attr_compare) from this patchset.
    If you believe this should be added regardless, I think it's best
    discussed separately in its own thread and patchset -- it should be
    relatively easy to introduce in both current and future versions of
    this code, and (if you're correct and this is safe) it would have some
    benefits even when committed on its own.
    
    
    Kind regards,
    
    Matthias van de Meent