Thread

  1. Re: Expanding HOT updates for expression and partial indexes

    Greg Burd <greg@burd.me> — 2025-12-15T21:46:11Z

    I've updated the patch set a tad and I've got some benchmark results
    (and questions).
    
    PATCHES
    ===========================================================
    
    * 0001 - Prepare heapam_tuple_update() and simple_heap_update() for divergence
    
    Unchanged.
    
    * 0002 - Track changed indexed columns in the executor during UPDATEs
    
    Minor fix for a bug/oversight related to partial index attributes.
    
    Also, I mistakenly said that v25 removed the $subject (the ability for
    updates involving expression indexes to be HOT).  That's not true: they
    can go HOT with this patch provided that the results of the expression,
    evaluated using the before and after attribute values, are equal
    according to datumIsEqual().  When that is the case, as can happen with
    updates to fields within JSONB columns when the indexes are on other
    fields, the update can be HOT should the heap find room on the page to
    store the new tuple.
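
    As a rough illustration of that check (a toy model, not code from the
    patch): the Python sketch below stands in for evaluating an index
    expression on the old and new values and comparing the results
    byte-for-byte, which is what datumIsEqual() does.  The JSON document,
    the "name" field, and all function names are hypothetical.

```python
import json


def datum_is_equal(a: bytes, b: bytes) -> bool:
    """Toy stand-in for a byte-wise comparison of two stored values."""
    return a == b


def index_expression(doc: dict) -> bytes:
    """Hypothetical index expression: extract the 'name' field of a document."""
    return json.dumps(doc.get("name")).encode("utf-8")


def expression_index_allows_hot(old_doc: dict, new_doc: dict) -> bool:
    """True when the indexed expression yields identical bytes before and after."""
    return datum_is_equal(index_expression(old_doc), index_expression(new_doc))


old = {"name": "alice", "visits": 1}
only_visits_changed = {"name": "alice", "visits": 2}
name_changed = {"name": "bob", "visits": 2}

print(expression_index_allows_hot(old, only_visits_changed))  # True
print(expression_index_allows_hot(old, name_changed))         # False
```

    Even when the comparison succeeds, the update is only HOT if the new
    tuple fits on the same heap page.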
    
    * 0003 - Replace index_unchanged_by_update() with ri_ChangedIndexedCols
    
    Unchanged.
    
    * 0004 - Identify if partial indexes are impacted by an update
    
    This is the new piece; it existed in the v24 patch set and now it is
    back.  It evaluates the partial index predicate against the before and
    after tuples, and when both fall outside the predicate the heap can
    use the HOT path, which couldn't happen before.  In the past any
    update to an attribute in an index, even when the tuple was outside
    the predicate, was (is) HOT blocking.
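
    A minimal sketch of that rule, again as a toy model rather than the
    patch's code.  The predicate (status = 'active'), the column names, and
    the function names are hypothetical, and the sketch assumes the
    conservative choice that any other change to an indexed attribute
    still blocks HOT.

```python
def predicate(row: dict) -> bool:
    """Hypothetical partial index predicate: WHERE status = 'active'."""
    return row["status"] == "active"


def partial_index_blocks_hot(old_row: dict, new_row: dict,
                             indexed_cols=("status", "score")) -> bool:
    """Decide whether this partial index prevents a HOT update.

    indexed_cols covers the key column and the predicate column.
    """
    changed = any(old_row[c] != new_row[c] for c in indexed_cols)
    if not changed:
        return False  # nothing the index cares about was modified
    if not predicate(old_row) and not predicate(new_row):
        # Neither tuple version is in the index, so there is no entry to maintain.
        return False
    return True  # assumed conservative: any other indexed change blocks HOT


archived_old = {"status": "archived", "score": 10}
archived_new = {"status": "archived", "score": 20}
active_old = {"status": "active", "score": 10}
active_new = {"status": "active", "score": 20}

print(partial_index_blocks_hot(archived_old, archived_new))  # False
print(partial_index_blocks_hot(active_old, active_new))      # True
print(partial_index_blocks_hot(archived_old, active_new))    # True
```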
    
    
    SUMMARY
    ===========================================================
    
    I've just started to scratch the surface of performance testing for
    this.  Attached is a very simple comparison of master vs. patch for a
    basic update load that should always go HOT in either case.  It shows
    about 1% variance between the two (built with -O0, run on my laptop),
    so that's essentially no difference, despite the added overhead of the
    new function and the fact that it seems to be called more frequently
    due to (guessing here) more opportunity for TM_Updated to be the
    return from heapam_tuple_update.  Your thoughts are welcome here, as
    are best/worst case ideas for tests to run.
    
    Next up I plan to layer the controversial type-specific piece into this
    patch set, if nothing else just as a record of what's left over.  Then
    I'll try to better isolate the good/bad performance implications of
    this patch set.
    
    Ideally, this patch set and the one (under development) for catalog
    tuples could combine to completely restructure the heap update process
    and open the door to more HOT updates and faster catalog updates.  But,
    I still have to demonstrate that.  For JSONB-heavy applications this
    should be a net win; for the rest it should be a minor or zero
    regression.  For other custom implementations of indexes over
    specialized types (as is the case for the newly open-sourced DocumentDB
    work) this opens the door for HOT updates when possible.  All of that
    is the hope; it's time to measure hope against reality. :)
    
    This patch set does start to move the executor away from a heap-specific
    view of the world where updates are all/none/summarizing.  This
    potentially eases the integration of WARM or PHOT-like solutions where
    we only update those indexes that are materially impacted by an update. 
    It should be clear by now that this is my ultimate goal.
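
    To make the contrast concrete, here is a toy sketch of the bookkeeping
    only.  It is not part of the patch set and ignores everything a real
    WARM or PHOT design would need on the index scan side; the index names,
    columns, and function names are hypothetical.

```python
def all_or_none(indexes: dict, changed_cols: set) -> list:
    """Heap-style decision: one changed indexed column means every index gets a new entry."""
    touched = any(cols & changed_cols for cols in indexes.values())
    return sorted(indexes) if touched else []


def only_impacted(indexes: dict, changed_cols: set) -> list:
    """Selective decision: only indexes whose columns actually changed get a new entry."""
    return sorted(name for name, cols in indexes.items() if cols & changed_cols)


# Hypothetical table with three single-column indexes.
indexes = {
    "users_email_idx": {"email"},
    "users_name_idx": {"name"},
    "users_city_idx": {"city"},
}
changed = {"city"}

print(all_or_none(indexes, changed))
# ['users_city_idx', 'users_email_idx', 'users_name_idx']
print(only_impacted(indexes, changed))
# ['users_city_idx']
```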
    
    best.
    
    -greg