Thread

  1. Re: Expanding HOT updates for expression and partial indexes

    Greg Burd <greg@burd.me> — 2025-10-14T17:46:01Z

    > On Oct 9, 2025, at 3:27 PM, Jeff Davis <pgsql@j-davis.com> wrote:
    > 
    > On Tue, 2025-10-07 at 17:36 -0400, Greg Burd wrote:
    >> After reviewing how updates work in the executor, I discovered that
    >> during execution the new tuple slot is populated with the information
    >> from ExecBuildUpdateProjection() and the old tuple, but that most
    >> importantly for this use case that function created a bitmap of the
    >> modified columns (the columns specified in the update).  This bitmap
    >> isn't the same as the one produced by HeapDetermineColumnsInfo() as
    >> the latter excludes attributes that are not changed after testing
    >> equality with the helper function heap_attr_equals() whereas the
    >> former will include attributes that appear in the update but are the
    >> same value as before.  This, happily, is immaterial for the purposes
    >> of my function ExecExprIndexesRequireUpdates() which simply needs to
    >> check to see if index tuples generated are unchanged.  So I had all I
    >> needed to run the checks ahead of acquiring the lock on the buffer.
    > 
    > You're still calling ExecExprIndexesRequireUpdates() from within
    > heap_update(). Can't you do that inside of ExecUpdatePrologue() or
    > thereabouts?
    
    Hey Jeff,
    
    I'm trying to knit this into the executor layer but that is tricky because
    the concept of HOT is very heap-specific, so the executor should be
    ignorant of the heap's specific needs (right?). Right now, I am considering
    adding a step in ExecUpdatePrologue() just after opening the indexes.
    
    The idea I'm toying with is to have a new function on all TupleTableSlots
    that examines the before/after slots for an update and the set of updated
    attributes and returns a Bitmapset of the changed attributes that overlap
    with indexes and so should trigger index updates in ExecUpdateEpilogue().
    
    That way for heap we'd have something like:
    Bitmapset *tts_heap_getidxattr(ResultRelInfo *info,
    			TupleTableSlot *updated,
    			TupleTableSlot *existing,
    			Bitmapset *updated_attrs)
    {
    	/*
    	 * some combo of HeapDetermineColumnsInfo() and
    	 * ExecExprIndexesRequireUpdates()
    	 *
    	 * returns the set of indexed attrs that this update changed
    	 */
    }
    
    So, attributes only referenced by expressions where the expression
    produces the same value for the updated and existing slots would be
    removed from the set.
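
To make the filtering rule concrete, here is a toy, self-contained sketch. It is not the patch's code and uses no PostgreSQL headers: `AttrSet` (a plain bitmask standing in for a Bitmapset), `ToyRow`, `ToyIndex`, `toy_abs_attr1()` and `toy_getidxattr()` are all hypothetical names invented for illustration, and attribute values are assumed to be plain ints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NATTS 4

typedef uint32_t AttrSet;                    /* bit i set => attribute i is a member */
typedef struct { int vals[NATTS]; } ToyRow;  /* stand-in for a TupleTableSlot */
typedef int (*ToyExpr)(const ToyRow *row);   /* stand-in for an index expression */

typedef struct
{
    AttrSet attrs;  /* attributes the index references */
    ToyExpr expr;   /* NULL for a plain column index */
} ToyIndex;

/* Example index expression: abs(attribute 1). */
int
toy_abs_attr1(const ToyRow *row)
{
    return row->vals[1] < 0 ? -row->vals[1] : row->vals[1];
}

/* Return the indexed attributes that this update really changed. */
AttrSet
toy_getidxattr(const ToyIndex *indexes, size_t nindexes,
               const ToyRow *updated, const ToyRow *existing,
               AttrSet updated_attrs)
{
    AttrSet changed = 0;
    AttrSet result = 0;

    assert(updated != NULL && existing != NULL);

    /* Step 1: drop attributes named in the update whose value is the same. */
    for (int i = 0; i < NATTS; i++)
        if ((updated_attrs & (1u << i)) && updated->vals[i] != existing->vals[i])
            changed |= 1u << i;

    /* Step 2: keep only attributes that force some index to get a new entry. */
    for (size_t n = 0; n < nindexes; n++)
    {
        AttrSet overlap = indexes[n].attrs & changed;

        if (overlap == 0)
            continue;           /* index does not reference a changed attribute */
        if (indexes[n].expr != NULL &&
            indexes[n].expr(updated) == indexes[n].expr(existing))
            continue;           /* expression yields the same value */
        result |= overlap;
    }
    return result;
}
```

With an index on abs(attribute 1), changing attribute 1 from -5 to 5 leaves the result empty, while an attribute that is also covered by a plain index stays in the set whenever its value changes.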
    
    Interestingly, summarizing indexes that don't overlap with changed
    attributes won't be updated (and that's a good thing).
    
    Problem is we're not yet accounting for what is about to happen in
    ExecUpdateAct() when calling into heap_update().  That's where
    heap tries to fit the new tuple onto the same page.  That might be
    possible with large tuples thanks to TOAST, but it's impossible to
    say before getting into this function with the page locked.
    
    So, for updates we include the modified_attrs in the UpdateContext,
    which is available to heap_update().  If the heap code decides to
    go HOT, great: unset all attributes in modified_attrs except any
    that are only summarizing.  If the heap can't go HOT, fine: add
    the indexed attrs back into modified_attrs, which should trigger all
    indexes to be updated.
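
As a rough illustration of that hand-off (again a toy model, not the patch): `ToyUpdateContext`, `toy_finish_update()` and the two attribute-set parameters are hypothetical, and the sets are plain bitmasks rather than Bitmapsets. The only real decision, whether the update went HOT, is assumed to arrive from the heap as a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t AttrSet;   /* bit i set => attribute i is a member */

typedef struct
{
    /*
     * In:  indexed attributes the executor found to be changed.
     * Out: attributes whose indexes must get new entries.
     */
    AttrSet modified_attrs;
} ToyUpdateContext;

/* Adjust modified_attrs once the heap knows whether the update was HOT. */
void
toy_finish_update(ToyUpdateContext *ctx, bool went_hot,
                  AttrSet indexed_attrs, AttrSet summarizing_only_attrs)
{
    assert(ctx != NULL);

    if (went_hot)
        /* HOT: keep only attributes referenced solely by summarizing indexes. */
        ctx->modified_attrs &= summarizing_only_attrs;
    else
        /* Not HOT: every index needs an entry for the new tuple. */
        ctx->modified_attrs |= indexed_attrs;
}
```

In this model the executor would then update exactly the indexes that overlap the resulting modified_attrs, which is the job the enum does today.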
    
    This gets rid of the TU_UpdateIndexes enum and allows only modified
    summarizing indexes to be updated on the HOT path.  Two additional
    benefits IMO.
    
    at least, that's what I'm trying out now,
    
    -greg
    
    > Regards,
    > Jeff Davis