Thread

  1. Re: Expanding HOT updates for expression and partial indexes

    Greg Burd <greg@burd.me> — 2025-11-19T18:21:51Z

    On Nov 19 2025, at 1:00 pm, Greg Burd <greg@burd.me> wrote:
    
    > 
    > On Nov 16 2025, at 1:53 pm, Greg Burd <greg@burd.me> wrote:
    > 
    >> 0004 - Enable HOT updates for expression and partial indexes
    >> 
    >> This finally gets us back to where this project started, but on much
    >> more firm ground than before because we're not going to
    >> self-deadlock. 
    >> The idea has grown from a small function into something larger, but only
    >> out of necessity.
    >> 
    >> In this patch I add ExecWhichIndexesRequireUpdates() in execIndexing.c
    >> which implements (c) finding the set of attributes that force new index
    >> updates.  This set can be very different from the modified indexed
    >> attributes.  We know that some attributes are not equal to their
    >> previous versions, but does that mean that the index that references
    >> that attribute needs a new index tuple?  It may, or it may not.  Here's
    >> the comment on that function that explains:
    >> 
    >> /*
    >> * ExecWhichIndexesRequireUpdates
    >> *
    >> * Determine which indexes need updating given modified indexed attributes.
    >> * This function is a companion to ExecCheckIndexedAttrsForChanges(). 
    >> On the
    >> * surface, they appear similar but they are doing two very different things.
    >> *
    >> * For a standard index on a set of attributes this is the
    >> intersection of
    >> * the mix_attrs and the index attrs (key, expression, but not predicate).
    >> *
    >> * For expression indexes and indexes which implement the amcomparedatums()
    >> * index AM API we'll need to form index datum and compare each
    >> attribute to
    >> * see if any actually changed.
    >> *
    >> * For expression indexes the result of the expression might not change
    >> at all,
    >> * this is common with JSONB columns which require expression indexes
    >> and where
    >> * it is commonplace to index a field within a document and have
    >> updates that
    >> * generally don't update that field.
    >> *
    >> * Partial indexes won't trigger index tuples when the old/new tuples
    >> are both
    >> * outside of the predicate range.
    >> *
    >> * For nbtree the amcomparedatums() API is critical as it requires
    >> that key
    >> * attributes are equal when they memcmp(), which might not be the
    >> case when
    >> * using type-specific comparison or factoring in collation which
    >> might make
    >> * an index case insensitive.
    >> *
    >> * All of this is to say that the goal is for the executor to know,
    >> ahead of
    >> * calling into the table AM for the update and before calling into
    >> the index
    >> * AM for inserting new index tuples, which attributes at a minimum will
    >> * necessitate a new index tuple.
    >> *
    >> ...
    >> */
    > 
    > Attached are rebased (d5b4f3a6d4e) patches with the only changes
    > happening in the last patch in the series.
    > 
    > 0004 - Enable HOT updates for expression and partial indexes
    > 
    > I was never happy with the dual functions
    > ExecCheckIndexedAttrsForChanges() and ExecWhichIndexesRequireUpdates(),
    > it felt like too much overhead and duplication of effort.  While
    > updating my tests, adding a few cases, I found that there was also a
    > flaw in the logic.  So, time to rewrite and combine them.
    > 
    > What did I discover?  Before the logic was to find the set of modified
    > indexed attributes then review all the indexes for changed attributes
    > using FormIndexDatum() and comparing before/after to see if expressions
    > really changed the value to be indexed or not.  The first pass didn't
    > take into account expressions, the second did.  So, an expression index
    > over JSONB data wouldn't extract and test the field within the document,
    > it was just comparing the entire document before/after using the jsonb
    > comparison function, no bueno.
    > 
    > This approach wraps both functions into one somewhat simplified
    > function. The logic is basically, iterate over the indexes reviewing
    > indexed attributes for changes.  Along the way we call into the new
    > index AM's comparison function when present, otherwise we find and use
    > the proper type-specific comparison function for the datum.  At the end
    > of the function we have our Bitmapset of attributes that should trigger
    > new index tuples.
    > 
    >> What's left undone?
    >> 
    >> * I need to check code coverage so that I might
    > 
    > I did this and it was quite good, I'll do it again for this new series
    > but it's nice to see that the tests are exercising the vast majority of
    > the code paths.
    > 
    >> * create tests covering all the new cases
    > 
    > I think the coverage is good, maybe even redundant or overly complex
    > in places.
    > 
    >> * update the README.HOT documentation, wiki, etc.
    > 
    > Soon, I hope to have this approach solid and under review before
    > solidifying the docs.
    > 
    >> * performance...
    > 
    > Still as yet unmeasured, I know that there is more work per-update to
    > perform these checks, so some overhead, but I don't know if that
    > overhead is more than before with HeapDetermineColumnsInfo() and
    > index_unchanged_by_update().  Those two functions did essentially the
    > same thing, only with binary comparison (datumIsEqual()). I need to
    > measure that.  What about doing all this work outside of the buffer lock
    > in heap_update()?  Surely that'll give back a bit or at least add to
    > concurrency.  Forming index tuples a few extra times and evaluating the
    > expressions 3 times rather than 1 is going to hurt, I think I can come
    > up with a way to cache the formed datum and use it later on, but is that
    > worth it?  Complex expressions, yes.  Also, what about expressions that
    > expect to be executed once... and now are 3x?  That's what forced my
    > update to the insert-conflict-specconflict.out test, but AFAICT there is
    > no way to test if an expression's value is going to change on update
    > without exercising it once for the old tuple and once for the new tuple.
    > Even if it were possible for an index to provide the key it might have
    > changed after the expression evaluation (as is the case in hash), so I
    > don't think this is avoidable.  Maybe that's reason enough to add a
    > reloption to disable the expression evaluation piece of this?  Given
    > that it might create a logic or performance regression.  The flip side
    > is the potential to use the HOT path, that's a real savings.
    > 
    > One concerning thing is that nbtree's assumption that key attributes for
    > TIDs must use binary comparison for equality.  This means that for our
    > common case (heap/btree) there is more work per-update than before,
    > which is why I need to measure.  I could look into eliminating the
    > nbtree requirement, I don't understand it too well as yet by I believe
    > that on page split there is an attempt to deduplicate TIDs into a
    > TIDBitmap and the test for when that's possible is datumIsEqual().  If
    > that were the same as in this new code, possibly evening using
    > tts_attr_equal(), then... I don't know, I'll have to investigate.  Chime
    > in here if you can educate me on this one. :)
    > 
    > best.
    > 
    > -greg
    
    Doh!
    
    I forgot to commit the fixed regression test expected output before
    formatting the patch set, here it is.
    
    -greg