Thread

  1. Disabling Heap-Only Tuples

    Thom Brown <thom@linux.com> — 2023-07-05T10:44:31Z

    Hi,
    
    Heap-Only Tuple (HOT) updates are a significant performance
    enhancement, as they prevent unnecessary page writes. However, HOT
    comes with a caveat: it means that if we have lots of available space
    earlier on in the relation, it can only be used for new tuples or in
    cases where there's insufficient space on a page for an UPDATE to use
    HOT.
    
    This mechanism limits our options for condensing tables, forcing us to
    resort to methods like running VACUUM FULL/CLUSTER or using external
    tools like pg_repack. These either require exclusive locks (which will
    be a deal-breaker on large tables on a production system), or there's
    risks involved. Of course we can always flood pages with new versions
    of a row until it's forced onto an early page, but that shouldn't be
    necessary.
    
    Considering these trade-offs, I'd like to propose an option to allow
    superusers to disable HOT on tables. The intent is to trade some
    performance benefits for the ability to reduce the size of a table
    without the typical locking associated with it.
    
    This feature could be used to shrink tables in one of two ways:
    temporarily disabling HOT until DML operations have compacted the data
    into a smaller area, or performing a mass update on later rows to
    relocate them to an earlier location, probably in stages. Of course,
    this would need to be used in conjunction with a VACUUM operation.
    
    Admittedly this isn't ideal, and it would be better if we had an
    operation that could do this (e.g. VACUUM COMPACT <table_name>), or an
    option that causes some operations to avoid HOT when it detects an
    amount of free space over a threshold, but in lieu of those, I thought
    this would at least allow users to help themselves when running into
    disk space issues.
    
    Thoughts?
    
    Thom