Thread

  1. Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

    Xuneng Zhou <xunengzhou@gmail.com> — 2025-12-19T03:38:24Z

    He Melanie,
    
    Thanks for working on this.
    
    On Wed, Dec 17, 2025 at 12:59 AM Melanie Plageman
    <melanieplageman@gmail.com> wrote:
    >
    > On Wed, Dec 3, 2025 at 6:07 PM Melanie Plageman
    > <melanieplageman@gmail.com> wrote:
    > >
    > > If we're just talking about the renaming, looking at procarray.c, it
    > > is full of the word "removable" because its functions were largely
    > > used to examine and determine if everyone can see an xmax as committed
    > > and thus if that tuple is removable from their perspective. But
    > > nothing about the code that I can see means it has to be an xmax. We
    > > could just as well use the functions to determine if everyone can see
    > > an xmin as committed.
    >
    > In the attached v27, I've removed the commit that renamed functions in
    > procarray.c. I've added a single wrapper GlobalVisTestXidNotRunning()
    > that is used in my code where I am testing live tuples. I think you'll
    > find that I've addressed all of your review comments now -- as I've
    > also gotten rid of the confusing blk_known_av logic through a series
    > of refactors.
    >
    > The one outstanding point is which commits should bump
    > XLOG_PAGE_MAGIC. (also review of the reworked patches).
    >
    > - Melanie
    
    I’ve done a basic review of patches 1 and 2. Here are some comments
    which may be somewhat immature, as this is a fairly large change set
    and I’m new to some parts of the code.
    
    1) Potential stale old_vmbits after VM repair n v2
    
    // Corruption check 1
    if (!PageIsAllVisible(page) &&
    (old_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
    {
    visibilitymap_clear(...); // VM now cleared to 0
    // but old_vmbits still holds ALL_VISIBLE
    }
    
    // ... later ...
    
    if (!presult.all_visible)
    return presult.ndeleted; // Not taken if presult.all_visible=true
    
    new_vmbits = VISIBILITYMAP_ALL_VISIBLE; // Want to set this
    
    if (old_vmbits == new_vmbits) // Stale old_vmbits=ALL_VISIBLE,
    new_vmbits=ALL_VISIBLE
      return presult.ndeleted; // issue: early return
    
    After corruption repair clears the VM, old_vmbits is stale. The early
    return can fire unexpectedly, leaving the VM cleared when it should be
    re-set. Should we reset old_vmbits = 0 after the visibilitymap_clear?
    
    2) Add Assert(BufferIsDirty(buf))
    
    Since the patch's core claim is "buffer must be dirty before WAL
    registration", an assertion encodes this invariant. Should we add:
    
    Assert(BufferIsValid(buf));
    Assert(BufferIsDirty(buf));
    
    right before the visibilitymap_set() call?
    
    3) Comment about "only scenario"
    
    The comment at lines:
    > "The only scenario where it is not already dirty is if the VM was removed…"
    
    This phrasing could become misleading after future refactors. Can we
    make it more direct like:
    
    > "We must mark the heap buffer dirty before calling visibilitymap_set(), because it may WAL-log the buffer and XLogRegisterBuffer() requires it."
    
    4) Comment clarity
    
    Current comment:
    
    > "Even if PD_ALL_VISIBLE is already set, we don't need to worry about unnecessarily dirtying the heap buffer, as it must be marked dirty before adding it to the WAL chain. The only scenario where it is not already dirty is if the VM was removed..."
    
    In this test we now call MarkBufferDirty() on the heap page even when
    only setting the VM, so the comments claiming “does not need to modify
    the heap buffer”/“no heap page modification” might be misleading. It
    might be better to say the test doesn’t need to modify heap
    tuples/page contents or doesn’t need to prune/freeze.
    
    --
    Best,
    Xuneng