Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Melanie Plageman <melanieplageman@gmail.com>
Commits
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin
- 284925508ae6 19 (unreleased) landed
-
Simplify visibility check in heap_page_would_be_all_visible()
- 3efe58febc3c 19 (unreleased) landed
-
Eliminate use of cached VM value in lazy_scan_prune()
- 648a7e28d7c2 19 (unreleased) landed
-
Combine visibilitymap_set() cases in lazy_scan_prune()
- 21796c267d0a 19 (unreleased) landed
-
Fix const qualification in prune_freeze_setup()
- 4877391ce894 19 (unreleased) landed
-
Simplify vacuum visibility assertion
- bd298f54a0d6 19 (unreleased) landed
-
Split heap_page_prune_and_freeze() into helpers
- e135e044572e 19 (unreleased) landed
-
Assert that cutoffs are provided if freezing will be attempted
- cd38b7e77315 19 (unreleased) landed
-
Split PruneFreezeParams initializers to one field per line
- 1e14edcea5e1 19 (unreleased) landed
-
Refactor heap_page_prune_and_freeze() parameters into a struct
- 1937ed70621e 19 (unreleased) landed
-
Make heap_page_is_all_visible independent of LVRelState
- 3e4705484e0c 19 (unreleased) landed
-
Inline TransactionIdFollows/Precedes[OrEquals]()
- 43b05b38ea4d 19 (unreleased) landed
-
Add helper for freeze determination to heap_page_prune_and_freeze
- c8dd6542bae4 19 (unreleased) landed
-
Bump XLOG_PAGE_MAGIC after xl_heap_prune change
- 4a8fb58671d3 19 (unreleased) landed
-
Correct prune WAL record opcode name in comment
- ae8ea7278c16 19 (unreleased) landed
-
Add error codes when vacuum discovers VM corruption
- 8ec97e78a771 19 (unreleased) landed
-
Remove unused xl_heap_prune member, reason
- 4b5f206de2bb 19 (unreleased) landed
-
Remove unneeded VM pin from VM replay
- 3399c265543e 19 (unreleased) landed
-
Add assert and log message to visibilitymap_set
- e3d5ddb7ca91 19 (unreleased) landed
-
Add error codes to some corruption log messages
- fd6ec93bf890 13.0 cited
Attachments
- v18-0001-Refactor-heap_page_prune_and_freeze-parameters-i.patch (text/x-patch) patch v18-0001
- v18-0002-Keep-all_frozen-updated-in-heap_page_prune_and_f.patch (text/x-patch) patch v18-0002
- v18-0003-Update-PruneState.all_-visible-frozen-earlier-in.patch (text/x-patch) patch v18-0003
- v18-0004-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch (text/x-patch) patch v18-0004
- v18-0005-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch (text/x-patch) patch v18-0005
- v18-0006-Remove-XLOG_HEAP2_VISIBLE-entirely.patch (text/x-patch) patch v18-0006
- v18-0007-Rename-GlobalVisTestIsRemovableXid-to-GlobalVisX.patch (text/x-patch) patch v18-0007
- v18-0008-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch (text/x-patch) patch v18-0008
- v18-0009-Unset-all_visible-sooner-if-not-freezing.patch (text/x-patch) patch v18-0009
- v18-0010-Allow-on-access-pruning-to-set-pages-all-visible.patch (text/x-patch) patch v18-0010
- v18-0011-Set-pd_prune_xid-on-insert.patch (text/x-patch) patch v18-0011
- v18-0012-Split-heap_page_prune_and_freeze-into-helpers.patch (text/x-patch) patch v18-0012
Thanks so much for the review! I've addressed all your feedback except
what is commented on inline below.
I've gone ahead and committed the preliminary patches that you thought
were ready to commit.
Attached v18 is what remains.
0001 - 0003: refactoring
0004 - 0006: finish eliminating XLOG_HEAP2_VISIBLE
0007 - 0009: refactoring
0010: Set VM on-access
0011: Set prune xid on insert
0012: Some refactoring for discussion
For 0001, I got feedback heap_page_prune_and_freeze() has too many
arguments, so I tried to address that. I'm interested to know if folks
like this more.
0011 still needs a bit of investigation to understand fully if
anything else in the index-killtuples test needs to be changed to make
sure we have the same coverage.
0012 is sort of WIP. I got feedback heap_page_prune_and_freeze() was
too long and should be split up into helpers. I want to know if this
split makes sense. I can pull it down the patch stack if so.
Only 0001 and 0012 are optional amongst the refactoring patches. The
others are required to make on-access VM-setting possible or viable.
On Thu, Oct 9, 2025 at 2:18 PM Andres Freund <andres@anarazel.de> wrote:
>
> > @@ -71,12 +84,12 @@ heap_xlog_prune_freeze(XLogReaderState *record)
> > }
> > - action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
> > - (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
> > - &buffer);
> > - if (action == BLK_NEEDS_REDO)
> > + if (XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
> > + (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
> > + &buffer) == BLK_NEEDS_REDO)
> > {
> > Page page = BufferGetPage(buffer);
> > OffsetNumber *redirected;
>
> Why move it around this way?
Because there will be an action for the visibility map
XLogReadBufferForRedoExtended(). I could have renamed it heap_action,
but it is being used only in one place, so I preferred to just cut it
to avoid any confusion.
> > Advance the page LSN
> > + * only if the record could include an FPI, since recovery skips
> > + * records <= the stamped LSN. Otherwise it might skip an earlier FPI
> > + * needed to repair a torn page.
> > + */
>
> This is confusing, should probably just reference the stuff we did in the
> !recovery case.
I fixed this and addressed all your feedback related to this before committing.
> > + if (do_prune || nplans > 0 ||
> > + ((vmflags & VISIBILITYMAP_VALID_BITS) && XLogHintBitIsNeeded()))
> > + PageSetLSN(page, lsn);
> > +
> > /*
> > * Note: we don't worry about updating the page's prunability hints.
> > * At worst this will cause an extra prune cycle to occur soon.
> > */
>
> Not your fault, but that seems odd? Why aren't we just doing the right thing?
The comment dates back to 6f10eb2. I imagine no one ever bothered to
fuss with extracting the XID. You could change
heap_page_prune_execute() to return the right value -- though that's a
bit ugly since it is used in normal operation as well as recovery.
> I wonder if the VM specific redo portion should be in a common helper? Might
> not be enough code to worry though...
I think it might be more code as a helper at this point.
> > @@ -2860,6 +2867,29 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
> > VACUUM_ERRCB_PHASE_VACUUM_HEAP, blkno,
> > InvalidOffsetNumber);
> >
> > + /*
> > + * Before marking dead items unused, check whether the page will become
> > + * all-visible once that change is applied.
>
> So the function is named _would_ but here you say will :)
I thought about it more and still feel that this function name should
contain "would". From vacuum's perspective it is "will" -- because it
knows it will remove those dead items, but from the function's
perspective it is hypothetical. I changed the comment though.
> > + if (heap_page_would_be_all_visible(vacrel, buffer,
> > + deadoffsets, num_offsets,
> > + &all_frozen, &visibility_cutoff_xid))
> > + {
> > + vmflags |= VISIBILITYMAP_ALL_VISIBLE;
> > + if (all_frozen)
> > + {
> > + vmflags |= VISIBILITYMAP_ALL_FROZEN;
> > + Assert(!TransactionIdIsValid(visibility_cutoff_xid));
> > + }
> > +
> > + /* Take the lock on the vmbuffer before entering a critical section */
> > + LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
>
> It sure would be nice if we had documented the lock order between the heap
> page and the corresponding VM page anywhere. This is just doing what we did
> before, so it's not this patch's fault, but I did get worried about it for a
> moment.
Well, the comment above the visibilitymap_set* functions says what
expectations they have for the heap page being locked.
> > +static bool
> > +heap_page_would_be_all_visible(LVRelState *vacrel, Buffer buf,
> > + OffsetNumber *deadoffsets,
> > + int ndeadoffsets,
> > + bool *all_frozen,
> > + TransactionId *visibility_cutoff_xid)
> > +{
> > Page page = BufferGetPage(buf);
> Hm, what about an assert checking that matched_dead_count == ndeadoffsets at
> the end?
I was going to put an Assert(ndeadoffsets <= matched_dead_count), but
then I started wondering if there is a way we could end up with fewer
dead items than we collected during phase I.
I had thought about if we dropped an index and then did on-access
pruning -- but we don't allow setting LP_DEAD items LP_UNUSED in
on-access pruning. So, maybe this is safe... I can do a follow-on
commit to add the assert. But I'm just not 100% sure I've thought of
all the cases where we might end up with fewer dead items.
> > During vacuum's first and third phases, we examine tuples' visibility
> > to determine if we can set the page all-visible in the visibility map.
> >
> > Previously, this check compared tuple xmins against a single XID chosen at
> > the start of vacuum (OldestXmin). We now use GlobalVisState, which also
> > enables future work to set the VM during on-access pruning, since ordinary
> > queries have access to GlobalVisState but not OldestXmin.
> >
> > This also benefits vacuum directly: GlobalVisState may advance
> > during a vacuum, allowing more pages to become considered all-visible.
> > In the rare case that it moves backward, VACUUM falls back to OldestXmin
> > to ensure we don’t attempt to freeze a dead tuple that wasn’t yet
> > prunable according to the GlobalVisState.
>
> It could, but it currently won't advance in vacuum, right?
I thought it was possible for it to advance when calling
heap_prune_satisfies_vacuum() ->
GlobalVisTestIsRemovableXid()->...GlobalVisUpdate(). This case isn't
going to be common, but some things can cause us to update it.
We have talked about explicitly updating GlobalVisState more often
during vacuums of large tables. But I was under the impression that it
was at least possible for it to advance during vacuum now.
- Melanie