Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Melanie Plageman <melanieplageman@gmail.com>
Commits
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin
- 284925508ae6 19 (unreleased) landed
-
Simplify visibility check in heap_page_would_be_all_visible()
- 3efe58febc3c 19 (unreleased) landed
-
Eliminate use of cached VM value in lazy_scan_prune()
- 648a7e28d7c2 19 (unreleased) landed
-
Combine visibilitymap_set() cases in lazy_scan_prune()
- 21796c267d0a 19 (unreleased) landed
-
Fix const qualification in prune_freeze_setup()
- 4877391ce894 19 (unreleased) landed
-
Simplify vacuum visibility assertion
- bd298f54a0d6 19 (unreleased) landed
-
Split heap_page_prune_and_freeze() into helpers
- e135e044572e 19 (unreleased) landed
-
Assert that cutoffs are provided if freezing will be attempted
- cd38b7e77315 19 (unreleased) landed
-
Split PruneFreezeParams initializers to one field per line
- 1e14edcea5e1 19 (unreleased) landed
-
Refactor heap_page_prune_and_freeze() parameters into a struct
- 1937ed70621e 19 (unreleased) landed
-
Make heap_page_is_all_visible independent of LVRelState
- 3e4705484e0c 19 (unreleased) landed
-
Inline TransactionIdFollows/Precedes[OrEquals]()
- 43b05b38ea4d 19 (unreleased) landed
-
Add helper for freeze determination to heap_page_prune_and_freeze
- c8dd6542bae4 19 (unreleased) landed
-
Bump XLOG_PAGE_MAGIC after xl_heap_prune change
- 4a8fb58671d3 19 (unreleased) landed
-
Correct prune WAL record opcode name in comment
- ae8ea7278c16 19 (unreleased) landed
-
Add error codes when vacuum discovers VM corruption
- 8ec97e78a771 19 (unreleased) landed
-
Remove unused xl_heap_prune member, reason
- 4b5f206de2bb 19 (unreleased) landed
-
Remove unneeded VM pin from VM replay
- 3399c265543e 19 (unreleased) landed
-
Add assert and log message to visibilitymap_set
- e3d5ddb7ca91 19 (unreleased) landed
-
Add error codes to some corruption log messages
- fd6ec93bf890 13.0 cited
Attachments
- v24-0001-Simplify-vacuum-visibility-assertion.patch (text/x-patch) patch v24-0001
- v24-0002-Add-comment-about-PD_ALL_VISIBLE-and-VM-sync.patch (text/x-patch) patch v24-0002
- v24-0003-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch (text/x-patch) patch v24-0003
- v24-0004-Refactor-lazy_scan_prune-VM-set-logic-into-helpe.patch (text/x-patch) patch v24-0004
- v24-0005-Set-the-VM-in-heap_page_prune_and_freeze.patch (text/x-patch) patch v24-0005
- v24-0006-Move-VM-assert-into-prune-freeze-code.patch (text/x-patch) patch v24-0006
- v24-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch (text/x-patch) patch v24-0007
- v24-0008-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch (text/x-patch) patch v24-0008
- v24-0009-Remove-XLOG_HEAP2_VISIBLE-entirely.patch (text/x-patch) patch v24-0009
- v24-0010-Rename-GlobalVisTestIsRemovableXid-to-GlobalVisX.patch (text/x-patch) patch v24-0010
- v24-0011-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch (text/x-patch) patch v24-0011
- v24-0012-Unset-all_visible-sooner-if-not-freezing.patch (text/x-patch) patch v24-0012
- v24-0013-Track-which-relations-are-modified-by-a-query.patch (text/x-patch) patch v24-0013
- v24-0014-Pass-down-information-on-table-modification-to-s.patch (text/x-patch) patch v24-0014
- v24-0015-Allow-on-access-pruning-to-set-pages-all-visible.patch (text/x-patch) patch v24-0015
- v24-0016-Set-pd_prune_xid-on-insert.patch (text/x-patch) patch v24-0016
On Thu, Dec 4, 2025 at 12:11 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> I resisted this patch again today. I reviewed 0001-0004, and got a few more comments:
Thanks for the review! v24 attached with updates you suggested as well
as the bug fix described below.
I realized my code didn't mark the heap buffer dirty if we were not
modifying it (i.e. only setting the VM). This trips an assert in
XLogRegisterBuffer() which requires that all buffers registered with
the WAL machinery are marked dirty unless REGBUF_NO_CHANGE is passed.
It wasn't possible to hit it in master because we unconditionally
dirtied the buffer if we found the VM not set in
find_next_unskippable_block() -- even if we made no changes to the
heap buffer. But my refactoring only dirtied the heap buffer if we
modified it (either pruning/freezing or setting PD_ALL_VISIBLE).
In attached v24, I once again always dirty the heap buffer before
registering it. We can't skip adding the heap buffer to the WAL chain
even if we didn't modify it, because we use it to update the freespace
map during recovery. We could pass REGBUF_NO_CHANGE when the heap
buffer is completely unmodified. But the delicate special case logic
doesn't seem worth the effort to maintain, as the only time the heap
buffer should be unmodified is when the VM has been truncated or
removed. I added a test to the commit doing this refactoring that
would have caught my mistake (0003).
I also split the refactoring of the VM setting logic into more commits
to help make it clearer (0003-0004). We could technically commit the
refactoring commits to master. I had not originally intended to do so
since they do not have independent value beyond clarity for the
reviewer.
In this set 0001 and 0002 are independent. 0003-0007 are all small
steps toward the single change in 0007 which combines the VM updates
into the same WAL record as pruning and freezing. 0008 and 0009 are
removing the rest of XLOG_HEAP2_VISIBLE. 0010 - 0012 are refactoring
needed to set the VM during on-access pruning. 0013 - 0015 are small
steps toward setting the VM on-access. And 0016 sets the prune xid on
insert so we may set the VM on-access for pages that have only new
data.
> +static bool
> +heap_page_will_set_vis(Relation relation,
>
> Actually, I wanted to comment on the new function name in last round of review, but I guess I missed that.
>
> I was very confused what “set_vis” means, and finally figured out “vis” should stand for “visibility”. Here “vis” actually means “visibility map bits”. There is the other “vis” in the last parameter’s name “do_set_pd_vis” where the “vis” should be mean “PD_ALL_VISIBLE” bit. So the two “vis” feels making things confusing.
>
> How about rename the function to “heap_page_will_set_vm_bits”, and rename the last parameter to “do_set_all_visible”?
I named it that way because it was responsible for telling us what we
should set the VM to _and_ if we should set PD_ALL_VISIBLE. However,
once I corrected the bug mentioned above, we always set PD_ALL_VISIBLE
if setting the VM, so I was able to remove this ambiguity. As such
I've renamed the function heap_page_will_set_vm() (and removed the
last parameter).
> + * Decide whether to set the visibility map bits for heap_blk, using
> + * information from PruneFreezeResult and all_visible_according_to_vm. This
> + * function does not actually set the VM bit or page-level hint,
> + * PD_ALL_VISIBLE.
> + *
> + * If it finds that the page-level visibility hint or VM is corrupted, it will
> + * fix them by clearing the VM bit and page hint. This does not need to be
> + * done in a critical section.
> + *
> + * Returns true if one or both VM bits should be set, along with the desired
> + * flags in *new_vmbits. Also indicates via do_set_pd_vis whether
> + * PD_ALL_VISIBLE should be set on the heap page.
> + */
> ```
>
> This function comment mentions PD_ALL_VISIBLE twice, but never mentions ALL_FROZEN. So “Returns true if one or both VM bits should be set” fells unclear. How about rephrase like "Returns true if the all-visible and/or all-frozen VM bits should be set.”
PD_ALL_VISIBLE is the page-level visibility hint (not the VM bit) and
there is no page level frozen hint. It doesn't mention that the VM
bits are all-visible and all-frozen, though, so I have modified the
comment a bit to make sure the all-frozen bit of the VM is mentioned.
> + * Now handle two potential corruption cases:
> + *
> + * These do not need to happen in a critical section and are not
> + * WAL-logged.
> + *
> + * As of PostgreSQL 9.2, the visibility map bit should never be set if the
> + * page-level bit is clear. However, it's possible that the bit got
> + * cleared after heap_vac_scan_next_block() was called, so we must recheck
> + * with buffer lock before concluding that the VM is corrupt.
> + */
> + else if (all_visible_according_to_vm && !PageIsAllVisible(heap_page) &&
> + visibilitymap_get_status(relation, heap_blk, &vmbuffer) != 0)
> + {
> + ereport(WARNING,
> + (errcode(ERRCODE_DATA_CORRUPTED),
> + errmsg("page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
> + RelationGetRelationName(relation), heap_blk)));
> +
> + visibilitymap_clear(relation, heap_blk, vmbuffer,
> + VISIBILITYMAP_VALID_BITS);
> + }
> ```
>
> Here in the comment and error message, I guess “visibility map bit” refers to “all visible bit”, can we be explicit?
This is an existing comment in lazy_scan_prune() that I simply moved.
It isn't valid for the all-frozen bit to be set unless the all-visible
bit is set. I'm not sure whether specifying which bits were set in the
warning will help users debug the corruption they are seeing. But I
think it is a reasonable suggestion to make. Perhaps it is worth
suggesting this (adding the specific vmbits to the warning message) in
a separate thread since it is an independent improvement on master?
- Melanie