Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Melanie Plageman <melanieplageman@gmail.com>

From: Melanie Plageman <melanieplageman@gmail.com>

To: Chao Li <li.evan.chao@gmail.com>

Cc: Andres Freund <andres@anarazel.de>, Kirill Reshke <reshkekirill@gmail.com>, Robert Haas <robertmhaas@gmail.com>, Andrey Borodin <x4mmm@yandex-team.ru>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Heikki Linnakangas <hlinnaka@iki.fi>

Date: 2025-12-15T21:29:03Z

Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →

Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin
- 284925508ae6 19 (unreleased) landed
Simplify visibility check in heap_page_would_be_all_visible()
- 3efe58febc3c 19 (unreleased) landed
Eliminate use of cached VM value in lazy_scan_prune()
- 648a7e28d7c2 19 (unreleased) landed
Combine visibilitymap_set() cases in lazy_scan_prune()
- 21796c267d0a 19 (unreleased) landed
Fix const qualification in prune_freeze_setup()
- 4877391ce894 19 (unreleased) landed
Simplify vacuum visibility assertion
- bd298f54a0d6 19 (unreleased) landed
Split heap_page_prune_and_freeze() into helpers
- e135e044572e 19 (unreleased) landed
Assert that cutoffs are provided if freezing will be attempted
- cd38b7e77315 19 (unreleased) landed
Split PruneFreezeParams initializers to one field per line
- 1e14edcea5e1 19 (unreleased) landed
Refactor heap_page_prune_and_freeze() parameters into a struct
- 1937ed70621e 19 (unreleased) landed
Make heap_page_is_all_visible independent of LVRelState
- 3e4705484e0c 19 (unreleased) landed
Inline TransactionIdFollows/Precedes[OrEquals]()
- 43b05b38ea4d 19 (unreleased) landed
Add helper for freeze determination to heap_page_prune_and_freeze
- c8dd6542bae4 19 (unreleased) landed
Bump XLOG_PAGE_MAGIC after xl_heap_prune change
- 4a8fb58671d3 19 (unreleased) landed
Correct prune WAL record opcode name in comment
- ae8ea7278c16 19 (unreleased) landed
Add error codes when vacuum discovers VM corruption
- 8ec97e78a771 19 (unreleased) landed
Remove unused xl_heap_prune member, reason
- 4b5f206de2bb 19 (unreleased) landed
Remove unneeded VM pin from VM replay
- 3399c265543e 19 (unreleased) landed
Add assert and log message to visibilitymap_set
- e3d5ddb7ca91 19 (unreleased) landed
Add error codes to some corruption log messages
- fd6ec93bf890 13.0 cited

Attachments

v26-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch (text/x-patch) patch v26-0001
v26-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch (text/x-patch) patch v26-0002
v26-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch (text/x-patch) patch v26-0003
v26-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch (text/x-patch) patch v26-0004
v26-0005-Move-VM-assert-into-prune-freeze-code.patch (text/x-patch) patch v26-0005
v26-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch (text/x-patch) patch v26-0006
v26-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch (text/x-patch) patch v26-0007
v26-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch (text/x-patch) patch v26-0008
v26-0009-Rename-GlobalVisTestIsRemovableXid-to-GlobalVisX.patch (text/x-patch) patch v26-0009
v26-0010-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch (text/x-patch) patch v26-0010
v26-0011-Unset-all_visible-sooner-if-not-freezing.patch (text/x-patch) patch v26-0011
v26-0012-Track-which-relations-are-modified-by-a-query.patch (text/x-patch) patch v26-0012
v26-0013-Pass-down-information-on-table-modification-to-s.patch (text/x-patch) patch v26-0013
v26-0014-Allow-on-access-pruning-to-set-pages-all-visible.patch (text/x-patch) patch v26-0014
v26-0015-Set-pd_prune_xid-on-insert.patch (text/x-patch) patch v26-0015

Hi,

Attached v26 includes a new patch, 0002, which gets rid of
all_visible_according_to_vm in lazy_scan_prune(). We've kept this
cached copy of the all-visible bit since the VM was added way back in
608195a3a365. Back then, the VM wasn't pinned unless
all_visible_according_to_vm was false. Now that we unconditionally
have the VM page pinned, there isn't much performance benefit to using
that cached value. I did some testing of the worst possible case and
saw no difference in timing. By removing that, we simplify heap vacuum
code now.  And we improve clarity once the VM update is combined into
the prune/freeze WAL record and when the VM is set on-access.

I think 0001 and 0002 (and maybe 0003) are worthwhile clarity
improvements on their own.

On Wed, Dec 10, 2025 at 11:07 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> A few more small comments. Sorry for keeping come out new comments. Actually I learned a lot about vacuum from reviewing this patch.

Thanks for the continued review. Your feedback is improving the patchset.

> The last vacuum is expected to set vm bits, but the test doesn’t verify that. Should we verify that like:
> ```
> evantest=# SELECT blkno, all_visible, all_frozen FROM pg_visibility_map('test_vac_unmodified_heap');
>  blkno | all_visible | all_frozen
> -------+-------------+------------
>      0 | t           | t
> (1 row)

I've done this. I've actually added three such verifications -- one
after each step where the VM is expected to change. It shouldn't be
very expensive, so I think it is okay. The way the test would fail if
the buffer wasn't correctly dirtied is that it would assert out -- so
the visibility map test wouldn't even have a chance to fail. But, I
think it is also okay to confirm that the expected things are
happening with the VM -- it just gives us extra coverage.

>                 if (presult.all_frozen)
>                 {
> +                       /*
> +                        * We can pass InvalidTransactionId as our cutoff_xid, since a
> +                        * snapshotConflictHorizon sufficient to make everything safe for
> +                        * REDO was logged when the page's tuples were frozen.
> +                        */
>                         Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
> -                       flags |= VISIBILITYMAP_ALL_FROZEN;
> +                       new_vmbits |= VISIBILITYMAP_ALL_FROZEN;
>                 }
>
> The comment here is a little confusing. In the old code, the Assert() as immediately above the call visibilitymap_set(), and cutoff_xid is a parameter to the call. But the new code moves the Assert() as well as the comment far away from the call visibilitymap_set(), so I think the comment should stay together with the call of visibilitymap_set().

Good point. I've moved it closer to visibilitymap_set() and modified
and moved the assert so that it is together with the comment. I think
the comment makes little sense without the assertion.

>  * If it finds that the page-level visibility hint or VM is corrupted, it will
> * fix them by clearing the VM bits and visibility page hint. This does not
>
> In the second line, “visibility page hint” is understandable but feels not quite good. I know it’s actually “page-level visibility hint”, so how about just “visibility hint”.

I've changed this.

>         /*
> -        * As of PostgreSQL 9.2, the visibility map bit should never be set if the
> -        * page-level bit is clear.  However, it's possible that the bit got
> -        * cleared after heap_vac_scan_next_block() was called, so we must recheck
> -        * with buffer lock before concluding that the VM is corrupt.
> +        * For the purposes of logging, count whether or not the page was newly
> +        * set all-visible and, potentially, all-frozen.
>          */
> -       else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
> -                        visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
> +       if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
> +               (new_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
>         {
> ```
>
> Without do_set_vm==true, old_vmbits will only be 0, thus this “if-elseif” that uses old_vmbits should be moved into “if (do_set_vm)”. From this perspective, if not do_set_vm, this function can return early, like:

Good point. I've actually gone ahead in 0002 and refactored this whole
section a bit (I got rid of all_visible_according_to_vm). 0002 is a
new patch in this attached v26, and it needs review. I think this
refactoring makes the code quite a bit clearer -- especially once we
start setting the VM on-access. It does, amongst other things, return
early if all_visible is false, like you suggested.

> + * Returns true if one or both VM bits should be set, along with returning the
> + * desired what bits should be set in the VM in *new_vmbits.
> ```
>
> Looks like a typo: “returning the desired what bits should be set”, maybe change to “returning the desired bits to be set”.

Fixed.

- Melanie