Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Melanie Plageman <melanieplageman@gmail.com>

From: Melanie Plageman <melanieplageman@gmail.com>
To: Chao Li <li.evan.chao@gmail.com>
Cc: Andres Freund <andres@anarazel.de>, Kirill Reshke <reshkekirill@gmail.com>, Robert Haas <robertmhaas@gmail.com>, Andrey Borodin <x4mmm@yandex-team.ru>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Heikki Linnakangas <hlinnaka@iki.fi>
Date: 2025-12-15T21:29:03Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin

  2. Simplify visibility check in heap_page_would_be_all_visible()

  3. Eliminate use of cached VM value in lazy_scan_prune()

  4. Combine visibilitymap_set() cases in lazy_scan_prune()

  5. Fix const qualification in prune_freeze_setup()

  6. Simplify vacuum visibility assertion

  7. Split heap_page_prune_and_freeze() into helpers

  8. Assert that cutoffs are provided if freezing will be attempted

  9. Split PruneFreezeParams initializers to one field per line

  10. Refactor heap_page_prune_and_freeze() parameters into a struct

  11. Make heap_page_is_all_visible independent of LVRelState

  12. Inline TransactionIdFollows/Precedes[OrEquals]()

  13. Add helper for freeze determination to heap_page_prune_and_freeze

  14. Bump XLOG_PAGE_MAGIC after xl_heap_prune change

  15. Correct prune WAL record opcode name in comment

  16. Add error codes when vacuum discovers VM corruption

  17. Remove unused xl_heap_prune member, reason

  18. Remove unneeded VM pin from VM replay

  19. Add assert and log message to visibilitymap_set

  20. Add error codes to some corruption log messages

Attachments

Hi,

Attached v26 includes a new patch, 0002, which gets rid of
all_visible_according_to_vm in lazy_scan_prune(). We've kept this
cached copy of the all-visible bit since the VM was added way back in
608195a3a365. Back then, the VM wasn't pinned unless
all_visible_according_to_vm was false. Now that we unconditionally
have the VM page pinned, there isn't much performance benefit to using
that cached value. I did some testing of the worst possible case and
saw no difference in timing. By removing that, we simplify heap vacuum
code now.  And we improve clarity once the VM update is combined into
the prune/freeze WAL record and when the VM is set on-access.

I think 0001 and 0002 (and maybe 0003) are worthwhile clarity
improvements on their own.

On Wed, Dec 10, 2025 at 11:07 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> A few more small comments. Sorry for keeping come out new comments. Actually I learned a lot about vacuum from reviewing this patch.

Thanks for the continued review. Your feedback is improving the patchset.

> The last vacuum is expected to set vm bits, but the test doesn’t verify that. Should we verify that like:
> ```
> evantest=# SELECT blkno, all_visible, all_frozen FROM pg_visibility_map('test_vac_unmodified_heap');
>  blkno | all_visible | all_frozen
> -------+-------------+------------
>      0 | t           | t
> (1 row)

I've done this. I've actually added three such verifications -- one
after each step where the VM is expected to change. It shouldn't be
very expensive, so I think it is okay. The way the test would fail if
the buffer wasn't correctly dirtied is that it would assert out -- so
the visibility map test wouldn't even have a chance to fail. But, I
think it is also okay to confirm that the expected things are
happening with the VM -- it just gives us extra coverage.

>                 if (presult.all_frozen)
>                 {
> +                       /*
> +                        * We can pass InvalidTransactionId as our cutoff_xid, since a
> +                        * snapshotConflictHorizon sufficient to make everything safe for
> +                        * REDO was logged when the page's tuples were frozen.
> +                        */
>                         Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
> -                       flags |= VISIBILITYMAP_ALL_FROZEN;
> +                       new_vmbits |= VISIBILITYMAP_ALL_FROZEN;
>                 }
>
> The comment here is a little confusing. In the old code, the Assert() as immediately above the call visibilitymap_set(), and cutoff_xid is a parameter to the call. But the new code moves the Assert() as well as the comment far away from the call visibilitymap_set(), so I think the comment should stay together with the call of visibilitymap_set().

Good point. I've moved it closer to visibilitymap_set() and modified
and moved the assert so that it is together with the comment. I think
the comment makes little sense without the assertion.

>  * If it finds that the page-level visibility hint or VM is corrupted, it will
> * fix them by clearing the VM bits and visibility page hint. This does not
>
> In the second line, “visibility page hint” is understandable but feels not quite good. I know it’s actually “page-level visibility hint”, so how about just “visibility hint”.

I've changed this.

>         /*
> -        * As of PostgreSQL 9.2, the visibility map bit should never be set if the
> -        * page-level bit is clear.  However, it's possible that the bit got
> -        * cleared after heap_vac_scan_next_block() was called, so we must recheck
> -        * with buffer lock before concluding that the VM is corrupt.
> +        * For the purposes of logging, count whether or not the page was newly
> +        * set all-visible and, potentially, all-frozen.
>          */
> -       else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
> -                        visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
> +       if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
> +               (new_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
>         {
> ```
>
> Without do_set_vm==true, old_vmbits will only be 0, thus this “if-elseif” that uses old_vmbits should be moved into “if (do_set_vm)”. From this perspective, if not do_set_vm, this function can return early, like:

Good point. I've actually gone ahead in 0002 and refactored this whole
section a bit (I got rid of all_visible_according_to_vm). 0002 is a
new patch in this attached v26, and it needs review. I think this
refactoring makes the code quite a bit clearer -- especially once we
start setting the VM on-access. It does, amongst other things, return
early if all_visible is false, like you suggested.

> + * Returns true if one or both VM bits should be set, along with returning the
> + * desired what bits should be set in the VM in *new_vmbits.
> ```
>
> Looks like a typo: “returning the desired what bits should be set”, maybe change to “returning the desired bits to be set”.

Fixed.

- Melanie