Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Melanie Plageman <melanieplageman@gmail.com>

From: Melanie Plageman <melanieplageman@gmail.com>

To: Kirill Reshke <reshkekirill@gmail.com>

Cc: Andres Freund <andres@anarazel.de>, Robert Haas <robertmhaas@gmail.com>, Andrey Borodin <x4mmm@yandex-team.ru>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>, Heikki Linnakangas <hlinnaka@iki.fi>, Chao Li <li.evan.chao@gmail.com>

Date: 2025-12-18T00:30:01Z

Lists: pgsql-hackers

Commits

Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin
- 284925508ae6 19 (unreleased) landed
Simplify visibility check in heap_page_would_be_all_visible()
- 3efe58febc3c 19 (unreleased) landed
Eliminate use of cached VM value in lazy_scan_prune()
- 648a7e28d7c2 19 (unreleased) landed
Combine visibilitymap_set() cases in lazy_scan_prune()
- 21796c267d0a 19 (unreleased) landed
Fix const qualification in prune_freeze_setup()
- 4877391ce894 19 (unreleased) landed
Simplify vacuum visibility assertion
- bd298f54a0d6 19 (unreleased) landed
Split heap_page_prune_and_freeze() into helpers
- e135e044572e 19 (unreleased) landed
Assert that cutoffs are provided if freezing will be attempted
- cd38b7e77315 19 (unreleased) landed
Split PruneFreezeParams initializers to one field per line
- 1e14edcea5e1 19 (unreleased) landed
Refactor heap_page_prune_and_freeze() parameters into a struct
- 1937ed70621e 19 (unreleased) landed
Make heap_page_is_all_visible independent of LVRelState
- 3e4705484e0c 19 (unreleased) landed
Inline TransactionIdFollows/Precedes[OrEquals]()
- 43b05b38ea4d 19 (unreleased) landed
Add helper for freeze determination to heap_page_prune_and_freeze
- c8dd6542bae4 19 (unreleased) landed
Bump XLOG_PAGE_MAGIC after xl_heap_prune change
- 4a8fb58671d3 19 (unreleased) landed
Correct prune WAL record opcode name in comment
- ae8ea7278c16 19 (unreleased) landed
Add error codes when vacuum discovers VM corruption
- 8ec97e78a771 19 (unreleased) landed
Remove unused xl_heap_prune member, reason
- 4b5f206de2bb 19 (unreleased) landed
Remove unneeded VM pin from VM replay
- 3399c265543e 19 (unreleased) landed
Add assert and log message to visibilitymap_set
- e3d5ddb7ca91 19 (unreleased) landed
Add error codes to some corruption log messages
- fd6ec93bf890 13.0 cited

Attachments

v28-0001-Combine-visibilitymap_set-cases-in-lazy_scan_pru.patch (text/x-patch) patch v28-0001
v28-0002-Eliminate-use-of-cached-VM-value-in-lazy_scan_pr.patch (text/x-patch) patch v28-0002
v28-0003-Refactor-lazy_scan_prune-VM-clear-logic-into-hel.patch (text/x-patch) patch v28-0003
v28-0004-Set-the-VM-in-heap_page_prune_and_freeze.patch (text/x-patch) patch v28-0004
v28-0005-Move-VM-assert-into-prune-freeze-code.patch (text/x-patch) patch v28-0005
v28-0006-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch (text/x-patch) patch v28-0006
v28-0007-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch (text/x-patch) patch v28-0007
v28-0008-Remove-XLOG_HEAP2_VISIBLE-entirely.patch (text/x-patch) patch v28-0008
v28-0009-Simplify-heap_page_would_be_all_visible-visibili.patch (text/x-patch) patch v28-0009
v28-0010-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch (text/x-patch) patch v28-0010
v28-0011-Unset-all_visible-sooner-if-not-freezing.patch (text/x-patch) patch v28-0011
v28-0012-Track-which-relations-are-modified-by-a-query.patch (text/x-patch) patch v28-0012
v28-0013-Pass-down-information-on-table-modification-to-s.patch (text/x-patch) patch v28-0013
v28-0014-Allow-on-access-pruning-to-set-pages-all-visible.patch (text/x-patch) patch v28-0014
v28-0015-Set-pd_prune_xid-on-insert.patch (text/x-patch) patch v28-0015

Thanks for the review!

In addition to addressing your feedback, attached v28 includes a
number of small fixes to comments, commit messages, and other things.
Notably, I've added one new refactoring patch 0009, which reduces the
diff of 0010 -- using the GlobalVisState instead of OldestXmin for
page visibility -- even further.

On Wed, Dec 17, 2025 at 1:27 PM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> > I've done this. I've actually added three such verifications -- one
> > after each step where the VM is expected to change. It shouldn't be
> > very expensive, so I think it is okay. The way the test would fail if
> > the buffer wasn't correctly dirtied is that it would assert out -- so
> > the visibility map test wouldn't even have a chance to fail. But, I
> > think it is also okay to confirm that the expected things are
> > happening with the VM -- it just gives us extra coverage.
>
> +1 on extra coverage. Should we also do sql-level check that the VM
> indeed does not need to set PD_ALL_VISIBLE (check header bytes using
> pageinspect?).

That's an interesting idea. I checked and, AFAICT, there are no tests
currently directly comparing the flags column returned by the
pageinspect page_header() function to one of the flag values. I've
added the following to attached v28.

SELECT (flags & x'0004'::int) <> 0
        FROM page_header(get_raw_page('test_vac_unmodified_heap', 0));

But I'm not sure if it is weird/confusing to be comparing the flag
directly to the number 4 like this. I don't really want to bother with
adding another function to pageinspect returning the status of
PD_ALL_VISIBLE (like page_visible() or something).
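For readers wondering where the literal 4 comes from, here is a minimal C sketch of the same bit test. The three flag values mirror the pd_flags definitions in src/include/storage/bufpage.h; the helper name flags_all_visible() is hypothetical and is not part of pageinspect or the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the pd_flags bits in src/include/storage/bufpage.h. */
#define PD_HAS_FREE_LINES 0x0001
#define PD_PAGE_FULL      0x0002
#define PD_ALL_VISIBLE    0x0004

/* The SQL check masks with x'0004', which is this same bit. */
static_assert(PD_ALL_VISIBLE == 4, "PD_ALL_VISIBLE must match x'0004'");

/*
 * Hypothetical helper (illustration only): returns true when the
 * all-visible hint bit is set in a page header's flags field.
 */
static bool
flags_all_visible(uint16_t pd_flags)
{
    return (pd_flags & PD_ALL_VISIBLE) != 0;
}
```

Calling flags_all_visible() on the flags value returned by page_header() would give the same boolean as the SQL expression above, so the test is checking the PD_ALL_VISIBLE bit and nothing else.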

> v27-0003 & v27-0004: I did not get the exact reason we introduced
> `identify_and_fix_vm_corruption` in 0003 and moved code in 0004 to
> another place. I can see we have this starting v25 of patch set. Well,
> maybe this is not an issue at all...

It's mostly for ease of review. This is a pretty sensitive area of
code, so I thought it would be easier for the reviewer to confirm
correctness if I split it up. Andres had mentioned that the commit was
hard to review because so many different things were happening.

In v27, 0003 moves the VM clear code into a helper. 0004 and 0005
move all the VM setting/clearing code to
heap_page_prune_and_freeze(). And 0006 actually sets the VM in the
same critical section as pruning/freezing and emits a single WAL
record.

I'm not really sure which commits should stay independent in the final
version I push to master.

> in v27-0005. This patch changes code which is not exercised in
> tests[0]. I spent some time understanding the conditions when we
> entered this. There is a comment about non-finished relation
> extension, but I got no success trying to reproduce this. I ended up
> modifying code to lose PageSetAllVisible in proper places and running
> vacuum. Looks like everything works as expected. I will spend some
> more time on this, maybe I will be successful in writing an
> injection-point-based TAP test which hits this...

Based on the coverage report link you provided, that code is changed
by v27 0007, not 0005. 0005 is about moving an assertion out of
lazy_scan_prune(). 0007 changes lazy_scan_new_or_empty() (the code in
question).

Regarding 0007, it looks like what is uncovered (the orange bits in
the coverage report are uncovered, I assume) is empty pages _without_
PD_ALL_VISIBLE set. I don't see PageSetAllVisible() called anywhere
except in vacuum and COPY FREEZE.

If I were trying to guess how empty pages with PD_ALL_VISIBLE set are
getting vacuumed, I would think it is due to SKIP_PAGES_THRESHOLD
causing us to vacuum an all-frozen empty page.
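
The SKIP_PAGES_THRESHOLD effect can be sketched with a toy model. This is a simplified illustration, not the vacuumlazy.c code: mark_scanned_blocks() is a hypothetical function, the threshold value 32 is assumed to match the current definition in vacuumlazy.c, and the model ignores aggressive vacuums and the rule that the last block is always scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match SKIP_PAGES_THRESHOLD in vacuumlazy.c. */
#define SKIP_PAGES_THRESHOLD 32

/*
 * Hypothetical toy model: given per-block all-visible bits, record which
 * blocks a vacuum would scan.  A run of all-visible blocks is skipped only
 * when it is at least SKIP_PAGES_THRESHOLD blocks long; shorter runs are
 * scanned even though the VM says they are all-visible.
 */
static void
mark_scanned_blocks(const bool *all_visible, size_t nblocks, bool *scanned)
{
    size_t blk = 0;

    assert(all_visible != NULL && scanned != NULL);

    while (blk < nblocks)
    {
        size_t run_end = blk;
        bool   skip_run;

        /* Find the end of the all-visible run starting at blk. */
        while (run_end < nblocks && all_visible[run_end])
            run_end++;

        skip_run = (run_end - blk) >= SKIP_PAGES_THRESHOLD;
        for (size_t i = blk; i < run_end; i++)
            scanned[i] = !skip_run;

        /* The block that ended the run is not all-visible: scan it. */
        if (run_end < nblocks)
            scanned[run_end] = true;

        blk = run_end + 1;
    }
}
```

In this model, an empty page that is already all-visible or all-frozen still gets scanned whenever it sits in a run shorter than the threshold, which is the path guessed at above.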

Then the question is, why wouldn't we have coverage of the empty page
first being set all-visible/all-frozen? It can't be COPY FREEZE
because the page is empty. And it can't be vacuum, because then we
would have coverage. It's very mysterious.

It would be good to have coverage for this case. I don't think you'll
need an injection point for the main case of "empty page not yet set
all-visible is vacuumed for the first time" (unless I'm
misunderstanding something).

I'm not sure how you'll test the "vacuuming an empty, previously
uninitialized page" case described in this comment, though.

             * It's possible that another backend has extended the heap,
             * initialized the page, and then failed to WAL-log the page due
             * to an ERROR.  Since heap extension is not WAL-logged, recovery
             * might try to replay our record setting the page all-visible and
             * find that the page isn't initialized, which will cause a PANIC.
             * To prevent that, check whether the page has been previously
             * WAL-logged, and if not, do that now.

You'd want to force an error during relation extension and then vacuum
the page. I don't know if you need an injection point to force the
error -- depends on what kind of error, I think.

So that I know for attribution, did you review 0003-0005?

- Melanie