A few patches to clarify snapshot management, part 2
Heikki Linnakangas <hlinnaka@iki.fi>
From: Heikki Linnakangas <hlinnaka@iki.fi>
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Date: 2025-12-18T23:30:05Z
Lists: pgsql-hackers
Attachments
- 0001-Use-a-proper-type-for-RestoreTransactionSnapshot-s-P.patch (text/x-patch)
- 0002-Split-SnapshotData-into-separate-structs-for-each-ki.patch (text/x-patch)
- 0003-Simplify-historic-snapshot-refcounting.patch (text/x-patch)
- 0004-Add-an-explicit-valid-flag-to-MVCCSnapshotData.patch (text/x-patch)
- 0005-Replace-static-snapshot-pointers-with-the-valid-flag.patch (text/x-patch)
- 0006-WIP-Make-RestoreSnapshot-register-the-snapshot-with-.patch (text/x-patch)
- 0007-WIP-Replace-the-RegisteredSnapshot-pairing-heap-with.patch (text/x-patch)
- 0008-WIP-Don-t-serialize-the-snapshot-for-parallel-scans.patch (text/x-patch)
- 0009-WIP-Split-MVCCSnapshot-into-inner-and-outer-parts.patch (text/x-patch)
This is a continuation of the earlier thread with the same subject [1], and related to the CSN work [2]. I'm pretty happy patches 0001-0005. They make the snapshot management more clear in many ways: Patch 0001: Use a proper type for RestoreTransactionSnapshot's PGPROC arg Minor cleanup, independent of the rest of the patches Patch 0002: Split SnapshotData into separate structs for each kind of snapshot This implements the long-standing TODO and splits SnapshotData up into multiple structs. This makes it more clear which fields are used with which kind of snapshot. For example, we now have properly named fields for the XID arrays in historic snapshots. Previously, they abused the 'xip' and 'subxip' arrays to mean something different than what they mean in regular MVCC snapshots. This introduces some new casts between Snapshots and the new MVCCSnapshots. I struggled to decide which functions should use the new MVCCSnapshot type and which should continue to use Snapshot. It's a balancing act between code churn and where we want to have casts. For example, PushActiveSnapshot() takes a Snapshot argument, but assumes that it's an MVCC snapshot, so it really should take MVCCSnapshot. But most of its callers have a Snapshot rather than MVCCSnapshot. Another example: the executor assumes that it's dealing with MVCC snapshots, so it would be appropriate to use MVCCSnapshot for EState->es_snapshot. But it'd cause a lot code churn and casts elsewhere. I think the patch is a reasonable middle ground. Patch 0003: Simplify historic snapshot refcounting Little refactoring in logical decoding's snapshot tracking. Patch 0004: Add an explicit 'valid' flag to MVCCSnapshotData Makes it more clear when the "static" snapshots returned by GetTransactionSnapshot(), GetLatestSnapshot() and GetCatalogSnapshot() are valid. Patch 0005: Replace static snapshot pointers with the 'valid' flags Refactor snapmgr.c a little, taking advantage of the new 'valid' flags The rest of the patches are less polished, but please take a look. The idea is to split MVCCSnapshot further into a "shared" part that contains the xmin, xmax and the 'xip' array, and an outer shell that contains the mutable 'curcid' field. This allows reusing the shared part in more cases, avoiding copying, although I'm not sure what would be a practical scenario where it'd make a performance difference. It becomes more important with the CSN patch, however, because it adds a cache to the shared snapshot struct, which can potentially grow very large. [1] https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb%40iki.fi [2] https://www.postgresql.org/message-id/80f254d3-8ee9-4cde-a7e3-ee99998154da%40iki.fi - Heikki