Thread

  1. Re: A few patches to clarify snapshot management, part 2

    Heikki Linnakangas <hlinnaka@iki.fi> — 2025-12-19T11:51:09Z

    On 19/12/2025 07:15, Chao Li wrote:
    >> On Dec 19, 2025, at 07:30, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
    >>
    >> Patch 0002: Split SnapshotData into separate structs for each kind of snapshot
    >> >> This implements the long-standing TODO and splits SnapshotData up
    >> into multiple structs. This makes it more clear which fields are
    >> used with which kind of snapshot. For example, we now have
    >> properly named fields for the XID arrays in historic snapshots.
    >> Previously, they abused the 'xip' and 'subxip' arrays to mean
    >> something different than what they mean in regular MVCC snapshots.
    >> 
    >> This introduces some new casts between Snapshots and the new
    >> MVCCSnapshots. I struggled to decide which functions should use
    >> the new MVCCSnapshot type and which should continue to use
    >> Snapshot. It's a balancing act between code churn and where we
    >> want to have casts. For example, PushActiveSnapshot() takes a
    >> Snapshot argument, but assumes that it's an MVCC snapshot, so it
    >> really should take MVCCSnapshot. But most of its callers have a
    >> Snapshot rather than MVCCSnapshot. Another example: the executor
    >> assumes that it's dealing with MVCC snapshots, so it would be
    >> appropriate to use MVCCSnapshot for EState->es_snapshot. But it'd
    >> cause a lot code churn and casts elsewhere. I think the patch is a
    >> reasonable middle ground.
    > > The core of 0002 is the new union Snapshot:
    > ```
    > +/*
    > + * Generic union representing all kind of possible snapshots.  Some have
    > + * type-specific structs.
    > + */
    > +typedef union SnapshotData
    > +{
    > +	SnapshotType snapshot_type; /* type of snapshot */
    > +
    > +	MVCCSnapshotData mvcc;
    > +	DirtySnapshotData dirty;
    > +	HistoricMVCCSnapshotData historic_mvcc;
    > +	NonVacuumableSnapshotData nonvacuumable;
    >   } SnapshotData;
    > ```
    > > And my big concern is here. This union definition looks unusual,
    > snapshot_type shares the same space with real snapshot bodies, so
    > that once snapshot is assigned to the union, that type info is lost,
    > there would be no way to decide what exact snapshot is stored in
    > SnapshotData.
    
    Each of the structs, MVCCSnapshotData, DirtySnapshotData, etc., contain 
    'snapshot_type' as the first field, so it's always available.
    
    > I think SnapshotData should be a structure and utilize anonymous union technique:
    > ```
    > typedef struct SnapshotData
    > {
    >      SnapshotType snapshot_type; /* type of snapshot */
    > 
    >      union {
    >         MVCCSnapshotData mvcc;
    >         DirtySnapshotData dirty;
    >         HistoricMVCCSnapshotData historic_mvcc;
    >         NonVacuumableSnapshotData nonvacuumable;
    >      }
    > ```
    
    Hmm, yeah, maybe that would be more clear. But then you cannot cast an 
    "MVCCSnapshotData *" to "SnapshotData *", or vice versa.
    
    > If you agree with this, then 0002 will need some rework. A set of helper micros/functions would need to be added, e.g. InitDirtySnapshot(), DirtySnapshotGetSnapshot()...
    
    Right, I feel those helper macros / functions would be excessive.
    
    - Heikki