Thread

  1. Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

    Hannu Krosing <hannuk@google.com> — 2025-12-04T21:03:33Z

    I just sent a small patch to pgsql-hackers@ that exposes the old and
    new tuple ids to logical decoding, plus a boolean telling whether an
    UPDATE is HOT.

    Feel free to test whether this helps here as well.
    
    On Thu, Dec 4, 2025 at 8:15 PM Antonin Houska <ah@cybertec.at> wrote:
    >
    > Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
    >
    > > On Thu, 4 Dec 2025 at 09:34, Antonin Houska <ah@cybertec.at> wrote:
    > > >
    > > > ISTM that what you consider a problem is copying the table using PGPROC-based
    > > > snapshot and applying logically decoded commits to the result - is that what
    > > > you mean?
    > >
    > > Correct.
    > >
    > > > In fact, LR (and also REPACK) uses snapshots generated by the logical decoding
    > > > system. The information on running/committed transactions is based here on
    > > > replaying WAL, not on PGPROC.
    > >
    > > OK, that's good to know. For reference, do you know where this is
    > > documented, explained, or implemented?
    >
    > All my knowledge of these things is from source code.
    >
    > > I'm asking, because the code that I could find didn't seem to use any
    > > special snapshot (tablesync.c uses
    > > `PushActiveSnapshot(GetTransactionSnapshot())`),
    >
    > My understanding is that this is what happens on the subscription side. Some
    > lines above that however, walrcv_create_slot(..., CRS_USE_SNAPSHOT, ...) is
    > called which in turn calls CreateReplicationSlot(..., CRS_USE_SNAPSHOT, ...)
    > on the publication side and it sets that snapshot for the transaction on the
    > remote (publication) side:
    >
    >         else if (snapshot_action == CRS_USE_SNAPSHOT)
    >         {
    >                 Snapshot        snap;
    >
    >                 snap = SnapBuildInitialSnapshot(ctx->snapshot_builder);
    >                 RestoreTransactionSnapshot(snap, MyProc);
    >         }
    >
    > > and the other
    > > reference to LR's snapshots (snapbuild.c, and inside
    > > `GetTransactionSnapshot()`) explicitly said that its snapshots are
    > > only to be used for catalog lookups, never for general-purpose
    > > queries.
    >
    > I think the reason is that snapbuild.c only maintains snapshots for catalog
    > scans, because in logical decoding you only need to scan catalog tables. In
    > particular, you need to find out which tuple descriptor was valid when a
    > particular data change (INSERT / UPDATE / DELETE) was WAL-logged: the output
    > plugin needs the correct version of the tuple descriptor to deform each tuple.
    > However, there is no need to scan non-catalog tables: as long as
    > wal_level=logical, the WAL records contain all the information needed for
    > logical replication (including key values). So snapbuild.c only keeps track of
    > transactions that modify the system catalog and uses this information to
    > create the snapshots.
    >
    > A special case is if you pass need_full_snapshot=true to
    > CreateInitDecodingContext(). In this case the snapshot builder tracks commits
    > of all transactions, but only does so until SNAPBUILD_CONSISTENT state is
    > reached. Thus, just before the actual decoding starts, you can get a snapshot
    > to scan even non-catalog tables (SnapBuildInitialSnapshot() creates that, like
    > in the code above). (For REPACK, I'm trying to teach snapbuild.c to recognize
    > that a transaction changed one particular non-catalog table, so it can build
    > snapshots to scan this one table anytime.)
    >
    > Another reason not to use those snapshots for non-catalog tables is that
    > snapbuild.c creates snapshots of the kind SNAPSHOT_HISTORIC_MVCC. If you used
    > this for non-catalog tables, HeapTupleSatisfiesHistoricMVCC() would be used
    > for visibility checks instead of HeapTupleSatisfiesMVCC(). The latter can
    > handle tuples surviving from older versions of PostgreSQL, but the former
    > cannot:
    >
    >         /* Used by pre-9.0 binary upgrades */
    >         if (tuple->t_infomask & HEAP_MOVED_OFF)
    >
    > No such tuples should appear in the catalog because initdb always creates it
    > from scratch.
    >
    > For LR, SnapBuildInitialSnapshot() takes care of the conversion from
    > SNAPSHOT_HISTORIC_MVCC to SNAPSHOT_MVCC.
    >
    > --
    > Antonin Houska
    > Web: https://www.cybertec-postgresql.com
    >
    >
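
For readers following the last point of the quoted message, here is a minimal toy model of the conversion from a historic snapshot to a regular MVCC snapshot. It is a sketch, not the PostgreSQL code: every name (`ToyXid`, `ToySnapshot`, `toy_xid_visible`, `toy_convert_to_mvcc`) is hypothetical, xid wraparound is ignored, and every transaction below `xmin` is assumed to have committed. The one idea it relies on is that a historic snapshot lists the committed xids between `xmin` and `xmax`, while a regular MVCC snapshot lists the in-progress ones, so the conversion marks every xid in that range that is not known to be committed as in progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy transaction id. Assumption: no wraparound, unlike a real TransactionId. */
typedef uint32_t ToyXid;

typedef enum ToySnapshotType { TOY_HISTORIC_MVCC, TOY_MVCC } ToySnapshotType;

/* Hypothetical, heavily simplified stand-in for a snapshot. */
typedef struct ToySnapshot
{
    ToySnapshotType type;
    ToyXid          xmin;   /* xids below this are finished */
    ToyXid          xmax;   /* xids at or above this are invisible */
    const ToyXid   *xip;    /* HISTORIC: committed xids in [xmin, xmax);
                             * MVCC: in-progress xids in [xmin, xmax) */
    size_t          xcnt;   /* number of entries in xip */
} ToySnapshot;

/* Linear search; good enough for a toy example. */
static bool
toy_xid_listed(const ToySnapshot *snap, ToyXid xid)
{
    for (size_t i = 0; i < snap->xcnt; i++)
        if (snap->xip[i] == xid)
            return true;
    return false;
}

/*
 * Would changes made by xid be visible under this snapshot?
 * Assumption: every transaction below xmin committed (no aborts modeled).
 */
static bool
toy_xid_visible(const ToySnapshot *snap, ToyXid xid)
{
    if (xid >= snap->xmax)
        return false;
    if (xid < snap->xmin)
        return true;
    /* Same array, opposite meaning depending on the snapshot type. */
    if (snap->type == TOY_HISTORIC_MVCC)
        return toy_xid_listed(snap, xid);
    return !toy_xid_listed(snap, xid);
}

/*
 * Build a regular snapshot from a historic one: every xid in [xmin, xmax)
 * that is not listed as committed is recorded as in progress.
 * Returns false if newxip (capacity cap) is too small.
 */
static bool
toy_convert_to_mvcc(const ToySnapshot *hist, ToyXid *newxip, size_t cap,
                    ToySnapshot *out)
{
    size_t n = 0;

    assert(hist->type == TOY_HISTORIC_MVCC);

    for (ToyXid xid = hist->xmin; xid < hist->xmax; xid++)
    {
        if (toy_xid_listed(hist, xid))
            continue;           /* known committed: stays visible */
        if (n >= cap)
            return false;       /* too many in-progress xids for the buffer */
        newxip[n++] = xid;
    }

    out->type = TOY_MVCC;
    out->xmin = hist->xmin;
    out->xmax = hist->xmax;
    out->xip = newxip;
    out->xcnt = n;
    return true;
}
```

In this model, a historic snapshot and the snapshot converted from it give the same answer for every xid. The converted one simply uses the representation that the regular visibility check expects.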