Thread

  1. RE: Newly created replication slot may be invalidated by checkpoint

    Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> — 2025-11-21T06:14:38Z

    On Thursday, November 20, 2025 6:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
    > 
    > On Thu, Nov 20, 2025 at 4:07 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
    > wrote:
    > >
    > > On Thursday, November 20, 2025 4:26 PM Vitaly Davydov
    > <v.davydov@postgrespro.ru> wrote:
    > >
    > > >
    > > > Concerning reserve_wal_for_local_slot. It seems it is used in
    > > > synchronization of failover logical slots. For me, it is tricky to
    > > > change restart_lsn of a synced logical slot to RedoRecPtr, because
    > > > it may lead to problems with logical replication using such slot
    > > > after the replica promotion. But it seems it is the architectural
    > > > problem and it is not related to the problems, solved by the patch.
    > >
    > > I think this is not an issue because if we use the redo pointer
    > > instead of the remote restart_lsn as the initial value, the synced
    > > slot won't be marked as sync-ready, so user cannot use it after
    > > promotion (also well documented). This is also the existing behavior
    > > before the patch, e.g., if the required WALs were removed, the oldest
    > > available WAL was used as the initial value, similarly resulting in the slot not
    > being sync-ready.
    > >
    > 
    > Would it be better to discuss this in a separate thread? Though this is related
    > to original problem but still in a separate part of code
    > (slotsync) which I think can have a separate fix especially when the fix is also
    > somewhat different.
    > 
    > > >
    > > > The change of lock mode to EXCLUSIVE in
    > > > ReplicationSlotsComputeRequiredLSN may affect the performance when
    > a
    > > > lot of slots are advanced during some small period of time. It may
    > > > affect the walsender performance. It advances the logical or
    > > > physical slots when receive a confirmation from the replica. I
    > > > guess, the slot advancement may be pretty frequent operation.
    > >
    > > Yes, I had the same thought and considered a simple alternative
    > > (similar to your suggestion below): use an exclusive lock only when
    > > updating the slot.restart_lsn during WAL reservation, while continuing
    > > to use a shared lock in the computation function. Additionally, place
    > XLogSetReplicationSlotMinimumLSN() under the lock.
    > > This approach will also help serialize the process.
    > >
    > 
    > Can we discuss this as well in a separate thread?
    
    OK, I think it makes sense to start separate threads.
    
    I have split the patches based on the different bugs they
    address and am sharing them here for reference.
    
    Best Regards,
    Hou zj