Thread

  1. Re: How can end users know the cause of LR slot sync delays?

    Shlok Kyal <shlok.kyal.oss@gmail.com> — 2025-10-30T12:16:02Z

    On Mon, 20 Oct 2025 at 14:27, Hayato Kuroda (Fujitsu)
    <kuroda.hayato@fujitsu.com> wrote:
    >
    > Dear Shlok,
    >
    > > > 01.
    > > > ```
    > > > +       /* Update the slot sync reason */
    > > > +       SpinLockAcquire(&slot->mutex);
    > > > +       if (slot->slot_sync_skip_reason != skip_reason)
    > > > +               slot->slot_sync_skip_reason = skip_reason;
    > > > +       SpinLockRelease(&slot->mutex);
    > > > ```
    > > >
    > > > Per my understanding, spinlock is acquired when the attribute on the shared
    > > memory
    > > > is updating. Can you check other parts and follow the rukle?
    > > >
    > > > 02.
    > > > ```
    > > > +                       SpinLockAcquire(&slot->mutex);
    > > > +                       synced = slot->data.synced;
    > > > +                       SpinLockRelease(&slot->mutex);
    > > > ```
    > > >
    > > > Same as 1.
    > > >
    > > I checked and found the following comment:
    > >  * - Individual fields are protected by mutex where only the backend owning
    > >  * the slot is authorized to update the fields from its own slot.  The
    > >  * backend owning the slot does not need to take this lock when reading its
    > >  * own fields, while concurrent backends not owning this slot should take the
    > >  * lock when reading this slot's data.
    > >
    I realised that the patch does not entirely follow the above rule. As
    per my understanding rule says:
    1. If we want to update a field of a slot we need to own the slot (or
    we can say acquire the slot)
    2. In the above case we also need to take a Spinlock on the slot to
    update any field.
    3. If we want to read a field and if we own the slot we do not need a
    spinlock on the slot.
    4. If we want to read a field and if we do not own the slot we need a
    spinlock on the slot.
    
    For the current patch (v5), in the function "synchronize_one_slot" we
    are calling "update_slot_sync_skip_stats" even if we do not own the
    slot.
    So, as per the rule I have updated  "update_slot_sync_skip_stats" to
    own the slot before updating.
    
    > > So for the above two cases we are updating the
    > > 'slot->slot_sync_skip_reason' and reading 'slot->data.synced' and this
    > > can happen before the slot sync worker acquires the slot or owns the
    > > slot.
    > > Also in the same code at a later stage we are again checking the
    > > synced flag and we do that while holding a spin lock. Based on these
    > > observations I think we should take Spinlock in both cases.
    >
    > Hmm, regarding the update_slot_sync_skip_stats(), the replication slot has already been
    > acquired except synchronize_one_slot() case.
    > Can we avoid acquiring the spinlock as much as possible by adding an argument?
    > Or it just introduces additional complexity?
    >
    After updating the code as per rule, I think we always have to take a
    Spinlock on the slot when we are updating any field.
    
    > > > 09.
    > > > ```
    > > > +my $connstr_1 = $primary->connstr;
    > > > ```
    > > >
    > > > Since this is an only connection string in the test, suffix _1 is not needed.
    > > >
    > > Fixed
    >
    > Same as the comment, can you replace "standby1" to "stanby"?
    >
    Fixed
    
    > >
    > > > 10.
    > > > ```
    > > > +# Simulate standby connection failure by modifying pg_hba.conf
    > > > +unlink($primary->data_dir . '/pg_hba.conf');
    > > > +$primary->append_conf('pg_hba.conf',
    > > > +       qq{local   all             all
    > > trust}
    > > > +);
    > > > ```
    > > >
    > > > What if the system does not have Unix domain socket? I'm afraid all connections
    > > > could be brocked in this case.
    > > >
    > > I have used an injection point to simulate this scenario instead of
    > > changing the contents of pg_hba.conf files.
    >
    > Can you clarify the reason why you used the injection point?
    > I'm not sure the injection point is beneficial here. I feel the point can be added
    > when we handle the timing issue, race condition etc, but walreceiver may not have
    > strong reasons to stop exact at that point.
    >
    > Regarding the content of pg_hba.conf, I felt below lines might be enough:
    > ```
    > local   all             all                                     trust
    > host    all             all             127.0.0.1/32            trust
    > ```
    >
    I checked this.
    By default pg_hba.conf has contents as:
    ```
    # "local" is for Unix domain socket connections only
    local   all             all                                     trust
    # IPv4 local connections:
    host    all             all             127.0.0.1/32            trust
    # IPv6 local connections:
    host    all             all             ::1/128                 trust
    # Allow replication connections from localhost, by a user with the
    # replication privilege.
    local   replication     all                                     trust
    host    replication     all             127.0.0.1/32            trust
    host    replication     all             ::1/128                 trust
    ```
    Now for our test to prevent the streaming replication we can set pg_hba.conf to
    ```
    local   all             all                                     trust
    host    all             all             127.0.0.1/32            trust
    host    all             all             ::1/128                 trust
    ```
    And then to restore streaming replication we can add following to pg_hba.conf:
    ```
    local   replication     all                                     trust
    host    replication     all             127.0.0.1/32            trust
    host    replication     all             ::1/128                 trust
    ```
    I think this would be sufficient for our testing. Thoughts?
    
    > Also, here are comments for v5.
    >
    > ```
    > +      <para>
    > +       Reason of the last slot synchronization skip.
    > +      </para></entry>
    > ```
    >
    > Possible values must be clarified. This was posted in [1] but seemed to be missed.
    Sorry, I missed it. I have updated it in the latest patch.
    
    >
    > ```
    > +       /* Update the slot sync reason */
    > ```
    >
    > It is better to clarify updating the *skip* reason
    Fixed
    
    >
    > ```
    > -       ReplicationSlot *slot;
    > +       ReplicationSlot *slot = NULL;
    > ```
    >
    > No need to initialize as NULL.
    >
    Fixed
    
    > ```
    > +#include "utils/injection_point.h"
    > ...
    > +                               INJECTION_POINT("walreceiver", NULL);
    > ```
    >
    > As I told above, I have a concern to add the injection point. I want to hear
    > other's opinion as well.
    >
    Removed it for now as per my analysis we can modify pg_hba.conf to
    simulate the scenario.
    
    > ```
    > +                       else
    > +                       {
    > +                               /* Update the slot sync stats */
    > +                               Assert(!found_consistent_snapshot ||
    > +                                          *found_consistent_snapshot);
    > +                               update_slot_sync_skip_stats(slot, SS_SKIP_NONE);
    > +                       }
    > ```
    >
    > Your patch may have another issue; if both confirmed_flush_lsn are the same
    > but we do not have the consistent snapshot yet, we would get the assertion failure.
    > (Again, not sure it can really happen)
    > Can we use the condition as another if part? At that time we must clarify why
    > it is OK to pass in case of found_consistent_snapshot == NULL.
    >
    Fixed
    
    > ```
    > +# Attach injection point to simulate wait
    > +$standby_psql->query_safe(
    > +       q(select injection_points_attach('slot-sync-skip','wait')));
    > ```
    >
    > I have been considering whether we can remove the injection point here or not.
    > I think the point is used because the being synchronized slot is still temporary
    > one; they would be cleaned up by ReplicationSlotCleanup().
    > Can we reproduce the skip event for the permanent slot? I cannot come up with,
    > but if possible no need to introduce the injection point.
    I tried reproducing it but was not able to come up with a test without
    injection point. Will further try to reproduce it without injection
    point.
    
    >
    > [1]: https://www.postgresql.org/message-id/OSCPR01MB14966A618A8C61EC3DEE486A4F517A%40OSCPR01MB14966.jpnprd01.prod.outlook.com
    >
    
    I have attached the latest patch.
    
    Thanks,
    Shlok Kyal