Thread

  1. could sent_lsn be lower than write/flush/replay_lsn?

    Jaime Casanova <jcasanov@systemguards.com.ec> — 2025-12-26T17:49:33Z

    Hi,
    
    We have a customer that, for the second time, has most of its logical
    replicas (13 of 16) in catchup state. They had been working fine
    for some time, and suddenly the pg_stat_replication view shows
    something like this for all of the replicas in catchup state:
    
    """
    pid              | 2667517
    state            | catchup
    sent_lsn         | 38B4/67C403A8
    write_lsn        | 38B7/D2C9C038
    flush_lsn        | 38B7/D2C9C038
    replay_lsn       | 38B7/D2C9C038
    """
    
    This doesn't make sense to me. This is PostgreSQL 16.9, btw.
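
To put a number on how far sent_lsn trails the other positions, here is a minimal sketch in Python. It assumes only the usual LSN text format, two hexadecimal halves separated by a slash, with the left half as the high 32 bits of a 64-bit byte position. The helper names `parse_lsn` and `lsn_gap_bytes` are hypothetical and chosen for illustration. On the server, the built-in `pg_wal_lsn_diff()` function does the same subtraction.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '38B4/67C403A8' to a 64-bit byte position."""
    high, low = text.strip().split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def lsn_gap_bytes(later: str, earlier: str) -> int:
    """Return how many bytes 'later' is ahead of 'earlier' (negative if it is behind)."""
    return parse_lsn(later) - parse_lsn(earlier)


# Values copied from the pg_stat_replication output above.
sent_lsn = "38B4/67C403A8"
write_lsn = "38B7/D2C9C038"

gap = lsn_gap_bytes(write_lsn, sent_lsn)
print(f"sent_lsn is behind write_lsn by {gap} bytes ({gap / 2**30:.2f} GiB)")
```

A positive result confirms that sent_lsn really is behind write_lsn and is not just a misread of the hexadecimal halves.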
    
    pg_stat_activity shows:
    
    """
    wait_event_type  | IO
    wait_event       | ReorderBufferWrite
    state            | active
    backend_xid      |
    backend_xmin     |
    query_id         |
    query            | START_REPLICATION SLOT "sub_down_tables" LOGICAL
    38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
    '"pub_down_tables"')
    backend_type     | walsender
    """
    
    And the logs keep showing this:
    
    """
    2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG:  38B7/CEBC9330
    has been already streamed, forwarding to 38B7/D2C9C038
    2025-12-26 12:17:41.861 -05 [pid=2667517;l=2;tx=0] STATEMENT:
    START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
    38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
    '"pub_elipsys_cresio_down_tables"')
    2025-12-26 12:17:41.867 -05 [pid=2667517;l=3;tx=0] LOG:  starting
    logical decoding for slot "sub_down_tables_central_trx001"
    2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL:  Streaming
    transactions committing after 38B7/D2C9C038, reading WAL from
    38B0/2261B890.
    2025-12-26 12:17:41.867 -05 [pid=2667517;l=5;tx=0] STATEMENT:
    START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
    38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
    '"pub_elipsys_cresio_down_tables"')
    2025-12-26 12:17:41.868 -05 [pid=2667517;l=6;tx=0] LOG:  logical
    decoding found consistent point at 38B0/2261B890
    2025-12-26 12:17:41.868 -05 [pid=2667517;l=7;tx=0] DETAIL:  Logical
    decoding will begin using saved snapshot.
    2025-12-26 12:17:41.868 -05 [pid=2667517;l=8;tx=0] STATEMENT:
    START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
    38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
    '"pub_elipsys_cresio_down_tables"')
    2025-12-26 12:30:35.953 -05 [pid=2678504;l=1;tx=0] ERROR:  replication
    slot "sub_down_tables_central_trx001" is active for PID 2667517
    2025-12-26 12:30:40.959 -05 [pid=2678564;l=1;tx=0] ERROR:  replication
    slot "sub_down_tables_central_trx001" is active for PID 2667517
    """
    
    Any idea what to check?
    
    --
    Jaime Casanova
    SYSTEMGUARDS S.A.