Thread

  1. Re: Sending unflushed WAL in physical replication

    Rahila Syed <rahilasyed90@gmail.com> — 2025-09-30T04:41:18Z

    Hi,
    
    
    
    > At the high level idea LGTM.
    >
    >
    
    Thank you for looking into it.
    
    
    >> Observations from the benchmark:
    >> 1. The patch improves TPS by ~13% in the sync replication setup. In
    >> repeated runs, I see that the TPS increase is anywhere between 5% and 13%.
    >> 2. WAL sender reads significantly less WAL from disk, indicating more
    >> efficient use of WAL buffers and reduced disk I/O.
    >>
    >
    > Can you please measure the transaction commit latency improvement as well?
    > Commit latency = Primary_Disk_Flush_time + Standby_disk_flush_time +
    > network_roundtrip_time
    >
    >
    
    The pgbench average latency should capture this, since it measures the time
    from the start to the end of a transaction. In synchronous replication, each
    transaction waits for write confirmation from the standby before committing,
    and that additional wait time is included in the latency measurement. I will
    post the latency numbers with the next benchmark results.
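
    To make the quoted formula concrete, here is a minimal sketch of the
    latency arithmetic. The function names and the millisecond inputs are
    hypothetical, and the "overlapped" model is only my simplifying assumption
    that sending unflushed WAL lets the primary flush proceed in parallel with
    the network round trip and the standby flush. It is not a measurement.

```python
def serial_commit_latency(primary_flush_ms, standby_flush_ms, network_rtt_ms):
    """Additive model from the quoted formula: each step waits for the previous one."""
    return primary_flush_ms + standby_flush_ms + network_rtt_ms


def overlapped_commit_latency(primary_flush_ms, standby_flush_ms, network_rtt_ms):
    """Assumed model: the primary flush overlaps with shipping WAL to the
    standby, so the commit waits for whichever path finishes last."""
    return max(primary_flush_ms, network_rtt_ms + standby_flush_ms)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, for illustration only.
    timings = dict(primary_flush_ms=2.0, standby_flush_ms=2.0, network_rtt_ms=1.0)
    print("serial:", serial_commit_latency(**timings))
    print("overlapped:", overlapped_commit_latency(**timings))
```

    If this model is roughly right, any gain should show up in the pgbench
    average latency as the difference between the two values.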
    
    > What happens in crash recovery scenarios? For example, when a standby
    > restarts after a crash, it replays until the end of WAL. In this case, it
    > may end up replaying WAL that was never flushed on the primary (if the
    > primary does a crash recovery).
    > Shouldn't archiving on the standby avoid uploading WAL before it gets
    > flushed on the primary?
    > The same applies to pg_receivewal.
    >
    
    The current solution isn’t sufficient for situations where we rely solely
    on the WAL files to identify what needs to be replayed. In these cases, we
    need to either write the unflushed WAL data to a buffer and then to
    temporary files until the primary flush occurs, or store the flush pointer
    so that the recovery process knows up to which point it should replay the
    WAL.
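
    As a rough sketch of the second option (storing the flush pointer),
    recovery would stop at the primary's last known flush position rather than
    at the end of the local WAL. Everything below is hypothetical: WalRecord,
    replayable_records and the integer LSNs are illustrative names and values,
    not code from the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    end_lsn: int   # position just past the record, as a plain integer
    payload: str


def replayable_records(records, primary_flush_lsn):
    """Return the prefix of records that ends at or before the stored
    primary flush pointer; anything later may not be durable on the primary."""
    safe = []
    for rec in records:
        if rec.end_lsn > primary_flush_lsn:
            break  # WAL is sequential, so stop at the first unsafe record
        safe.append(rec)
    return safe


if __name__ == "__main__":
    local_wal = [WalRecord(100, "a"), WalRecord(200, "b"), WalRecord(300, "c")]
    for rec in replayable_records(local_wal, primary_flush_lsn=250):
        print(rec)
```

    Here the record ending at LSN 300 stays on disk but is not replayed until
    the stored flush pointer moves past it.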
    
    As mentioned in the TODO section of my previous email, I am currently
    working on a more robust method to manage unflushed WAL on the receiver.
    The goal is to ensure this does not disrupt recovery or affect tools that
    expect the WAL files on standby to only contain WAL records that have
    already been flushed on the primary.
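
    One way to picture that goal is a receiver that stages unflushed bytes
    separately and moves them into the WAL proper only once the primary
    reports them flushed. This is a toy model under my own assumptions:
    StagedWalReceiver and its methods are hypothetical, the bytearrays stand
    in for WAL segment files and a buffer or temporary file, and LSNs are
    treated as simple byte offsets.

```python
class StagedWalReceiver:
    """Toy model: keep not-yet-flushed WAL out of the files other tools read."""

    def __init__(self, start_lsn=0):
        self.wal = bytearray()      # stands in for the standby's WAL files
        self.staged = bytearray()   # stands in for a buffer / temporary file
        self.wal_end = start_lsn    # LSN up to which self.wal is populated

    def receive(self, data: bytes) -> None:
        """Accept streamed WAL that may not be flushed on the primary yet."""
        self.staged += data

    def primary_flushed(self, flush_lsn: int) -> int:
        """Move staged bytes up to flush_lsn into the WAL; return bytes moved."""
        n = min(max(flush_lsn - self.wal_end, 0), len(self.staged))
        self.wal += self.staged[:n]
        del self.staged[:n]
        self.wal_end += n
        return n


if __name__ == "__main__":
    rx = StagedWalReceiver()
    rx.receive(b"abcdef")
    rx.primary_flushed(4)
    print(bytes(rx.wal), bytes(rx.staged))
```

    With this split, archiving and pg_receivewal-style consumers would only
    ever see bytes at or below the primary's flush position.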
    
    Thank you,
    Rahila Syed