Thread

  1. Re: Parallel Apply

    Nisha Moond <nisha.moond412@gmail.com> — 2025-08-30T05:12:56Z

    Hi,
    
    I ran tests to compare the performance of logical synchronous
    replication with parallel-apply against physical synchronous
    replication.
    
    Highlights
    ===============
    On pgHead (current behavior):
     - With synchronous physical replication set to remote_apply, the
    Primary’s TPS drops by ~60% (≈2.5x slower than asynchronous).
     - With synchronous logical replication set to remote_apply, the
    Publisher’s TPS drops drastically by ~94% (≈16x slower than
    asynchronous).
    
    With the proposed parallel-apply patch (v1):
     - Parallel apply significantly improves logical synchronous
    replication performance by 5-6×.
     - With 40 parallel workers on the subscriber, the Publisher achieves
    30045.82 TPS, which is 5.5× faster than the no-patch case (5435.46
    TPS).
     - With the patch, the Publisher’s performance is only ~3x slower than
    asynchronous, bringing it much closer to the physical replication
    case.
    
    Machine details
    ===============
    Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz, 88 cores, 503 GiB RAM
    
    Source code:
    ===============
     - pgHead(e9a31c0cc60) and v1 patch
    
    Test-01: Physical replication:
    ======================
     - To measure the physical synchronous replication performance on pgHead.
    
    Setup & Workload:
    -----------------
    Primary --> Standby
     - Two nodes created in physical (primary-standby) replication setup.
     - Default pgbench (read-write) was run on the Primary with scale=300,
    #clients=40, run duration=20 minutes.
     - TPS was measured on pgHead with synchronous_commit set to "off" vs
    "remote_apply".
    
    Results:
    ---------
    synchronous_commit     Primary_TPS     regression
    OFF                    90466.57743     -
    remote_apply(run1)     35848.6558      -60%
    remote_apply(run2)     35306.25479     -61%
    
     - On pgHead, when synchronous_commit is set to "remote_apply" during
    physical replication, the Primary experiences a 60–61% reduction in
    TPS, which is ~2.5 times slower.
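    
    For reference, a minimal sketch of how the regression and slowdown
    figures follow from the raw TPS values in the table. The helper names
    are illustrative only and are not taken from the attached scripts.

```python
def regression_pct(baseline_tps, tps):
    """Relative TPS change versus the baseline, in percent (negative = slower)."""
    return (tps - baseline_tps) / baseline_tps * 100.0


def slowdown(baseline_tps, tps):
    """How many times slower `tps` is than `baseline_tps`."""
    return baseline_tps / tps


# Values copied from the Test-01 results table.
async_tps = 90466.57743
sync_runs = {"run1": 35848.6558, "run2": 35306.25479}

for label, tps in sync_runs.items():
    print(f"remote_apply({label}): "
          f"{regression_pct(async_tps, tps):.0f}%, "
          f"{slowdown(async_tps, tps):.1f}x slower")
```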
    ~~~
    
    Test-02: Logical replication:
    =====================
     - To measure the logical synchronous replication performance on
    pgHead and with parallel-apply patch.
    
    Setup & Workload:
    -----------------
    Publisher --> Subscriber
     - Two nodes created in logical (publisher-subscriber) replication setup.
     - Default pgbench (read-write) was run on the Pub with scale=300,
    #clients=40, run duration=20 minutes.
     - The TPS is measured on pgHead and with the parallel-apply v1 patch.
     - The number of parallel workers was varied as 2, 4, 8, 16, 32, 40.
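    
    Both tests use the same pgbench workload (scale=300, 40 clients,
    20 minutes). A rough sketch of the equivalent pgbench invocations is
    below. It is not the attached script: the database name and the -j
    (threads) value are assumptions, since neither is stated above.

```python
def pgbench_commands(scale=300, clients=40, duration_s=20 * 60, dbname="postgres"):
    """Build (init, run) argv lists for a default read-write pgbench run.

    dbname and the -j value are assumptions made for illustration.
    """
    # Initialize the pgbench tables at the given scale factor.
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    # Default (TPC-B-like, read-write) script; -T is the duration in seconds.
    run = [
        "pgbench",
        "-c", str(clients),
        "-j", str(clients),  # assumption: one pgbench thread per client
        "-T", str(duration_s),
        dbname,
    ]
    return init, run


init_cmd, run_cmd = pgbench_commands()
print(" ".join(init_cmd))
print(" ".join(run_cmd))
```

    The lists can be passed to subprocess.run() on the Primary/Publisher
    node once replication is set up.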
    
    case-01: pgHead
    -------------------
    Results:
    synchronous_commit      Publisher_TPS    regression
    pgHead(OFF)             89138.14626      --
    pgHead(remote_apply)    5435.464525      -94%
    
     - On pgHead (the default behavior), synchronous logical replication
    sees a 94% drop in TPS, which is:
     a) 16.4 times slower than the logical async case and,
     b) 6.6 times slower than physical sync replication case.
    
    case-02: patched
    ---------------------
     - synchronous_commit = 'remote_apply'
     - measured the performance by varying #parallel workers as 2, 4, 8, 16, 32, 40
    
    Results:
    #workers    Publisher_TPS    Improvement_with_patch    faster_than_no-patch
       2        9679.077736      78%                       1.78x
       4        14329.64073      164%                      2.64x
       8        21832.04285      302%                      4.02x
      16        27676.47085      409%                      5.09x
      32        29718.40090      447%                      5.47x
      40        30045.82365      453%                      5.53x
    
    - The TPS on the publisher improves significantly as the number of
    parallel workers increases.
    - At 40 workers, the TPS reaches 30045.82, which is about 5.5x higher
    than the no-patch case.
    - With 40 parallel workers, logical sync replication is only about
    1.2x slower than physical sync replication.
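    
    The derived columns and the cross-test comparisons can be recomputed
    from the raw TPS numbers with a short sketch like the one below. The
    variable names are illustrative, and the physical-sync figure used is
    run1 from Test-01.

```python
# Raw TPS values copied from the tables in this mail.
LOGICAL_ASYNC = 89138.14626   # pgHead, synchronous_commit = off
LOGICAL_SYNC = 5435.464525    # pgHead, remote_apply (no patch)
PHYSICAL_SYNC = 35848.6558    # Test-01, remote_apply (run1)

patched = {
    2: 9679.077736,
    4: 14329.64073,
    8: 21832.04285,
    16: 27676.47085,
    32: 29718.40090,
    40: 30045.82365,
}

# (workers, improvement in percent, speedup factor) versus the no-patch case.
rows = [
    (workers, round((tps / LOGICAL_SYNC - 1) * 100), round(tps / LOGICAL_SYNC, 2))
    for workers, tps in patched.items()
]
for workers, pct, factor in rows:
    print(f"{workers:>2} workers: +{pct}%  ({factor}x)")

best = patched[40]
print(f"vs physical sync: {PHYSICAL_SYNC / best:.1f}x slower")
print(f"vs logical async: {LOGICAL_ASYNC / best:.1f}x slower")
```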
    ~~~
    
    The scripts used for the tests are attached. We'll do tests with
    larger data sets later and share results.
    
    --
    Thanks,
    Nisha