Thread

  1. RE: Parallel Apply

    Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> — 2025-10-31T10:36:29Z

    Dear hackers,
    
    > TODO - potential improvement to use shared hash table for tracking
    > dependencies.
    
    I measured the performance of the shared hash table approach. Based on the results,
    the local hash table approach seems better.
    
    Abstract
    ========
    The shared hash table brought no performance improvement; it showed a regression of about 1-2%.
    The trend did not change with the number of parallel apply workers.
    
    Machine details
    ===============
    Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz, CPU(s): 88 cores, 503 GiB RAM
    
    Used patch
    ==========
    0001 is the same as the patch Hou posted on -hackers [1], and 0002 is the patch for the shared hash.
    
    0002 introduces a shared hash table, dependency_dshash. Since the key of a shared
    hash table must have a fixed length, it is computed from the replica identity of
    the tuples. When a parallel apply worker receives changes, it computes the hash key
    again and remembers it in a list. At commit time it iterates over the list and removes
    the hash entries for those keys.
    0001 has a mechanism to clean up the local hash, but 0002 removes it.
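
    The key handling described above can be sketched roughly as follows. This is
    only an illustration in Python, not code from the 0002 patch: a plain dict stands
    in for dependency_dshash, and KEY_LEN, make_key, leader_register and ApplyWorker
    are hypothetical names. That the entry is inserted before the worker sees the
    change is also an assumption, inferred from "computes the hash key again".

```python
import hashlib

KEY_LEN = 16  # hypothetical fixed key length in bytes


def make_key(relid: int, replica_identity: tuple) -> bytes:
    """Derive a fixed-length key from a relation id and replica identity values."""
    h = hashlib.blake2b(digest_size=KEY_LEN)
    h.update(relid.to_bytes(4, "big"))
    for value in replica_identity:
        data = repr(value).encode()
        # Length-prefix each value so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()


def leader_register(table: dict, relid: int, replica_identity: tuple, xid: int) -> None:
    """Assumed step: record which transaction last touched this row."""
    table[make_key(relid, replica_identity)] = xid


class ApplyWorker:
    """Remembers the keys of the changes it applied and removes them at commit."""

    def __init__(self, table: dict):
        self.table = table        # stand-in for the shared hash table
        self.touched_keys = []    # the list of remembered keys

    def remember_change(self, relid: int, replica_identity: tuple) -> None:
        # The worker recomputes the key from the change it received.
        self.touched_keys.append(make_key(relid, replica_identity))

    def commit(self) -> None:
        # Iterate over the remembered keys and drop the matching entries.
        for key in self.touched_keys:
            self.table.pop(key, None)
        self.touched_keys.clear()
```

    Hashing the replica identity gives every row a key of the same size no matter
    how many columns the identity has, at the cost of recomputing the hash in the
    worker for each change.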
    
    Workload
    ========
    Setup:
    ---------
    Pub --> Sub
     - Two nodes are set up for synchronous logical replication (pub-sub).
     - Both nodes have the same set of pgbench tables, created with scale=100.
     - The Sub node is subscribed to all changes from the Pub's pgbench tables.
    
    Workload Run:
    --------------------
     - Run the built-in pgbench script (simple-update) [2] only on Pub, with #clients=40 and a run duration of 5 minutes.
    
    Results:
    --------------------
    The number of workers was set to 4, 8, or 16. In every case, 0001 performed better on average.
    A positive diff means 0001 was faster.
    
    #worker=4
    ---------
    	0001	0001+0002	diff
    TPS	14499.33387	14097.74469	3%
    	14361.7166	14359.87781	0%
    	14467.91344	14153.53934	2%
    	14451.8596	14381.70987	0%
    	14646.90346	14239.4712	3%
    	14530.66788	14298.33845	2%
    	14733.35987	14189.41794	4%
    	14543.9252	14373.21266	1%
    	14945.57568	14249.46787	5%
    	14638.6342	14125.87626	4%
    AVE	14581.988979	14246.865608	2%
    MEDIAN	14537.296540	14244.469536	2%
    
    #worker=8
    ---------
    	0001	0001+0002	diff
    TPS	21531.08712	21443.68765	0%
    	22337.60439	21383.94778	4%
    	21806.70504	21097.42874	3%
    	22192.99695	21424.78921	4%
    	21721.95472	21470.8714	1%
    	21450.6779	21265.89539	1%
    	21397.51433	21606.51486	-1%
    	21551.09391	21306.97061	1%
    	21455.89699	21351.38868	0%
    	21849.52528	21304.42329	3%
    AVE	21729.505662	21365.591761	2%
    MEDIAN	21636.524316	21367.668229	1%
    
    
    #worker=16
    -----------
    	0001	0001+0002	diff
    TPS	28034.64652	28129.85068	0%
    	27839.10942	27364.40725	2%
    	27693.94576	27871.80199	-1%
    	27717.83971	27129.96132	2%
    	28453.25381	27439.77526	4%
    	28083.73208	27201.0004	3%
    	27842.19262	27226.43813	2%
    	27729.44205	27459.01256	1%
    	28103.76727	27385.80016	3%
    	27688.52482	27485.67209	1%
    AVE	27918.645405	27469.371982	2%
    MEDIAN	27840.651020	27412.787708	2%
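
    For anyone who wants to re-check the summary rows, here is a small Python sketch
    that recomputes diff, AVE and MEDIAN from the #worker=4 runs above. The diff
    formula (0001 - (0001+0002)) / 0001 is an assumption; it is not stated in the
    table, but it is consistent with the rounded percentages shown there.

```python
from statistics import mean, median

# TPS per run for #worker=4, copied from the table above.
base = [14499.33387, 14361.7166, 14467.91344, 14451.8596, 14646.90346,
        14530.66788, 14733.35987, 14543.9252, 14945.57568, 14638.6342]
shared = [14097.74469, 14359.87781, 14153.53934, 14381.70987, 14239.4712,
          14298.33845, 14189.41794, 14373.21266, 14249.46787, 14125.87626]


def diff_pct(a: float, b: float) -> float:
    """Relative drop of b compared to a, in percent (assumed definition of diff)."""
    return (a - b) / a * 100.0


# Per-run diff, rounded to whole percent like the diff column.
per_run = [round(diff_pct(a, b)) for a, b in zip(base, shared)]

summary = {
    "AVE": (mean(base), mean(shared)),
    "MEDIAN": (median(base), median(shared)),
}

if __name__ == "__main__":
    print("diff per run:", per_run)
    for name, (a, b) in summary.items():
        print(f"{name}\t{a:.6f}\t{b:.6f}\t{round(diff_pct(a, b))}%")
```

    The same lists can be swapped for the #worker=8 or #worker=16 runs.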
    
    [1]: https://www.postgresql.org/message-id/OS0PR01MB5716D43CB68DB8FFE73BF65D942AA%40OS0PR01MB5716.jpnprd01.prod.outlook.com
    [2]: https://www.postgresql.org/docs/current/pgbench.html#PGBENCH-OPTION-BUILTIN
    
    Best regards,
    Hayato Kuroda
    FUJITSU LIMITED