Re: WIP/PoC for parallel backup
From: Amit Kapila <amit.kapila16@gmail.com>
To: Suraj Kharage <suraj.kharage@enterprisedb.com>
Cc: David Zhang <david.zhang@highgo.ca>, Ahsan Hadi <ahsan.hadi@gmail.com>, Asif Rehman <asifr.rehman@gmail.com>,
Kashif Zeeshan <kashif.zeeshan@enterprisedb.com>, Robert Haas <robertmhaas@gmail.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2020-04-30T10:45:13Z
Lists: pgsql-hackers
Commits
- Fix failures in incremental_sort due to number of workers
  23ba3b5ee278 13.0 cited
- In jsonb_plpython.c, suppress warning message from gcc 10.
  a06921816370 13.0 cited
- Fix minor problems with non-exclusive backup cleanup.
  303640199d04 13.0 cited
On Wed, Apr 29, 2020 at 6:11 PM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:
>
> Hi,
>
> We at EnterpriseDB did some performance testing around this parallel backup to check how this is beneficial and below are the results. In this testing, we run the backup -
> 1) Without Asif’s patch
> 2) With Asif’s patch and combination of workers 1,2,4,8.
>
> We run those test on two setup
>
> 1) Client and Server both on the same machine (Local backups)
>
> 2) Client and server on a different machine (remote backups)
>
> Machine details:
>
> 1: Server (on which local backups performed and used as server for remote backups)
>
> 2: Client (Used as a client for remote backups)
>
> ...
>
> Client & Server on the same machine, the result shows around 50% improvement in parallel run with worker 4 and 8. We don’t see the huge performance improvement with more workers been added.
>
> Whereas, when the client and server on a different machine, we don’t see any major benefit in performance. This testing result matches the testing results posted by David Zhang up thread.
>
> We ran the test for 100GB backup with parallel worker 4 to see the CPU usage and other information. What we noticed is that server is consuming the CPU almost 100% whole the time and pg_stat_activity shows that server is busy with ClientWrite most of the time.
>

Was this for a setup where the client and server were on the same machine or where the client was on a different machine? If it was for the case where both are on the same machine, then ideally, we should see ClientRead events in a similar proportion?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com