Re: WIP/PoC for parallel backup
From: Amit Kapila <amit.kapila16@gmail.com>
To: Rushabh Lathia <rushabh.lathia@gmail.com>
Cc: Ahsan Hadi <ahsan.hadi@gmail.com>,
Suraj Kharage <suraj.kharage@enterprisedb.com>, David Zhang <david.zhang@highgo.ca>, Asif Rehman <asifr.rehman@gmail.com>, Kashif Zeeshan <kashif.zeeshan@enterprisedb.com>,
Robert Haas <robertmhaas@gmail.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2020-05-21T06:53:56Z
Lists: pgsql-hackers
Commits
- Fix failures in incremental_sort due to number of workers (23ba3b5ee278, 13.0, cited)
- In jsonb_plpython.c, suppress warning message from gcc 10. (a06921816370, 13.0, cited)
- Fix minor problems with non-exclusive backup cleanup. (303640199d04, 13.0, cited)

On Thu, May 21, 2020 at 11:36 AM Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>
> On Thu, May 21, 2020 at 10:47 AM Ahsan Hadi <ahsan.hadi@gmail.com> wrote:
>>
>>>>
>>>> During an offlist discussion with Robert, he pointed out that current
>>>> basebackup's code doesn't account for the wait event for the reading
>>>> of files which can change what pg_stat_activity shows? Can you please
>>>> apply his latest patch to improve basebackup.c's code [1] which will
>>>> take care of that waitevent before getting the data again?
>>>>
>>>> [1] - https://www.postgresql.org/message-id/CA%2BTgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA%40mail.gmail.com
>>>
>>> Sure, we can try out this and do a similar run to collect the pg_stat_activity output.
>>
>> Have you had the chance to try this out?
>
> Yes. My colleague Suraj tried this and here are the pg_stat_activity output files.
>
> Captured wait events after every 3 seconds during the backup for -
> 1: parallel backup for 100GB data with 4 workers (pg_stat_activity_normal_backup_100GB.txt)
> 2: Normal backup (without parallel backup patch) for 100GB data (pg_stat_activity_j4_100GB.txt)
>
> Here is the observation:
>
> The total number of events (pg_stat_activity) captured during above runs:
> - 314 events for normal backups
> - 316 events for parallel backups (-j 4)
>
> BaseBackupRead wait event numbers: (newly added)
> 37 - in normal backups
> 25 - in the parallel backup (-j 4)
>
> ClientWrite wait event numbers:
> 175 - in normal backup
> 1098 - in parallel backups
>
> ClientRead wait event numbers:
> 0 - ClientRead in normal backup
> 326 - ClientRead in parallel backups for diff processes. (all in idle state)
>

It might be interesting to see why ClientRead/ClientWrite has increased so
much and whether we can reduce it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com