Re: WIP/PoC for parallel backup
David Zhang <david.zhang@highgo.ca>
From: David Zhang <david.zhang@highgo.ca>
To: Amit Kapila <amit.kapila16@gmail.com>, Ahsan Hadi <ahsan.hadi@gmail.com>
Cc: Asif Rehman <asifr.rehman@gmail.com>,
Kashif Zeeshan <kashif.zeeshan@enterprisedb.com>,
Robert Haas <robertmhaas@gmail.com>,
Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>,
Jeevan Chalke <jeevan.chalke@enterprisedb.com>,
PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2020-04-27T16:53:16Z
Lists: pgsql-hackers
Commits
- Fix failures in incremental_sort due to number of workers (23ba3b5ee278, 13.0, cited)
- In jsonb_plpython.c, suppress warning message from gcc 10. (a06921816370, 13.0, cited)
- Fix minor problems with non-exclusive backup cleanup. (303640199d04, 13.0, cited)
Attachments
- perf-report-parallel_backup_v15.zip (application/zip)
Hi,

Here are the parallel backup performance test results with and without
the patch "parallel_backup_v15" in an AWS cloud environment. Two
"t2.xlarge" machines were used: one for the Postgres server and the other
for pg_basebackup, both with the machine configuration shown below.

Machine configuration:
    Instance Type      : t2.xlarge
    Volume type        : io1
    Memory             : 16GB
    vCPU #             : 4
    Architecture       : x86_64
    IOPS               : 6000
    Database Size (GB) : 108

Performance test results:

without patch:
    real 18m49.346s
    user 1m24.178s
    sys  7m2.966s

1 worker with patch:
    real 18m43.201s
    user 1m55.787s
    sys  7m24.724s

2 worker with patch:
    real 18m47.373s
    user 2m22.970s
    sys  11m23.891s

4 worker with patch:
    real 18m46.878s
    user 2m26.791s
    sys  13m14.716s

As requested, I didn't have pgbench running in parallel as we did in the
previous benchmark.

The perf report files for both the Postgres server and pg_basebackup
sides are attached. The files are listed below, i.e. without the patch
(1 worker), and with the patch using 1, 2, and 4 workers.

perf report on Postgres server side:
    perf.data-postgres-without-parallel_backup_v15.txt
    perf.data-postgres-with-parallel_backup_v15-j1.txt
    perf.data-postgres-with-parallel_backup_v15-j2.txt
    perf.data-postgres-with-parallel_backup_v15-j4.txt

perf report on pg_basebackup side:
    perf.data-pg_basebackup-without-parallel_backup_v15.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j1.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j2.txt
    perf.data-pg_basebackup-with-parallel_backup_v15-j4.txt

If any more information is required, please let me know.

On 2020-04-21 7:12 a.m., Amit Kapila wrote:
> On Tue, Apr 21, 2020 at 5:26 PM Ahsan Hadi <ahsan.hadi@gmail.com> wrote:
>> On Tue, Apr 21, 2020 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Tue, Apr 21, 2020 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> On Tue, Apr 21, 2020 at 1:00 PM Asif Rehman <asifr.rehman@gmail.com> wrote:
>>>>> I did some tests a while back, and here are the results. The tests were done to simulate
>>>>> a live database environment using pgbench.
>>>>>
>>>>> machine configuration used for this test:
>>>>> Instance Type: t2.xlarge
>>>>> Volume Type : io1
>>>>> Memory (MiB) : 16384
>>>>> vCPU # : 4
>>>>> Architecture : X86_64
>>>>> IOP : 16000
>>>>> Database Size (GB) : 102
>>>>>
>>>>> The setup consist of 3 machines.
>>>>> - one for database instances
>>>>> - one for pg_basebackup client and
>>>>> - one for pgbench with some parallel workers, simulating SELECT loads.
>>>>>
>>>>>                        basebackup | 4 workers | 8 Workers | 16 workers
>>>>> Backup Duration(Min):  69.25      | 20.44     | 19.86     | 20.15
>>>>> (pgbench running with 50 parallel client simulating SELECT load)
>>>>>
>>>>> Backup Duration(Min):  154.75     | 49.28     | 45.27     | 20.35
>>>>> (pgbench running with 100 parallel client simulating SELECT load)
>>>>>
>>>> Thanks for sharing the results, these show nice speedup! However, I
>>>> think we should try to find what exactly causes this speed up. If you
>>>> see the recent discussion on another thread related to this topic,
>>>> Andres, pointed out that he doesn't think that we can gain much by
>>>> having multiple connections[1]. It might be due to some internal
>>>> limitations (like small buffers) [2] due to which we are seeing these
>>>> speedups. It might help if you can share the perf reports of the
>>>> server-side and pg_basebackup side.
>>>>
>>> Just to be clear, we need perf reports both with and without patch-set.
>>
>> These tests were done a while back, I think it would be good to run the benchmark again with the latest patches of parallel backup and share the results and perf reports.
>>
> Sounds good. I think we should also try to run the test with 1 worker
> as well. The reason it will be good to see the results with 1 worker
> is that we can know if the technique to send file by file as is done
> in this patch is better or worse than the current HEAD code. So, it
> will be good to see the results of an unpatched code, 1 worker, 2
> workers, 4 workers, etc.
>
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
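
For anyone who wants to compare the four runs numerically, here is a minimal sketch that converts the `time` output reported in this message into seconds and derives the wall-clock ratio against the unpatched run, plus total CPU time (user + sys). The timing values are copied from the results in this message; the names `TIMINGS`, `to_seconds`, and `summarize` are hypothetical helpers introduced only for illustration and are not part of the patch or of pg_basebackup.

```python
import re

# Timings copied from the `time` output reported in this message.
TIMINGS = {
    "without patch": {"real": "18m49.346s", "user": "1m24.178s", "sys": "7m2.966s"},
    "1 worker": {"real": "18m43.201s", "user": "1m55.787s", "sys": "7m24.724s"},
    "2 workers": {"real": "18m47.373s", "user": "2m22.970s", "sys": "11m23.891s"},
    "4 workers": {"real": "18m46.878s", "user": "2m26.791s", "sys": "13m14.716s"},
}


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '18m49.346s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(timings: dict, baseline: str = "without patch") -> dict:
    """Return wall-clock seconds, CPU seconds, and speedup vs. the baseline run."""
    base_real = to_seconds(timings[baseline]["real"])
    rows = {}
    for label, t in timings.items():
        real = to_seconds(t["real"])
        cpu = to_seconds(t["user"]) + to_seconds(t["sys"])
        rows[label] = {"real_s": real, "cpu_s": cpu, "speedup": base_real / real}
    return rows


if __name__ == "__main__":
    for label, row in summarize(TIMINGS).items():
        print(
            f"{label:<14} real={row['real_s']:8.1f}s "
            f"cpu={row['cpu_s']:7.1f}s speedup={row['speedup']:.3f}x"
        )
```

Run against the numbers above, the wall-clock times all fall within about 1% of each other, while user + sys time grows from roughly 507 s in the unpatched run to roughly 942 s with 4 workers. This only restates the reported measurements; it does not by itself identify the bottleneck, which is what the attached perf reports are for.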