Re: WIP/PoC for parallel backup

From: Ibrar Ahmed <ibrar.ahmad@gmail.com>

To:

Cc: Asif Rehman <asifr.rehman@gmail.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>

Date: 2019-08-23T13:03:10Z

Lists: pgsql-hackers

On Fri, Aug 23, 2019 at 3:18 PM Asim R P <apraveen@pivotal.io> wrote:

> Hi Asif
>
> Interesting proposal.  Bulk of the work in a backup is transferring files
> from source data directory to destination.  Your patch is breaking this
> task down in multiple sets of files and transferring each set in parallel.
> This seems correct, however, your patch is also creating a new process to
> handle each set.  Is that necessary?  I think we should try to achieve this
> using multiple asynchronous libpq connections from a single basebackup
> process.  That is to use PQconnectStartParams() interface instead of
> PQconnectdbParams(), which is currently used by basebackup.  On the server
> side, it may still result in multiple backend processes per connection, and
> an attempt should be made to avoid that as well, but it seems complicated.
>
> What do you think?
>

The main question is what we really want to solve here. What is the
bottleneck, and which hardware do we want to saturate? I ask because
several hardware resources are involved in taking a backup (network, CPU,
disk). If we have already saturated the disk, there is no need to add
parallelism, because we will be blocked on disk I/O anyway.

I implemented parallel backup in a separate application and it had
wonderful results. I have only skimmed through the code, and my
reservation is that creating a separate process only for copying data is
overkill. There are two options: non-blocking calls, or some worker
threads. But before doing that, we need to find the pg_basebackup
bottleneck; after that, we can see what the best way to solve it is. Some
numbers may help to understand the actual benefit.
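
To make the worker-thread option concrete, here is a minimal sketch in
Python. It is not pg_basebackup code and does not talk to a server; it only
splits a directory's files into sets and copies each set on its own thread
inside a single process. The names split_into_sets, copy_set and
parallel_copy are hypothetical, chosen for this illustration.

```python
import shutil
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_into_sets(files, n_sets):
    """Round-robin the file list into n_sets roughly equal sets."""
    return [files[i::n_sets] for i in range(n_sets)]


def copy_set(file_set, src_root, dst_root):
    """Copy one set of files, keeping relative paths; return bytes copied."""
    copied = 0
    for src in file_set:
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        copied += src.stat().st_size
    return copied


def parallel_copy(src_root, dst_root, n_workers):
    """Copy all regular files under src_root using n_workers threads.

    Returns (total_bytes_copied, elapsed_seconds). Assumes n_workers >= 1
    and that dst_root is not located inside src_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    files = sorted(p for p in src_root.rglob("*") if p.is_file())
    sets = split_into_sets(files, n_workers)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        total = sum(pool.map(lambda s: copy_set(s, src_root, dst_root), sets))
    return total, time.perf_counter() - start
```

Running parallel_copy() on the same data with n_workers=1 and then with a
larger value, and comparing the elapsed times, is one crude way to get such
numbers. If the times barely differ, the disk is probably already the
limiting factor, although OS caching can distort a quick test like this.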


-- 
Ibrar Ahmed