Re: WIP/PoC for parallel backup
From: Robert Haas <robertmhaas@gmail.com>
To: Asif Rehman <asifr.rehman@gmail.com>
Cc: dipesh.pandit@gmail.com, Kashif Zeeshan <kashif.zeeshan@enterprisedb.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2020-04-22T16:27:35Z
Lists: pgsql-hackers
Commits
- Fix failures in incremental_sort due to number of workers (23ba3b5ee278, 13.0, cited)
- In jsonb_plpython.c, suppress warning message from gcc 10. (a06921816370, 13.0, cited)
- Fix minor problems with non-exclusive backup cleanup. (303640199d04, 13.0, cited)

On Wed, Apr 22, 2020 at 10:18 AM Asif Rehman <asifr.rehman@gmail.com> wrote:
> I don't foresee memory to be a challenge here. Assuming a database containing 10240
> relation files (that max reach to 10 TB of size), the list will occupy approximately 102MB
> of space in memory. This obviously can be reduced, but it doesn’t seem too bad either.
> One way of doing it is by fetching a smaller set of files and clients can result in the next
> set if the current one is processed; perhaps fetch initially per table space and request for
> next one once the current one is done with.

The more concerning case is when someone has a lot of small files.

> Okay have added throttling_counter as atomic. however a lock is still required
> for throttling_counter%=throttling_sample.

Well, if you can't get rid of the lock, using atomics is pointless.

>> + sendFile(file, file + basepathlen, &statbuf,
>> true, InvalidOid, NULL, NULL);
>>
>> Maybe I'm misunderstanding, but this looks like it's going to write a
>> tar header, even though we're not writing a tarfile.
>
> sendFile() always sends files with tar header included, even if the backup mode
> is plain. pg_basebackup also expects the same. That's the current behavior of
> the system.
>
> Otherwise, we will have to duplicate this function which would be doing the pretty
> much same thing, except the tar header.

Well, as I said before, the solution to that problem is refactoring,
not crummy interfaces. You're never going to persuade any committer
who understands what that code actually does to commit it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
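
On the throttling_counter point above: one way the lock around `throttling_counter %= throttling_sample` might be avoided is a compare-and-swap retry loop. The following is only a sketch under stated assumptions, not the patch's code and not PostgreSQL's own atomics API: it uses standard C11 `<stdatomic.h>`, and `throttle_add` and its parameters are hypothetical names chosen for illustration. It also assumes the counter lives in memory that all workers can reach.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared counter; stands in for the patch's throttling_counter. */
static _Atomic int64_t throttling_counter = 0;

/*
 * Add `increment` bytes to the counter without taking a lock.
 * Returns true when the running total reached `sample`, meaning the
 * caller should throttle; in that case the counter is wrapped to
 * (total % sample) using a compare-and-swap loop.
 */
static bool
throttle_add(int64_t increment, int64_t sample)
{
    /* atomic_fetch_add returns the previous value, so add increment back. */
    int64_t cur = atomic_fetch_add(&throttling_counter, increment) + increment;

    if (cur < sample)
        return false;

    /*
     * Try to replace cur with cur % sample. On failure the CAS reloads
     * cur with the latest value; stop once another worker has already
     * brought the counter below the sample size.
     */
    while (cur >= sample &&
           !atomic_compare_exchange_weak(&throttling_counter, &cur,
                                         cur % sample))
        ;

    return true;
}
```

With this shape, several workers can each observe the threshold being crossed and each return true, so whether that is acceptable depends on how the sleep that follows is coordinated; the sketch only addresses the counter update itself.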