Re: WIP/PoC for parallel backup
Asif Rehman <asifr.rehman@gmail.com>
From: Asif Rehman <asifr.rehman@gmail.com>
To: Robert Haas <robertmhaas@gmail.com>
Cc: Ibrar Ahmed <ibrar.ahmad@gmail.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2019-11-01T15:26:02Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Fix failures in incremental_sort due to number of workers
- 23ba3b5ee278 13.0 cited
-
In jsonb_plpython.c, suppress warning message from gcc 10.
- a06921816370 13.0 cited
-
Fix minor problems with non-exclusive backup cleanup.
- 303640199d04 13.0 cited
Attachments
- 0002-Rename-sizeonly-to-dryrun-for-few-functions-in-baseb_v4.patch (application/octet-stream) patch v4-0002
- 0001-remove-PG_ENSURE_ERROR_CLEANUP-macro-from-basebackup_v4.patch (application/octet-stream) patch v4-0001
- 0005-pg_basebackup-changes-for-parallel-backup_v4.patch (application/octet-stream) patch v4-0005
- 0004-backend-changes-for-parallel-backup_v4.patch (application/octet-stream) patch v4-0004
- 0003-Refactor-some-basebackup-code-to-increase-reusabilit_v4.patch (application/octet-stream) patch v4-0003
- 0006-parallel-backup-testcase_v4.patch (application/octet-stream) patch v4-0006
On Wed, Oct 30, 2019 at 7:16 PM Asif Rehman <asifr.rehman@gmail.com> wrote: > > > On Mon, Oct 28, 2019 at 8:29 PM Robert Haas <robertmhaas@gmail.com> wrote: > >> On Mon, Oct 28, 2019 at 10:03 AM Asif Rehman <asifr.rehman@gmail.com> >> wrote: >> > I have updated the patch to include the changes suggested by Jeevan. >> This patch also implements the thread workers instead of >> > processes and fetches a single file at a time. The tar format has been >> disabled for first version of parallel backup. >> >> Looking at 0001-0003: >> >> It's not clear to me what the purpose of the start WAL location is >> supposed to be. As far as I can see, SendBackupFiles() stores it in a >> variable which is then used for exactly nothing, and nothing else uses >> it. It seems like that would be part of a potential incremental >> backup feature, but I don't see what it's got to do with parallel full >> backup. >> > > 'startptr' is used by sendFile() during checksum verification. Since > SendBackupFiles() is using sendFIle we have to set a valid WAL location. > > >> The tablespace_path option appears entirely unused, and I don't know >> why that should be necessary here, either. >> > > This is to calculate the basepathlen. We need to exclude the tablespace > location (or > base path) from the filename before it is sent to the client with sendFile > call. I added > this option primarily to avoid performing string manipulation on filename > to extract the > tablespace location and then calculate the basepathlen. > > Alternatively we can do it by extracting the base path from the received > filename. What > do you suggest? > > >> >> STORE_BACKUPFILE() seems like maybe it should be a function rather >> than a macro, and also probably be renamed, because it doesn't store >> files and the argument's not necessarily a file. >> > Sure. > > >> >> SendBackupManifest() does not send a backup manifest in the sense >> contemplated by the email thread on that subject. It sends a file >> list. That seems like the right idea - IMHO, anyway - but you need to >> do a thorough renaming. >> > > I'm considering the following command names: > START_BACKUP > - Starts the backup process > > SEND_BACKUP_FILELIST (Instead of SEND_BACKUP_MANIFEST) > - Sends the list of all files (along with file information such as > filename, file type (directory/file/link), > file size and file mtime for each file) to be backed up. > > SEND_BACKUP_FILES > - Sends one or more files to the client. > > STOP_BACKUP > - Stops the backup process. > > I'll update the function names accordingly after your confirmation. Of > course, suggestions for > better names are welcome. > > >> >> I think it would be fine to decide that this facility won't support >> exclusive-mode backup. >> > > Sure. Will drop this patch. > > >> >> I don't think much of having both sendDir() and sendDir_(). The latter >> name is inconsistent with any naming convention we have, and there >> seems to be no reason not to just add an argument to sendDir() and >> change the callers. > > >> I think we should rename - perhaps as a preparatory patch - the >> sizeonly flag to dryrun, or something like that. >> > > Sure, will take care of it. > > >> The resource cleanup does not look right. You've included calls to >> PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, 0) in both StartBackup() >> and StopBackup(), but what happens if there is an error or even a >> clean shutdown of the connection in between? I think that there needs > > to be some change here to ensure that a walsender will always call >> base_backup_cleanup() when it exits; I think that'd probably remove >> the need for any PG_ENSURE_ERROR_CLEANUP calls at all, including ones >> we have already. This might also be something that could be done as a >> separate, prepatory refactoring patch. >> > > You're right. I didn't handle this case properly. I will removed > PG_ENSURE_ERROR_CLEANUP > calls and replace it with before_shmem_exit handler. This way > whenever backend process exits, > base_backup_cleanup will be called: > - If it exists before calling the do_pg_stop_backup, base_backup_cleanup > will take care of cleanup. > - otherwise in case of a clean shutdown (after calling do_pg_stop_backup) > then base_backup_cleanup > will simply return without doing anything. > > > The updated patches are attached. Thanks, -- Asif Rehman Highgo Software (Canada/China/Pakistan) URL : www.highgo.ca