Re: backup manifests
David Steele <david@pgmasters.net>
Commits
- Try to avoid compiler warnings in optimized builds. (05021a2c0cd2, 13.0, landed)
- Fix option related issues in pg_verifybackup. (0a89e93bfaa6, 13.0, landed)
- Add index term for backup manifest in documentation. (4db819ba4039, 13.0, landed)
- Code review for backup manifest. (a2ac73e7be7a, 13.0, landed)
- Document the backup manifest file format. (149f2ae88ab0, 13.0, landed)
- Fix typo in pg_validatebackup documentation. (c4f82a779d26, 13.0, landed)
- Exclude backup_manifest file that existed in database, from BASE_BACKUP. (1ec50a81ec0a, 13.0, landed)
- Msys2 tweaks for pg_validatebackup corruption test (c3e4cbaab936, 13.0, landed)
- Fix resource management bug with replication=database. (3e0d80fd8d3d, 13.0, cited)
- Be more careful about time_t vs. pg_time_t in basebackup.c. (db1531cae009, 13.0, cited)
- pg_validatebackup: Fix 'make clean' to remove tmp_check. (9f8f881caa0f, 13.0, landed)
- pg_validatebackup: Also use perl2host in TAP tests. (460314db08e8, 13.0, landed)
- Generate backup manifests for base backups, and validate them. (0d8c9c1210c4, 13.0, landed)
- Add checksum helper functions. (c12e43a2e0d4, 13.0, landed)
- pg_waldump: Add a --quiet option. (ac44367efbef, 13.0, landed)
- Catversion bump for b9b408c48724 (afb5465e0cfc, 13.0, cited)
- pg_basebackup: Refactor code for reading COPY and tar data. (431ba7bebf13, 13.0, landed)
- Use a ResourceOwner to track buffer pins in all cases. (3cb646264e8c, 12.0, cited)
- Use ARMv8 CRC instructions where available. (f044d71e331d, 11.0, cited)
- Logical replication support for initial data copy (7c4f52409a8c, 10.0, cited)
- Use Intel SSE 4.2 CRC instructions where available. (3dc2d62d0486, 9.5.0, cited)
- Switch to CRC-32C in WAL and other places. (5028f22f6eb0, 9.5.0, cited)
- Remove support for 64-bit CRC. (404bc51cde9d, 9.5.0, cited)
- Change CRCs in WAL records from 64bit to 32bit for performance reasons. (21fda22ec46d, 8.1.0, cited)

On 3/27/20 3:29 PM, Robert Haas wrote:
> On Fri, Mar 27, 2020 at 11:26 AM Stephen Frost <sfrost@snowman.net> wrote:
>>> Seems better to (later?) add support for generating manifests for WAL
>>> files, and then have a tool that can verify all the manifests required
>>> to restore a base backup.
>>
>> I'm not trying to expand on the feature set here or move the goalposts
>> way down the road, which is what seems to be what's being suggested
>> here. To be clear, I don't have any objection to adding a generic tool
>> for validating WAL as you're talking about here, but I also don't think
>> that's required for pg_validatebackup. What I do think we need is a
>> check of the WAL that's fetched when people use pg_basebackup -Xstream
>> or -Xfetch. pg_basebackup itself has that check because it's critical
>> to the backup being successful and valid. Not having that basic
>> validation of a backup really just isn't ok- there's a reason
>> pg_basebackup has that check.
>
> I don't understand how this could be done without significantly
> complicating the architecture. As I said before, -Xstream sends WAL
> over a separate connection that is unrelated to the one running
> BASE_BACKUP, so the base-backup connection doesn't know what to
> include in the manifest. Now you could do something like: once all of
> the WAL files have been fetched, the client checksums all of those and
> sends their names and checksums to the server, which turns around and
> puts them into the manifest, which it then sends back to the client.
> But that is actually quite a bit of additional complexity, and it's
> pretty strange, too, because now you have the client checksumming some
> files and the server checksumming others. I know you mentioned a few
> different ideas before, but I think they all kinda have some problem
> along these lines.
>
> I also kinda disagree with the idea that the WAL should be considered
> an integral part of the backup. I don't know how pgbackrest does
> things,

We checksum each WAL file while it is read and transmitted to the repo
by the archive_command. Then at the end of the backup we ensure that
all the WAL required to make the backup consistent has made it to the
repo.

> but BART stores each backup in a separate directory without any
> associated WAL, and then keeps all the WAL together in a different
> directory. I imagine that people who are using continuous archiving
> also tend to use -Xnone, or if they do backups by copying the files
> rather than using pg_backrest, they exclude pg_wal. In fact, for
> people with big, important databases, I'd assume that would be the
> normal pattern. You presumably wouldn't want to keep one copy of the
> WAL files taken during the backup with the backup itself, and a
> separate copy in the archive.

pgBackRest does provide the option to copy WAL into the backup
directory for the super-paranoid, though it is not the default. It is
pretty handy for moving individual backups to some other medium like
tape, though.

If -Xnone is specified then it seems like pg_validatebackup is
completely off the hook. But in the case of -Xstream or -Xfetch,
couldn't we at least verify that the expected WAL segments are present
and the correct size?

Storing the start/stop LSN in the manifest would be a nice thing to
have anyway, and that would make this feature pretty trivial. Yeah,
that's in the backup_label file as well, but the manifest is so much
easier to read.

Regards,
--
-David
david@pgmasters.net
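
The presence-and-size check suggested above for -Xstream/-Xfetch backups could look roughly like the sketch below. It is only an illustration and not code from pg_validatebackup or pgBackRest. It assumes the timeline and the start/stop LSN are already known (for example from backup_label or, hypothetically, from the manifest), that the default 16MB WAL segment size is in use, and that a stop LSN lying exactly on a segment boundary belongs to the previous segment. The function names are made up for this example.

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # PostgreSQL default; an assumption here


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hex notation (e.g. '0/2000028') to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def expected_segments(timeline, start_lsn, stop_lsn, seg_size=WAL_SEGMENT_SIZE):
    """Return the WAL segment file names covering start_lsn..stop_lsn."""
    segs_per_id = 0x100000000 // seg_size
    first = parse_lsn(start_lsn) // seg_size
    # Assumption: a stop LSN exactly on a boundary ends in the previous segment.
    last = (parse_lsn(stop_lsn) - 1) // seg_size
    return [
        "%08X%08X%08X" % (timeline, segno // segs_per_id, segno % segs_per_id)
        for segno in range(first, last + 1)
    ]


def check_wal(wal_dir, timeline, start_lsn, stop_lsn, seg_size=WAL_SEGMENT_SIZE):
    """Return a list of problems; an empty list means all segments look fine."""
    problems = []
    for name in expected_segments(timeline, start_lsn, stop_lsn, seg_size):
        path = os.path.join(wal_dir, name)
        if not os.path.isfile(path):
            problems.append("missing: " + name)
        elif os.path.getsize(path) != seg_size:
            problems.append("wrong size: " + name)
    return problems
```

For a backup taken with -Xstream or -Xfetch, something like `check_wal("backup/pg_wal", 1, "0/2000028", "0/2000138")` would then report any missing or short segments. It verifies presence and size only, not checksums, which matches the limited scope proposed above.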