Re: backup manifests
Andres Freund <andres@anarazel.de>
Commits
-
Try to avoid compiler warnings in optimized builds.
- 05021a2c0cd2 13.0 landed
-
Fix option related issues in pg_verifybackup.
- 0a89e93bfaa6 13.0 landed
-
Add index term for backup manifest in documentation.
- 4db819ba4039 13.0 landed
-
Code review for backup manifest.
- a2ac73e7be7a 13.0 landed
-
Document the backup manifest file format.
- 149f2ae88ab0 13.0 landed
-
Fix typo in pg_validatebackup documentation.
- c4f82a779d26 13.0 landed
-
Exclude backup_manifest file that existed in database, from BASE_BACKUP.
- 1ec50a81ec0a 13.0 landed
-
Msys2 tweaks for pg_validatebackup corruption test
- c3e4cbaab936 13.0 landed
-
Fix resource management bug with replication=database.
- 3e0d80fd8d3d 13.0 cited
-
Be more careful about time_t vs. pg_time_t in basebackup.c.
- db1531cae009 13.0 cited
-
pg_validatebackup: Fix 'make clean' to remove tmp_check.
- 9f8f881caa0f 13.0 landed
-
pg_validatebackup: Also use perl2host in TAP tests.
- 460314db08e8 13.0 landed
-
Generate backup manifests for base backups, and validate them.
- 0d8c9c1210c4 13.0 landed
-
Add checksum helper functions.
- c12e43a2e0d4 13.0 landed
-
pg_waldump: Add a --quiet option.
- ac44367efbef 13.0 landed
-
Catversion bump for b9b408c48724
- afb5465e0cfc 13.0 cited
-
pg_basebackup: Refactor code for reading COPY and tar data.
- 431ba7bebf13 13.0 landed
-
Use a ResourceOwner to track buffer pins in all cases.
- 3cb646264e8c 12.0 cited
-
Use ARMv8 CRC instructions where available.
- f044d71e331d 11.0 cited
-
Logical replication support for initial data copy
- 7c4f52409a8c 10.0 cited
-
Use Intel SSE 4.2 CRC instructions where available.
- 3dc2d62d0486 9.5.0 cited
-
Switch to CRC-32C in WAL and other places.
- 5028f22f6eb0 9.5.0 cited
-
Remove support for 64-bit CRC.
- 404bc51cde9d 9.5.0 cited
-
Change CRCs in WAL records from 64bit to 32bit for performance reasons.
- 21fda22ec46d 8.1.0 cited
Hi,

On 2020-03-27 15:29:02 -0400, Robert Haas wrote:
> On Fri, Mar 27, 2020 at 11:26 AM Stephen Frost <sfrost@snowman.net> wrote:
> > > Seems better to (later?) add support for generating manifests for WAL
> > > files, and then have a tool that can verify all the manifests required
> > > to restore a base backup.
> >
> > I'm not trying to expand on the feature set here or move the goalposts
> > way down the road, which is what seems to be what's being suggested
> > here. To be clear, I don't have any objection to adding a generic tool
> > for validating WAL as you're talking about here, but I also don't think
> > that's required for pg_validatebackup. What I do think we need is a
> > check of the WAL that's fetched when people use pg_basebackup -Xstream
> > or -Xfetch. pg_basebackup itself has that check because it's critical
> > to the backup being successful and valid. Not having that basic
> > validation of a backup really just isn't ok- there's a reason
> > pg_basebackup has that check.
>
> I don't understand how this could be done without significantly
> complicating the architecture. As I said before, -Xstream sends WAL
> over a separate connection that is unrelated to the one running
> BASE_BACKUP, so the base-backup connection doesn't know what to
> include in the manifest. Now you could do something like: once all of
> the WAL files have been fetched, the client checksums all of those and
> sends their names and checksums to the server, which turns around and
> puts them into the manifest, which it then sends back to the client.
> But that is actually quite a bit of additional complexity, and it's
> pretty strange, too, because now you have the client checksumming some
> files and the server checksumming others. I know you mentioned a few
> different ideas before, but I think they all kinda have some problem
> along these lines.

How about having separate manifests for segments? And have them stay
separate? And then have an option to verify the manifests for all the
WAL files that are required for a specific restore? The easiest way
would be to just add a separate manifest file for each segment, and name
them accordingly. But inventing a naming pattern that specifies both
start-end segments wouldn't be hard either, and result in fewer
manifests.

Base backups (in the backup sense, not for bringing up replicas etc)
without the ability to apply newer WAL are fairly pointless imo. And if
newer WAL is applied, there's not much point in just verifying the WAL
that's necessary to restore the base backup. Instead you'd want to be
able to verify all the WAL since the base backup to the "current" point
(or the next base backup).

For me having something inside pg_basebackup (or the server, for
-Xfetch) that somehow includes the WAL files in the manifest doesn't
really gain us much - it's obviously not something that'll help us to
verify all the WAL that needs to be applied (to either get the base
backup into a consistent state, or to roll forward to the desired
point).

> I also kinda disagree with the idea that the WAL should be considered
> an integral part of the backup. I don't know how pgbackrest does
> things, but BART stores each backup in a separate directly without any
> associated WAL, and then keeps all the WAL together in a different
> directory. I imagine that people who are using continuous archiving
> also tend to use -Xnone, or if they do backups by copying the files
> rather than using pg_backrest, they exclude pg_wal. In fact, for
> people with big, important databases, I'd assume that would be the
> normal pattern. You presumably wouldn't want to keep one copy of the
> WAL files taken during the backup with the backup itself, and a
> separate copy in the archive.

+1

I also don't see them as being as important, due to the already existing
checksums (which are of a much much much higher quality than what we
have for database pages, both by being wider, and by being much more
frequent in most cases).

There's obviously a need to validate the WAL in a nicer way than
scripting pg_waldump - but that seems separate anyway.

Greetings,

Andres Freund
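
The start-end naming idea from the message can be sketched roughly as
follows. This is a hypothetical illustration only: the
`<first segment>-<last segment>.manifest` file name pattern and every
function name below are assumptions made for the example, not anything
PostgreSQL or pg_validatebackup implements. It also assumes the default
16 MB WAL segment size and a single timeline.

```python
import re

# Assumption: default 16 MB WAL segments, i.e. 256 segments per "xlogid".
SEGMENTS_PER_XLOGID = 0x100000000 // (16 * 1024 * 1024)

# Hypothetical naming pattern: <first segment>-<last segment>.manifest
MANIFEST_RE = re.compile(r"^([0-9A-F]{24})-([0-9A-F]{24})\.manifest$")


def parse_segment(name):
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    tli = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return tli, log * SEGMENTS_PER_XLOGID + seg


def manifest_range(filename):
    """Return (timeline, first, last) encoded in a manifest file name."""
    m = MANIFEST_RE.match(filename)
    if m is None:
        raise ValueError(f"not a WAL manifest name: {filename!r}")
    tli_first, first = parse_segment(m.group(1))
    tli_last, last = parse_segment(m.group(2))
    if tli_first != tli_last or first > last:
        raise ValueError(f"inconsistent segment range: {filename!r}")
    return tli_first, first, last


def missing_segments(manifest_names, timeline, first_needed, last_needed):
    """List needed segment numbers that no manifest on this timeline covers."""
    covered = set()
    for name in manifest_names:
        tli, first, last = manifest_range(name)
        if tli == timeline:
            lo, hi = max(first, first_needed), min(last, last_needed)
            covered.update(range(lo, hi + 1))
    return [s for s in range(first_needed, last_needed + 1) if s not in covered]
```

A restore that needs segments 1 through 6 on timeline 1 could call
`missing_segments(names, 1, 1, 6)` and refuse to proceed if the result
is non-empty. Verifying the checksums recorded inside each manifest
would be a separate step.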