Re: backup manifests
Noah Misch <noah@leadboat.com>
Commits
-
Try to avoid compiler warnings in optimized builds.
- 05021a2c0cd2 13.0 landed
-
Fix option related issues in pg_verifybackup.
- 0a89e93bfaa6 13.0 landed
-
Add index term for backup manifest in documentation.
- 4db819ba4039 13.0 landed
-
Code review for backup manifest.
- a2ac73e7be7a 13.0 landed
-
Document the backup manifest file format.
- 149f2ae88ab0 13.0 landed
-
Fix typo in pg_validatebackup documentation.
- c4f82a779d26 13.0 landed
-
Exclude backup_manifest file that existed in database, from BASE_BACKUP.
- 1ec50a81ec0a 13.0 landed
-
Msys2 tweaks for pg_validatebackup corruption test
- c3e4cbaab936 13.0 landed
-
Fix resource management bug with replication=database.
- 3e0d80fd8d3d 13.0 cited
-
Be more careful about time_t vs. pg_time_t in basebackup.c.
- db1531cae009 13.0 cited
-
pg_validatebackup: Fix 'make clean' to remove tmp_check.
- 9f8f881caa0f 13.0 landed
-
pg_validatebackup: Also use perl2host in TAP tests.
- 460314db08e8 13.0 landed
-
Generate backup manifests for base backups, and validate them.
- 0d8c9c1210c4 13.0 landed
-
Add checksum helper functions.
- c12e43a2e0d4 13.0 landed
-
pg_waldump: Add a --quiet option.
- ac44367efbef 13.0 landed
-
Catversion bump for b9b408c48724
- afb5465e0cfc 13.0 cited
-
pg_basebackup: Refactor code for reading COPY and tar data.
- 431ba7bebf13 13.0 landed
-
Use a ResourceOwner to track buffer pins in all cases.
- 3cb646264e8c 12.0 cited
-
Use ARMv8 CRC instructions where available.
- f044d71e331d 11.0 cited
-
Logical replication support for initial data copy
- 7c4f52409a8c 10.0 cited
-
Use Intel SSE 4.2 CRC instructions where available.
- 3dc2d62d0486 9.5.0 cited
-
Switch to CRC-32C in WAL and other places.
- 5028f22f6eb0 9.5.0 cited
-
Remove support for 64-bit CRC.
- 404bc51cde9d 9.5.0 cited
-
Change CRCs in WAL records from 64bit to 32bit for performance reasons.
- 21fda22ec46d 8.1.0 cited
On Sun, Mar 29, 2020 at 08:42:35PM -0400, Robert Haas wrote:
> On Sat, Mar 28, 2020 at 11:40 PM Noah Misch <noah@leadboat.com> wrote:
> > I think this functionality doesn't belong in its own program. If you suspect
> > pg_basebackup or pg_restore will eventually gain the ability to merge
> > incremental backups into a recovery-ready base backup, I would put the
> > functionality in that program. Otherwise, I would put it in pg_checksums.
> > For me, part of the friction here is that the program description indicates
> > general verification, but the actual functionality merely checks hashes on a
> > directory tree that happens to represent a PostgreSQL base backup.
>
> Suraj's original patch made this part of pg_basebackup, but I didn't
> really like that, because I wanted it to have its own set of options.
> I still think all the options I've added are pretty useful ones, and I
> can think of other things somebody might want to do. It feels very
> uncomfortable to make pg_basebackup, or pg_checksums, take either
> options from set A and do thing X, or options from set B and do thing
> Y.

pg_checksums does already have that property, for what it's worth. (More
specifically, certain options dictate the mode, and it reports an error if
another option is incompatible with the mode.)

> But it feels clear that the name pg_validatebackup is not going
> over very well with anyone. I think I should rename it to
> pg_validatemanifest.

Between those two, I would use "pg_validatebackup" if there's a fair chance it
will end up doing the pg_waldump check. Otherwise, I would use
"pg_validatemanifest". I still most prefer delivering this as a mode of an
existing program.

> > > + parse->pathname = palloc(raw_length + 1);
> >
> > I don't see this freed anywhere; is it? (It's useful to make peak memory
> > consumption not grow in proportion to the number of files backed up.)
>
> We need the hash table to remain populated for the whole run time of
> the tool, because we're essentially doing a full join of the actual
> directory contents against the manifest contents. That's a bit
> unfortunate but it doesn't seem simple to improve. I think the only
> people who are really going to suffer are people who have an enormous
> pile of empty or nearly-empty relations. People who have large
> databases for the normal reason - i.e. a reasonable number of tables
> that hold a lot of data - will have manifests of very manageable size.

Okay.
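
For readers following the memory discussion, the "full join" of directory contents against manifest contents can be sketched roughly as below. This is an illustrative Python sketch, not the tool's actual C implementation: the function names (`load_manifest`, `sha256_of`, `full_join`), the tuple layout of the manifest entries, and the choice of SHA-256 are all assumptions made for the example. It shows why every manifest entry has to stay in memory until the directory walk finishes: only then is it known which entries were never matched by a file on disk.

```python
import hashlib
import os


def load_manifest(entries):
    """Build the in-memory table: relative path -> (size, sha256 hex digest).

    `entries` is a hypothetical iterable of (path, size, digest) tuples;
    the real manifest file format is not modeled here.
    """
    return {path: (size, digest) for path, size, digest in entries}


def sha256_of(path):
    """Hash a file in fixed-size chunks so large files are not read at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def full_join(backup_dir, manifest):
    """Compare directory contents with manifest entries in both directions."""
    problems = []
    matched = set()

    # Pass 1: every file on disk must have a matching manifest entry.
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, backup_dir).replace(os.sep, "/")
            entry = manifest.get(rel)
            if entry is None:
                problems.append((rel, "on disk but not in manifest"))
                continue
            matched.add(rel)
            size, digest = entry
            if os.path.getsize(full) != size:
                problems.append((rel, "size mismatch"))
            elif sha256_of(full) != digest:
                problems.append((rel, "checksum mismatch"))

    # Pass 2: every manifest entry must have been seen on disk. This is the
    # step that requires the whole table to remain populated until the end.
    for rel in manifest:
        if rel not in matched:
            problems.append((rel, "in manifest but not on disk"))

    return sorted(problems)
```

In this sketch, memory use grows with the number of manifest entries rather than with the amount of data, which matches the observation quoted above that a very large number of small relations is the costly case.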