Re: backup manifests
Robert Haas <robertmhaas@gmail.com>
Commits
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Try to avoid compiler warnings in optimized builds.
- 05021a2c0cd2 13.0 landed
-
Fix option related issues in pg_verifybackup.
- 0a89e93bfaa6 13.0 landed
-
Add index term for backup manifest in documentation.
- 4db819ba4039 13.0 landed
-
Code review for backup manifest.
- a2ac73e7be7a 13.0 landed
-
Document the backup manifest file format.
- 149f2ae88ab0 13.0 landed
-
Fix typo in pg_validatebackup documentation.
- c4f82a779d26 13.0 landed
-
Exclude backup_manifest file that existed in database, from BASE_BACKUP.
- 1ec50a81ec0a 13.0 landed
-
Msys2 tweaks for pg_validatebackup corruption test
- c3e4cbaab936 13.0 landed
-
Fix resource management bug with replication=database.
- 3e0d80fd8d3d 13.0 cited
-
Be more careful about time_t vs. pg_time_t in basebackup.c.
- db1531cae009 13.0 cited
-
pg_validatebackup: Fix 'make clean' to remove tmp_check.
- 9f8f881caa0f 13.0 landed
-
pg_validatebackup: Also use perl2host in TAP tests.
- 460314db08e8 13.0 landed
-
Generate backup manifests for base backups, and validate them.
- 0d8c9c1210c4 13.0 landed
-
Add checksum helper functions.
- c12e43a2e0d4 13.0 landed
-
pg_waldump: Add a --quiet option.
- ac44367efbef 13.0 landed
-
Catversion bump for b9b408c48724
- afb5465e0cfc 13.0 cited
-
pg_basebackup: Refactor code for reading COPY and tar data.
- 431ba7bebf13 13.0 landed
-
Use a ResourceOwner to track buffer pins in all cases.
- 3cb646264e8c 12.0 cited
-
Use ARMv8 CRC instructions where available.
- f044d71e331d 11.0 cited
-
Logical replication support for initial data copy
- 7c4f52409a8c 10.0 cited
-
Use Intel SSE 4.2 CRC instructions where available.
- 3dc2d62d0486 9.5.0 cited
-
Switch to CRC-32C in WAL and other places.
- 5028f22f6eb0 9.5.0 cited
-
Remove support for 64-bit CRC.
- 404bc51cde9d 9.5.0 cited
-
Change CRCs in WAL records from 64bit to 32bit for performance reasons.
- 21fda22ec46d 8.1.0 cited
Attachments
- v15-0001-Add-checksum-helper-functions.patch (application/octet-stream) patch v15-0001
- v15-0002-Generate-backup-manifests-for-base-backups-and-v.patch (application/octet-stream) patch v15-0002
On Fri, Mar 27, 2020 at 3:51 PM David Steele <david@pgmasters.net> wrote:
> There appear to be conflicts with 67e0adfb3f98:
Rebased.
> > + Specifies the algorithm that should be used to checksum
> each file
> > + for purposes of the backup manifest. Currently, the available
>
> perhaps "for inclusion in the backup manifest"? Anyway, I think this
> sentence is awkward.
I changed it to "Specifies the checksum algorithm that should be
applied to each file included in the backup manifest." I hope that's
better. I also added, in both of the places where this text occurs, an
explanation a little higher up of what a backup manifest actually is.
> > + because the files themselves do not need to read.
>
> should be "need to be read".
Fixed.
> > + the manifest itself will always contain a
> <literal>SHA256</literal>
>
> I think just "the manifest will always contain" is fine.
OK.
> > + manifeste itself, and is therefore ignored. Note that the
> manifest
>
> typo "manifeste", perhaps remove itself.
OK, fixed.
> > { "Path": "backup_label", "Size": 224, "Last-Modified": "2020-03-27
> 18:33:18 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "b914bec9" },
>
> Storing the checksum type with each file seems pretty redundant.
> Perhaps that could go in the header? You could always override if a
> specific file had a different checksum type, though that seems unlikely.
>
> In general it might be good to go with shorter keys: "mod", "chk", etc.
> Manifests can get pretty big and that's a lot of extra bytes.
>
> I'm also partial to using epoch time in the manifest because it is
> generally easier for programs to work with. But, human-readable doesn't
> suck, either.
It doesn't seem impossible for it to come up; for example, consider a
file-level incremental backup facility. You might retain whatever
checksums you have for the unchanged files (to avoid rereading them)
and add checksums for modified or added files.
I am not convinced that minimizing the size of the file here is a
particularly important goal, because I don't think it's going to get
that big in normal cases. I also think having the keys and values be
easily understandable by human being is a plus. If we really want a
minimal format without redundancy, we should've gone with what I
proposed before (though admittedly that could've been tamped down even
further if we'd cared to squeeze, which I didn't think was important
then either).
>
> > if (maxrate > 0)
> > maxrate_clause = psprintf("MAX_RATE %u", maxrate);
> > + if (manifest)
>
> A linefeed here would be nice.
Added.
> > + manifestfile *tabent;
>
> This is an odd name. A holdover from the tab-delimited version?
No, it was meant to stand for table entry. (Now we find out what
happens when I break my own rule against using abbreviated words.)
> > + printf(_("Usage:\n %s [OPTION]... BACKUPDIR\n\n"), progname);
>
> When I ran pg_validatebackup I expected to use -D to specify the backup
> dir since pg_basebackup does. On the other hand -D is weird because I
> *really* expect that to be the pg data dir.
>
> But, do we want this to be different from pg_basebackup?
I think it's pretty distinguishable, because pg_basebackup needs an
input (server) and an output (directory), whereas pg_validatebackup
only needs one. I don't really care if we want to change it, but I was
thinking of this as being more analogous to, say, pg_resetwal.
Granted, that's a danger-don't-use-this tool and this isn't, but I
don't think we want the -D-is-optional behavior that tools like pg_ctl
have, because having a tool that isn't supposed to be used on a
running cluster default to $PGDATA seems inadvisable. And if the
argument is mandatory then it's not clear to me why we should make
people type -D in front of it.
> > + checksum_length = checksum_string_length / 2;
>
> This check is defeated if a single character is added the to checksum.
>
> Not too big a deal since you still get an error, but still.
I don't see what the problem is here. We speculatively divide by two
and allocate memory assuming the value that it was even, but then
before doing anything critical we bail out if it was actually odd.
That's harmless. We could get around it by saying:
if (checksum_string_length % 2 != 0)
context->error_cb(...);
checksum_length = checksum_string_length / 2;
checksum_payload = palloc(checksum_length);
if (!hexdecode_string(...))
context->error_cb(...);
...but that would be adding additional code, and error messages, for
what's basically a can't-happen-unless-the-user-is-messing-with-us
case.
> > + * Verify that the manifest checksum is correct.
>
> This is not working the way I would expect -- I could freely modify the
> manifest without getting a checksum error on the manifest. For example:
>
> $ /home/vagrant/test/pg/bin/pg_validatebackup test/backup3
> pg_validatebackup: fatal: invalid checksum for file "backup_label":
> "408901e0814f40f8ceb7796309a59c7248458325a21941e7c55568e381f53831?"
>
> So, if I deleted the entry above, I got a manifest checksum error. But
> if I just modified the checksum I get a file checksum error with no
> manifest checksum error.
>
> I would prefer a manifest checksum error in all cases where it is wrong,
> unless --exit-on-error is specified.
I think I would too, but I'm confused as to what you're doing, because
if I just modified the manifest -- by deleting a file, for example, or
changing the checksum of a file, I just get:
pg_validatebackup: fatal: manifest checksum mismatch
I'm confused as to why you're not seeing that. What's the exact
sequence of steps?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company