Re: backup manifests

David Steele <david@pgmasters.net>

From: David Steele <david@pgmasters.net>

To: Robert Haas <robertmhaas@gmail.com>

Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>

Date: 2019-09-20T23:11:47Z

Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →

Try to avoid compiler warnings in optimized builds.
- 05021a2c0cd2 13.0 landed
Fix option related issues in pg_verifybackup.
- 0a89e93bfaa6 13.0 landed
Add index term for backup manifest in documentation.
- 4db819ba4039 13.0 landed
Code review for backup manifest.
- a2ac73e7be7a 13.0 landed
Document the backup manifest file format.
- 149f2ae88ab0 13.0 landed
Fix typo in pg_validatebackup documentation.
- c4f82a779d26 13.0 landed
Exclude backup_manifest file that existed in database, from BASE_BACKUP.
- 1ec50a81ec0a 13.0 landed
Msys2 tweaks for pg_validatebackup corruption test
- c3e4cbaab936 13.0 landed
Fix resource management bug with replication=database.
- 3e0d80fd8d3d 13.0 cited
Be more careful about time_t vs. pg_time_t in basebackup.c.
- db1531cae009 13.0 cited
pg_validatebackup: Fix 'make clean' to remove tmp_check.
- 9f8f881caa0f 13.0 landed
pg_validatebackup: Also use perl2host in TAP tests.
- 460314db08e8 13.0 landed
Generate backup manifests for base backups, and validate them.
- 0d8c9c1210c4 13.0 landed
Add checksum helper functions.
- c12e43a2e0d4 13.0 landed
pg_waldump: Add a --quiet option.
- ac44367efbef 13.0 landed
Catversion bump for b9b408c48724
- afb5465e0cfc 13.0 cited
pg_basebackup: Refactor code for reading COPY and tar data.
- 431ba7bebf13 13.0 landed
Use a ResourceOwner to track buffer pins in all cases.
- 3cb646264e8c 12.0 cited
Use ARMv8 CRC instructions where available.
- f044d71e331d 11.0 cited
Logical replication support for initial data copy
- 7c4f52409a8c 10.0 cited
Use Intel SSE 4.2 CRC instructions where available.
- 3dc2d62d0486 9.5.0 cited
Switch to CRC-32C in WAL and other places.
- 5028f22f6eb0 9.5.0 cited
Remove support for 64-bit CRC.
- 404bc51cde9d 9.5.0 cited
Change CRCs in WAL records from 64bit to 32bit for performance reasons.
- 21fda22ec46d 8.1.0 cited

On 9/20/19 2:55 PM, Robert Haas wrote:
> On Fri, Sep 20, 2019 at 11:09 AM David Steele <david@pgmasters.net> wrote:
>>
>> It sucks to make that a prereq for this project but the longer we kick
>> that can down the road...
> 
> There are no doubt many patches that would benefit from having more
> backend infrastructure exposed in frontend contexts, and I think we're
> slowly moving in that direction, but I generally do not believe in
> burdening feature patches with major infrastructure improvements.

The hardest part about technical debt is knowing when to incur it.  It
is never a cut-and-dried choice.

>> This talk was good fun.  The largest number of tables we've seen is a
>> few hundred thousand, but that still adds up to more than a million
>> files to backup.
> 
> A quick survey of some of my colleagues turned up a few examples of
> people with 2-4 million files to backup, so similar kind of ballpark.
> Probably not big enough for the manifest to hit the 1GB mark, but
> getting close.

I have so many doubts about clusters with this many tables, but we do
support it, so...

>>> I hear you saying that this is going to end up being just as complex
>>> in the end, but I don't think I believe it.  It sounds to me like the
>>> difference between spending a couple of hours figuring this out and
>>> spending a couple of months trying to figure it out and maybe not
>>> actually getting anywhere.
>>
>> Maybe the initial implementation will be easier but I am confident we'll
>> pay for it down the road.  Also, don't we want users to be able to read
>> this file?  Do we really want them to need to cook up a custom parser in
>> Perl, Go, Python, etc.?
> 
> Well, I haven't heard anybody complain that they can't read a
> backup_label file because it's too hard to cook up a parser.  And I
> think the reason is pretty clear: such files are not hard to parse.
> Similarly for a pg_hba.conf file.  This case is a little more
> complicated than those, but AFAICS, not enormously so. Actually, it
> seems like a combination of those two cases: it has some fixed
> metadata fields that can be represented with one line per field, like
> a backup_label, and then a bunch of entries for files that are
> somewhat like entries in a pg_hba.conf file, in that they can be
> represented by a line per record with a certain number of fields on
> each line.

Yeah, they are not hard to parse, but *everyone* has to cook up code for
it.  A bit of a bummer, that.

> I attach here a couple of patches.  The first one does some
> refactoring of relevant code in pg_basebackup, and the second one adds
> checksum manifests using a format that I pulled out of my ear. It
> probably needs some adjustment but I don't think it's crazy.  Each
> file gets a line that looks like this:
> 
> File $FILENAME $FILESIZE $FILEMTIME $FILECHECKSUM

We also include page checksum validation failures in the file record.
Not critical for the first pass, perhaps, but something to keep in mind.

> Right now, the file checksums are computed using SHA-256 but it could
> be changed to anything else for which we've got code. On my system,
> shasum -a256 $FILE produces the same answer that shows up here.  At
> the bottom of the manifest there's a checksum of the manifest itself,
> which looks like this:
> 
> Manifest-Checksum
> 385fe156a8c6306db40937d59f46027cc079350ecf5221027d71367675c5f781
> 
> That's a SHA-256 checksum of the file contents excluding the final
> line. It can be verified by feeding all the file contents except the
> last line to shasum -a256. I can't help but observe that if the file
> were defined to be a JSONB blob, it's not very clear how you would
> include a checksum of the blob contents in the blob itself, but with a
> format based on a bunch of lines of data, it's super-easy to generate
> and super-easy to write tools that verify it.

You can do this in JSON pretty easily by handling the terminating
brace/bracket:

{
<some json contents>*,
"checksum":<sha256>
}

But of course a linefeed-delimited file is even easier.

> This is just a prototype so I haven't written a verification tool, and
> there's a bunch of testing and documentation and so forth that would
> need to be done aside from whatever we've got to hammer out in terms
> of design issues and file formats.  But I think it's cool, and perhaps
> some discussion of how it could be evolved will get us closer to a
> resolution everybody can at least live with.

I had a quick look and it seems pretty reasonable.  I'll need to
generate a manifest to see if I can spot any obvious gotchas.

-- 
-David
david@pgmasters.net