Re: backup manifests

David Steele <david@pgmasters.net>

From: David Steele <david@pgmasters.net>
To: Robert Haas <robertmhaas@gmail.com>
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Date: 2019-09-20T23:11:47Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Try to avoid compiler warnings in optimized builds.

  2. Fix option related issues in pg_verifybackup.

  3. Add index term for backup manifest in documentation.

  4. Code review for backup manifest.

  5. Document the backup manifest file format.

  6. Fix typo in pg_validatebackup documentation.

  7. Exclude backup_manifest file that existed in database, from BASE_BACKUP.

  8. Msys2 tweaks for pg_validatebackup corruption test

  9. Fix resource management bug with replication=database.

  10. Be more careful about time_t vs. pg_time_t in basebackup.c.

  11. pg_validatebackup: Fix 'make clean' to remove tmp_check.

  12. pg_validatebackup: Also use perl2host in TAP tests.

  13. Generate backup manifests for base backups, and validate them.

  14. Add checksum helper functions.

  15. pg_waldump: Add a --quiet option.

  16. Catversion bump for b9b408c48724

  17. pg_basebackup: Refactor code for reading COPY and tar data.

  18. Use a ResourceOwner to track buffer pins in all cases.

  19. Use ARMv8 CRC instructions where available.

  20. Logical replication support for initial data copy

  21. Use Intel SSE 4.2 CRC instructions where available.

  22. Switch to CRC-32C in WAL and other places.

  23. Remove support for 64-bit CRC.

  24. Change CRCs in WAL records from 64bit to 32bit for performance reasons.

On 9/20/19 2:55 PM, Robert Haas wrote:
> On Fri, Sep 20, 2019 at 11:09 AM David Steele <david@pgmasters.net> wrote:
>>
>> It sucks to make that a prereq for this project but the longer we kick
>> that can down the road...
> 
> There are no doubt many patches that would benefit from having more
> backend infrastructure exposed in frontend contexts, and I think we're
> slowly moving in that direction, but I generally do not believe in
> burdening feature patches with major infrastructure improvements.

The hardest part about technical debt is knowing when to incur it.  It
is never a cut-and-dried choice.

>> This talk was good fun.  The largest number of tables we've seen is a
>> few hundred thousand, but that still adds up to more than a million
>> files to backup.
> 
> A quick survey of some of my colleagues turned up a few examples of
> people with 2-4 million files to backup, so similar kind of ballpark.
> Probably not big enough for the manifest to hit the 1GB mark, but
> getting close.

I have so many doubts about clusters with this many tables, but we do
support it, so...

>>> I hear you saying that this is going to end up being just as complex
>>> in the end, but I don't think I believe it.  It sounds to me like the
>>> difference between spending a couple of hours figuring this out and
>>> spending a couple of months trying to figure it out and maybe not
>>> actually getting anywhere.
>>
>> Maybe the initial implementation will be easier but I am confident we'll
>> pay for it down the road.  Also, don't we want users to be able to read
>> this file?  Do we really want them to need to cook up a custom parser in
>> Perl, Go, Python, etc.?
> 
> Well, I haven't heard anybody complain that they can't read a
> backup_label file because it's too hard to cook up a parser.  And I
> think the reason is pretty clear: such files are not hard to parse.
> Similarly for a pg_hba.conf file.  This case is a little more
> complicated than those, but AFAICS, not enormously so. Actually, it
> seems like a combination of those two cases: it has some fixed
> metadata fields that can be represented with one line per field, like
> a backup_label, and then a bunch of entries for files that are
> somewhat like entries in a pg_hba.conf file, in that they can be
> represented by a line per record with a certain number of fields on
> each line.

Yeah, they are not hard to parse, but *everyone* has to cook up code for
it.  A bit of a bummer, that.

> I attach here a couple of patches.  The first one does some
> refactoring of relevant code in pg_basebackup, and the second one adds
> checksum manifests using a format that I pulled out of my ear. It
> probably needs some adjustment but I don't think it's crazy.  Each
> file gets a line that looks like this:
> 
> File $FILENAME $FILESIZE $FILEMTIME $FILECHECKSUM

We also include page checksum validation failures in the file record.
Not critical for the first pass, perhaps, but something to keep in mind.

> Right now, the file checksums are computed using SHA-256 but it could
> be changed to anything else for which we've got code. On my system,
> shasum -a256 $FILE produces the same answer that shows up here.  At
> the bottom of the manifest there's a checksum of the manifest itself,
> which looks like this:
> 
> Manifest-Checksum
> 385fe156a8c6306db40937d59f46027cc079350ecf5221027d71367675c5f781
> 
> That's a SHA-256 checksum of the file contents excluding the final
> line. It can be verified by feeding all the file contents except the
> last line to shasum -a256. I can't help but observe that if the file
> were defined to be a JSONB blob, it's not very clear how you would
> include a checksum of the blob contents in the blob itself, but with a
> format based on a bunch of lines of data, it's super-easy to generate
> and super-easy to write tools that verify it.

You can do this in JSON pretty easily by handling the terminating
brace/bracket:

{
<some json contents>*,
"checksum":<sha256>
}

But of course a linefeed-delimited file is even easier.

> This is just a prototype so I haven't written a verification tool, and
> there's a bunch of testing and documentation and so forth that would
> need to be done aside from whatever we've got to hammer out in terms
> of design issues and file formats.  But I think it's cool, and perhaps
> some discussion of how it could be evolved will get us closer to a
> resolution everybody can at least live with.

I had a quick look and it seems pretty reasonable.  I'll need to
generate a manifest to see if I can spot any obvious gotchas.

-- 
-David
david@pgmasters.net