Re: backup manifests

Stephen Frost <sfrost@snowman.net>

From: Stephen Frost <sfrost@snowman.net>
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: Noah Misch <noah@leadboat.com>, Andres Freund <andres@anarazel.de>, Robert Haas <robertmhaas@gmail.com>, David Steele <david@pgmasters.net>, Suraj Kharage <suraj.kharage@enterprisedb.com>, tushar <tushar.ahuja@enterprisedb.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Rushabh Lathia <rushabh.lathia@gmail.com>, Tels <nospam-pg-abuse@bloodgate.com>, Andrew Dunstan <andrew.dunstan@2ndquadrant.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, vignesh C <vignesh21@gmail.com>
Date: 2020-03-31T11:58:15Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Try to avoid compiler warnings in optimized builds.

  2. Fix option related issues in pg_verifybackup.

  3. Add index term for backup manifest in documentation.

  4. Code review for backup manifest.

  5. Document the backup manifest file format.

  6. Fix typo in pg_validatebackup documentation.

  7. Exclude backup_manifest file that existed in database, from BASE_BACKUP.

  8. Msys2 tweaks for pg_validatebackup corruption test

  9. Fix resource management bug with replication=database.

  10. Be more careful about time_t vs. pg_time_t in basebackup.c.

  11. pg_validatebackup: Fix 'make clean' to remove tmp_check.

  12. pg_validatebackup: Also use perl2host in TAP tests.

  13. Generate backup manifests for base backups, and validate them.

  14. Add checksum helper functions.

  15. pg_waldump: Add a --quiet option.

  16. Catversion bump for b9b408c48724

  17. pg_basebackup: Refactor code for reading COPY and tar data.

  18. Use a ResourceOwner to track buffer pins in all cases.

  19. Use ARMv8 CRC instructions where available.

  20. Logical replication support for initial data copy

  21. Use Intel SSE 4.2 CRC instructions where available.

  22. Switch to CRC-32C in WAL and other places.

  23. Remove support for 64-bit CRC.

  24. Change CRCs in WAL records from 64bit to 32bit for performance reasons.

Greetings,

* Amit Kapila (amit.kapila16@gmail.com) wrote:
> On Tue, Mar 31, 2020 at 11:10 AM Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Mar 30, 2020 at 12:16:31PM -0700, Andres Freund wrote:
> > > On 2020-03-30 15:04:55 -0400, Robert Haas wrote:
> > > > I guess I'd like to be clear here that I have no fundamental
> > > > disagreement with taking this tool in any direction that people would
> > > > like it to go. For me it's just a question of timing. Feature freeze
> > > > is now a week or so away, and nothing complicated is going to get done
> > > > in that time. If we can all agree on something simple based on
> > > > Andres's recent proposal, cool, but I'm not yet sure that will be the
> > > > case, so what's plan B? We could decide that what I have here is just
> > > > too little to be a viable facility on its own, but I think Stephen is
> > > > the only one taking that position. We could release it as
> > > > pg_validatemanifest with a plan to rename it if other backup-related
> > > > checks are added later. We could release it as pg_validatebackup with
> > > > the idea to avoid having to rename it when more backup-related checks
> > > > are added later, but with a greater possibility of confusion in the
> > > > meantime and no hard guarantee that anyone will actually develop such
> > > > checks. We could put it in to pg_checksums, but I think that's really
> > > > backing ourselves into a corner: if backup validation develops other
> > > > checks that are not checksum-related, what then? I'd much rather
> > > > gamble on keeping things together by topic (backup) than technology
> > > > used internally (checksum). Putting it into pg_basebackup is another
> > > > option, and would avoid that problem, but it's not my preferred
> > > > option, because as I noted before, I think the command-line options
> > > > will get confusing.
> > >
> > > I'm mildly inclined to name it pg_validate, pg_validate_dbdir or
> > > such. And eventually (definitely not this release) subsume pg_checksums
> > > in it. That way we can add other checkers too.
> >
> > Works for me; of those two, I prefer pg_validate.
> 
> pg_validate sounds like a tool with a much bigger purpose.  I think
> even things like amcheck could also fall under it.

Yeah, I tend to agree with this.

> This patch has two parts (a) Generate backup manifests for base
> backups, and (b) Validate backup (manifest).  It seems to me that
> there are not many things pending for (a), can't we commit that first
> or is it the case that (a) depends on (b)?  This is *not* a suggestion
> to leave pg_validatebackup from this release rather just to commit if
> something is ready and meaningful on its own.

I suspect the idea here is that we don't really want to commit something
that nothing is actually using, and that's understandable and justified
here- consider that even in this recent discussion there was talk that
maybe we should have included permissions and ownership in the manifest,
or starting and ending WAL positions, so that they'd be able to be
checked by this tool more easily (and because it's just useful to have
all that info in one place...  I don't really agree with the concerns
that it's an issue for static information like that to be duplicated).

In other words, while the manifest creation code might be something we
could commit, without a tool to use it (which does all the things that
we think it needs to, to perform some high-level task, such as "validate
a backup") we don't know that the manifest that's actually generated is
really up to snuff and has what it needs to have to perform that task.

I had been hoping that the discussion Andres was leading regarding
leveraging pg_waldump (or maybe just code from it..) would get us to a
point where pg_validatebackup would check that we have all of the WAL
needed for the backup to be consistent and that it would then verify the
internal checksums of the WAL.  That would certainly be a good solution
for this time around, in my view, and is already all existing
client-side code.  I do think we'd want to have a note about how we
verify pg_wal differently from the other files which are in the
manifest.

Thanks,

Stephen