Re: backup manifests

Andres Freund <andres@anarazel.de>

From: Andres Freund <andres@anarazel.de>
To: Robert Haas <robertmhaas@gmail.com>
Cc: David Steele <david@pgmasters.net>, Noah Misch <noah@leadboat.com>, Stephen Frost <sfrost@snowman.net>, Amit Kapila <amit.kapila16@gmail.com>, Suraj Kharage <suraj.kharage@enterprisedb.com>, tushar <tushar.ahuja@enterprisedb.com>, Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>, Rushabh Lathia <rushabh.lathia@gmail.com>, Tels <nospam-pg-abuse@bloodgate.com>, Andrew Dunstan <andrew.dunstan@2ndquadrant.com>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, Jeevan Chalke <jeevan.chalke@enterprisedb.com>, vignesh C <vignesh21@gmail.com>
Date: 2020-03-31T22:50:34Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Try to avoid compiler warnings in optimized builds.

  2. Fix option related issues in pg_verifybackup.

  3. Add index term for backup manifest in documentation.

  4. Code review for backup manifest.

  5. Document the backup manifest file format.

  6. Fix typo in pg_validatebackup documentation.

  7. Exclude backup_manifest file that existed in database, from BASE_BACKUP.

  8. Msys2 tweaks for pg_validatebackup corruption test

  9. Fix resource management bug with replication=database.

  10. Be more careful about time_t vs. pg_time_t in basebackup.c.

  11. pg_validatebackup: Fix 'make clean' to remove tmp_check.

  12. pg_validatebackup: Also use perl2host in TAP tests.

  13. Generate backup manifests for base backups, and validate them.

  14. Add checksum helper functions.

  15. pg_waldump: Add a --quiet option.

  16. Catversion bump for b9b408c48724

  17. pg_basebackup: Refactor code for reading COPY and tar data.

  18. Use a ResourceOwner to track buffer pins in all cases.

  19. Use ARMv8 CRC instructions where available.

  20. Logical replication support for initial data copy

  21. Use Intel SSE 4.2 CRC instructions where available.

  22. Switch to CRC-32C in WAL and other places.

  23. Remove support for 64-bit CRC.

  24. Change CRCs in WAL records from 64bit to 32bit for performance reasons.

Hi,

On 2020-03-31 14:10:34 -0400, Robert Haas wrote:
> I made an attempt to implement this.

Awesome!


> In the attached patch set, 0001 I'm going to work on those things. I
> would appreciate *very timely* feedback on anything people do or do
> not like about this, because I want to commit this patch set by the
> end of the work week and that isn't very far away. I would also
> appreciate if people would bear in mind the principle that half a loaf
> is better than none, and further improvements can be made in future
> releases.
> 
> As part of my light testing, I tried promoting a standby that was
> running pg_basebackup, and found that pg_basebackup failed like this:
> 
> pg_basebackup: error: could not get COPY data stream: ERROR:  the
> standby was promoted during online backup
> HINT:  This means that the backup being taken is corrupt and should
> not be used. Try taking another online backup.
> pg_basebackup: removing data directory "/Users/rhaas/pgslave2"
> 
> My first thought was that this error message is hard to reconcile with
> this comment:
> 
>         /*
>          * Send timeline history files too. Only the latest timeline history
>          * file is required for recovery, and even that only if there happens
>          * to be a timeline switch in the first WAL segment that contains the
>          * checkpoint record, or if we're taking a base backup from a standby
>          * server and the target timeline changes while the backup is taken.
>          * But they are small and highly useful for debugging purposes, so
>          * better include them all, always.
>          */
> 
> But then it occurred to me that this might be a cascading standby.

Yea. The check just prevents the walsender's database from being
promoted:

		/*
		 * Check if the postmaster has signaled us to exit, and abort with an
		 * error in that case. The error handler further up will call
		 * do_pg_abort_backup() for us. Also check that if the backup was
		 * started while still in recovery, the server wasn't promoted.
		 * do_pg_stop_backup() will check that too, but it's better to stop
		 * the backup early than continue to the end and fail there.
		 */
		CHECK_FOR_INTERRUPTS();
		if (RecoveryInProgress() != backup_started_in_recovery)
			ereport(ERROR,
					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
					 errmsg("the standby was promoted during online backup"),
					 errhint("This means that the backup being taken is corrupt "
							 "and should not be used. "
							 "Try taking another online backup.")));
and

	if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("the standby was promoted during online backup"),
				 errhint("This means that the backup being taken is corrupt "
						 "and should not be used. "
						 "Try taking another online backup.")));

So that just prevents promotions of the current node, afaict.



> Regardless, it seems like a really good idea to store a list of WAL
> ranges rather than a single start/end/timeline, because even if it's
> impossible today it might become possible in the future.

Indeed.


> Still, unless there's an easy way to set up a test scenario where
> multiple WAL ranges need to be verified, it may be hard to test that
> this code actually behaves properly.

I think it'd be possible to test without a fully cascading setup, by
creating an initial base backup, then do some work to create a bunch of
new timelines, and then start the initial base backup. That'd have to
follow all those timelines.  Not sure that's better than a cascading
setup though.


> +/*
> + * Add information about the WAL that will need to be replayed when restoring
> + * this backup to the manifest.
> + */
> +static void
> +AddWALInfoToManifest(manifest_info *manifest, XLogRecPtr startptr,
> +					 TimeLineID starttli, XLogRecPtr endptr, TimeLineID endtli)
> +{
> +	List *timelines = readTimeLineHistory(endtli);

should probably happen after the manifest->buffile check.


> +	ListCell *lc;
> +	bool	first_wal_range = true;
> +	bool	found_ending_tli = false;
> +
> +	/* If there is no buffile, then the user doesn't want a manifest. */
> +	if (manifest->buffile == NULL)
> +		return;

Not really about this patch/function specifically: I wonder if this'd
look better if you added ManifestEnabled() macro instead of repeating
the comment repeatedly.



> +	/* Unless --no-parse-wal was specified, we will need pg_waldump. */
> +	if (!no_parse_wal)
> +	{
> +		int		ret;
> +
> +		pg_waldump_path = pg_malloc(MAXPGPATH);
> +		ret = find_other_exec(argv[0], "pg_waldump",
> +							  "pg_waldump (PostgreSQL) " PG_VERSION "\n",
> +							 pg_waldump_path);
> +		if (ret < 0)
> +		{
> +			char	full_path[MAXPGPATH];
> +
> +			if (find_my_exec(argv[0], full_path) < 0)
> +				strlcpy(full_path, progname, sizeof(full_path));
> +			if (ret == -1)
> +				pg_log_fatal("The program \"%s\" is needed by %s but was\n"
> +							 "not found in the same directory as \"%s\".\n"
> +							 "Check your installation.",
> +							 "pg_waldump", "pg_validatebackup", full_path);
> +			else
> +				pg_log_fatal("The program \"%s\" was found by \"%s\" but was\n"
> +							 "not the same version as %s.\n"
> +							 "Check your installation.",
> +							 "pg_waldump", full_path, "pg_validatebackup");
> +		}
> +	}

ISTM, and this can definitely wait for another time, that we should have
one wrapper doing all of this, instead of having quite a few copies of
very similar logic to the above.


> +/*
> + * Attempt to parse the WAL files required to restore from backup using
> + * pg_waldump.
> + */
> +static void
> +parse_required_wal(validator_context *context, char *pg_waldump_path,
> +				   char *wal_directory, manifest_wal_range *first_wal_range)
> +{
> +	manifest_wal_range *this_wal_range = first_wal_range;
> +
> +	while (this_wal_range != NULL)
> +	{
> +		char *pg_waldump_cmd;
> +
> +		pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%X --end=%X/%X\n",
> +			   pg_waldump_path, wal_directory, this_wal_range->tli,
> +			   (uint32) (this_wal_range->start_lsn >> 32),
> +			   (uint32) this_wal_range->start_lsn,
> +			   (uint32) (this_wal_range->end_lsn >> 32),
> +			   (uint32) this_wal_range->end_lsn);
> +		if (system(pg_waldump_cmd) != 0)
> +			report_backup_error(context,
> +								"WAL parsing failed for timeline %u",
> +								this_wal_range->tli);
> +
> +		this_wal_range = this_wal_range->next;
> +	}
> +}

Should we have a function to properly escape paths in cases like this?
Not that it's likely or really problematic, but the quoting for path
could be "circumvented".


Greetings,

Andres Freund