Filename: 0007-parallel-backup-documentation_v7.patch
Type: application/octet-stream
Part: 0
Message:
Re: WIP/PoC for parallel backup
Format: format-patch
Series: patch v7-0007
Subject: parallel backup documentation
From 63952eafd3d2dbda70535048dbed2815fc75c3d0 Mon Sep 17 00:00:00 2001
From: Asif Rehman <asif.rehman@highgo.ca>
Date: Thu, 7 Nov 2019 16:52:40 +0500
Subject: [PATCH 7/7] parallel backup documentation
---
doc/src/sgml/protocol.sgml | 386 ++++++++++++++++++++++++++++
doc/src/sgml/ref/pg_basebackup.sgml | 20 ++
2 files changed, 406 insertions(+)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 80275215e0..d582209229 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2700,6 +2700,392 @@ The commands accepted in replication mode are:
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>START_BACKUP</literal>
+ [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ]
+ [ <literal>PROGRESS</literal> ]
+ [ <literal>FAST</literal> ]
+ [ <literal>TABLESPACE_MAP</literal> ]
+
+ <indexterm><primary>START_BACKUP</primary></indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Instructs the server to prepare for performing an on-line backup. The following
+ options are accepted:
+ <variablelist>
+ <varlistentry>
+ <term><literal>LABEL</literal> <replaceable>'label'</replaceable></term>
+ <listitem>
+ <para>
+ Sets the label of the backup. If none is specified, a backup label
+ of <literal>start backup</literal> will be used. The quoting rules
+ for the label are the same as a standard SQL string with
+ <xref linkend="guc-standard-conforming-strings"/> turned on.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>PROGRESS</literal></term>
+ <listitem>
+ <para>
+ Request information required to generate a progress report. This will
+ send back an approximate size in the header of each tablespace, which
+ can be used to calculate how far along the stream is done. This is
+ calculated by enumerating all the file sizes once before the transfer
+ is even started, and might as such have a negative impact on the
+ performance. In particular, it might take longer before the first data
+ is streamed. Since the database files can change during the backup,
+ the size is only approximate and might both grow and shrink between
+ the time of approximation and the sending of the actual files.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>FAST</literal></term>
+ <listitem>
+ <para>
+ Request a fast checkpoint.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>TABLESPACE_MAP</literal></term>
+ <listitem>
+ <para>
+ Include information about symbolic links present in the directory
+ <filename>pg_tblspc</filename> in a file named
+ <filename>tablespace_map</filename>. The tablespace map file includes
+ each symbolic link name as it exists in the directory
+ <filename>pg_tblspc/</filename> and the full path of that symbolic link.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ In response to this command, the server will send three result sets.
+ </para>
+ <para>
+ The first ordinary result set contains the starting position of the
+ backup, in a single row with two columns. The first column contains
+ the start position given in XLogRecPtr format, and the second column
+ contains the corresponding timeline ID.
+ </para>
+
+ <para>
+ The second ordinary result set has one row for each tablespace.
+ The fields in this row are:
+ <variablelist>
+ <varlistentry>
+ <term><literal>spcoid</literal> (<type>oid</type>)</term>
+ <listitem>
+ <para>
+ The OID of the tablespace, or null if it's the base
+ directory.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>spclocation</literal> (<type>text</type>)</term>
+ <listitem>
+ <para>
+ The full path of the tablespace directory, or null
+ if it's the base directory.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>size</literal> (<type>int8</type>)</term>
+ <listitem>
+ <para>
+ The approximate size of the tablespace, in kilobytes (1024 bytes),
+ if a progress report has been requested; otherwise it's null.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ The final result set contains a single row with two columns. The
+ first column contains the contents of the <filename>backup_label</filename> file,
+ and the second column contains the contents of <filename>tablespace_map</filename>.
+ </para>
+
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>STOP_BACKUP</literal>
+ [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ]
+ [ <literal>WAL</literal> ]
+ [ <literal>NOWAIT</literal> ]
+
+ <indexterm><primary>STOP_BACKUP</primary></indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Instructs the server to finish performing an on-line backup. The following
+ options are accepted:
+ <variablelist>
+ <varlistentry>
+ <term><literal>LABEL</literal> <replaceable>'label'</replaceable></term>
+ <listitem>
+ <para>
+ Provides the contents of the <filename>backup_label</filename> file to the backup. The contents
+ are the same as those returned by <command>START_BACKUP</command>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>WAL</literal></term>
+ <listitem>
+ <para>
+ Include the necessary WAL segments in the backup. This will include
+ all the files between start and stop backup in the
+ <filename>pg_wal</filename> directory of the base directory tar
+ file.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>NOWAIT</literal></term>
+ <listitem>
+ <para>
+ By default, the backup will wait until the last required WAL
+ segment has been archived, or emit a warning if log archiving is
+ not enabled. Specifying <literal>NOWAIT</literal> disables both
+ the waiting and the warning, leaving the client responsible for
+ ensuring the required log is available.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ In response to this command, the server will send one or more CopyResponse
+ results followed by a single result set containing the WAL end position of
+ the backup. The CopyResponse results contain <filename>pg_control</filename> and
+ the WAL files, if <command>STOP_BACKUP</command> is run with the <literal>WAL</literal> option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>SEND_FILE_LIST</literal>
+ <indexterm><primary>SEND_FILE_LIST</primary></indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Instructs the server to return a list of the files and directories available in
+ the data directory. In response to this command, the server will send one result set
+ per tablespace. Each result set consists of the following fields:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>path</literal> (<type>text</type>)</term>
+ <listitem>
+ <para>
+ The path and name of the file. For a tablespace, it is an absolute
+ path on the database server; for the <filename>base</filename>
+ tablespace, it is relative to $PGDATA.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>type</literal> (<type>char</type>)</term>
+ <listitem>
+ <para>
+ A single character, identifying the type of file.
+ <itemizedlist spacing="compact" mark="bullet">
+ <listitem>
+ <para>
+ <literal>'f'</literal> - Regular file. Can be any relation or
+ non-relation file in $PGDATA.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>'d'</literal> - Directory.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>'l'</literal> - Symbolic link.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>size</literal> (<type>int8</type>)</term>
+ <listitem>
+ <para>
+ The approximate size of the file, in kilobytes (1024 bytes). It's null if
+ type is 'd' or 'l'.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>mtime</literal> (<type>int8</type>)</term>
+ <listitem>
+ <para>
+ The last modification time of the file or directory, as seconds since the Epoch.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ This list will contain all files and directories in $PGDATA, regardless of
+ whether they are PostgreSQL files or other files added to the same directory.
+ The only excluded files are:
+ <itemizedlist spacing="compact" mark="bullet">
+ <listitem>
+ <para>
+ <filename>postmaster.pid</filename>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <filename>postmaster.opts</filename>
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <filename>pg_internal.init</filename> (found in multiple directories)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Various temporary files and directories created during the operation
+ of the PostgreSQL server, such as any file or directory beginning
+ with <filename>pgsql_tmp</filename> and temporary relations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Unlogged relations, except for the init fork which is required to
+ recreate the (empty) unlogged relation on recovery.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <filename>pg_wal</filename>, including subdirectories. If the backup is run
+ with WAL files included, a synthesized version of <filename>pg_wal</filename> will be
+ included, but it will only contain the files necessary for the
+ backup to work, not the rest of the contents.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <filename>pg_dynshmem</filename>, <filename>pg_notify</filename>,
+ <filename>pg_replslot</filename>, <filename>pg_serial</filename>,
+ <filename>pg_snapshots</filename>, <filename>pg_stat_tmp</filename>, and
+ <filename>pg_subtrans</filename> are copied as empty directories (even if
+ they are symbolic links).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Files other than regular files and directories, such as symbolic
+ links (other than for the directories listed above) and special
+ device files, are skipped. (Symbolic links
+ in <filename>pg_tblspc</filename> are maintained.)
+ </para>
+ </listitem>
+ </itemizedlist>
+ Owner, group, and file mode are set if the underlying file system on the server
+ supports it.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>SEND_FILES ( <replaceable class="parameter">'FILE'</replaceable> [, ...] )</literal>
+ [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ]
+ [ <literal>NOVERIFY_CHECKSUMS</literal> ]
+ [ <literal>START_WAL_LOCATION</literal> ]
+
+ <indexterm><primary>SEND_FILES</primary></indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ Instructs the server to send the contents of the requested FILE(s).
+ </para>
+
+ <para>
+ A clause of the form <literal>SEND_FILES ( 'FILE', 'FILE', ... ) [OPTIONS]</literal>
+ is accepted where one or more FILE(s) can be requested.
+ </para>
+
+ <para>
+ In response to this command, one or more CopyResponse results will be sent,
+ one for each FILE requested. The data in the CopyResponse results will be
+ a tar-format (following the “ustar interchange format” specified in the
+ POSIX 1003.1-2008 standard) dump of the tablespace contents, except that
+ the two trailing blocks of zeroes specified in the standard are omitted.
+ </para>
+
+ <para>
+ The following options are accepted:
+ <variablelist>
+ <varlistentry>
+ <term><literal>MAX_RATE</literal> <replaceable>rate</replaceable></term>
+ <listitem>
+ <para>
+ Limit (throttle) the maximum amount of data transferred from server
+ to client per unit of time. The expected unit is kilobytes per second.
+ If this option is specified, the value must either be equal to zero
+ or it must fall within the range from 32 kB through 1 GB (inclusive).
+ If zero is passed or the option is not specified, no restriction is
+ imposed on the transfer.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>NOVERIFY_CHECKSUMS</literal></term>
+ <listitem>
+ <para>
+ By default, checksums are verified during a base backup if they are
+ enabled. Specifying <literal>NOVERIFY_CHECKSUMS</literal> disables
+ this verification.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>START_WAL_LOCATION</literal></term>
+ <listitem>
+ <para>
+ The starting WAL position that was returned by the
+ <command>START_BACKUP</command> command, given in XLogRecPtr format.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index fc9e222f8d..339e68bda7 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -536,6 +536,26 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><option>-j <replaceable class="parameter">n</replaceable></option></term>
+ <term><option>--jobs=<replaceable class="parameter">n</replaceable></option></term>
+ <listitem>
+ <para>
+ Create <replaceable class="parameter">n</replaceable> threads to copy
+ backup files from the database server. <application>pg_basebackup</application>
+ will open <replaceable class="parameter">n</replaceable> + 1 connections
+ to the database. Therefore, the server must be configured with
+ <xref linkend="guc-max-wal-senders"/> set high enough to accommodate all
+ connections.
+ </para>
+
+ <para>
+ Parallel mode only works with the plain format.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</para>
--
2.21.0 (Apple Git-122.2)