Filename: v2-0001-Skip-WAL-for-unlogged-relations-during-online-checks.patch
Type: application/octet-stream
Part: 0
From 7f5addd7b93132dbb4d8a9a0e45a961df4470806 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Tue, 5 May 2026 14:54:23 +0200
Subject: [PATCH v3 1/3] Skip WAL for unlogged relations during online checksum
 enable

ProcessSingleRelationFork() unconditionally generated a full-page
image (FPI) WAL record for every page of every relation when
enabling checksums. Unlogged relations, which by definition never
generate WAL for data changes, were not exempt, so unnecessary WAL
was emitted for them.

Fix by guarding the FPI WAL record call with RelationNeedsWAL()
to avoid emitting WAL for the non-init forks of unlogged relations.
Unlogged pages are still dirtied to ensure the checksum is written
to disk at the next checkpoint.

The init fork remains WAL-logged even for unlogged relations,
because that is what the standby uses to materialize the relation
after promotion (see ResetUnloggedRelations()). Skipping init-fork
WAL would leave the standby with a stale init fork that, once
copied to the main fork on promotion, would fail checksum
verification on every read of the unlogged relation.

Add a test which creates an unlogged table with an index, enables
checksums, promotes the standby, and verifies that the unlogged
relation and its indexes are still readable post-promotion.

Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAHg+QDeGrpZbNZdLjd_T4b43xKEEXZN0HGhkFm-1bkBdyzK7AQ@mail.gmail.com
---
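Note for reviewers (not part of the commit message): the guard added in
ProcessSingleRelationFork() can be read as a small truth table. The
following is a minimal standalone sketch of that decision, not the
backend code. All sketch_* names are hypothetical stand-ins, and the
model assumes RelationNeedsWAL() reduces to "the relation is permanent",
which ignores the wal_level=minimal cases the real macro also handles.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a relation's persistence (not PostgreSQL's type). */
typedef enum
{
	SKETCH_PERMANENT,
	SKETCH_UNLOGGED
} SketchPersistence;

/* Hypothetical stand-in for ForkNumber. */
typedef enum
{
	SKETCH_MAIN_FORK,
	SKETCH_FSM_FORK,
	SKETCH_VM_FORK,
	SKETCH_INIT_FORK
} SketchFork;

/* Simplified assumption: only permanent relations need WAL. */
static bool
sketch_relation_needs_wal(SketchPersistence persistence)
{
	return persistence == SKETCH_PERMANENT;
}

/*
 * Mirrors the shape of the patched condition: emit an FPI when the
 * relation needs WAL, or when the page belongs to an init fork.
 */
static bool
sketch_should_log_fpi(SketchPersistence persistence, SketchFork fork)
{
	return sketch_relation_needs_wal(persistence) || fork == SKETCH_INIT_FORK;
}
```

Under these assumptions, only the init fork of an unlogged relation
still produces an FPI; its main, FSM and VM forks are dirtied without
WAL, while permanent relations behave as before.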
src/backend/postmaster/datachecksum_state.c | 16 +++-
.../test_checksums/t/003_standby_restarts.pl | 65 ++++++++++++++++++-
2 files changed, 78 insertions(+), 3 deletions(-)
diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index c7c0593345..fb29662423 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -690,11 +690,23 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
* at one point in the past, so only when checksums are first on, then
* off, and then turned on again. TODO: investigate if this could be
* avoided if the checksum is calculated to be correct and wal_level
- * is set to "minimal",
+ * is set to "minimal".
+ *
+ * Unlogged relations don't need WAL since they are reset to their
+ * init fork on recovery. We still dirty the buffer so that the
+ * checksum is written to disk at the next checkpoint.
+ *
+ * The init fork is an exception: it is WAL-logged so the standby
+ * can materialize the relation after promotion (see
+ * ResetUnloggedRelations()). Skipping it here would leave the
+ * standby with a stale init fork that, once copied to the main
+ * fork on promotion, would fail checksum verification on every
+ * read.
*/
START_CRIT_SECTION();
MarkBufferDirty(buf);
- log_newpage_buffer(buf, false);
+ if (RelationNeedsWAL(reln) || forkNum == INIT_FORKNUM)
+ log_newpage_buffer(buf, false);
END_CRIT_SECTION();
UnlockReleaseBuffer(buf);
diff --git a/src/test/modules/test_checksums/t/003_standby_restarts.pl b/src/test/modules/test_checksums/t/003_standby_restarts.pl
index 11e15c9d73..26b68b98e6 100644
--- a/src/test/modules/test_checksums/t/003_standby_restarts.pl
+++ b/src/test/modules/test_checksums/t/003_standby_restarts.pl
@@ -115,6 +115,69 @@ $result =
$node_primary->safe_psql('postgres', "SELECT count(a) FROM t WHERE a > 1");
is($result, "19998", 'ensure we can safely read all data without checksums');
-$node_standby->stop;
+# ---------------------------------------------------------------------------
+# Test that enabling checksums on a cluster with unlogged relations does not
+# break those relations after a standby promotion. Unlogged relations only
+# WAL-log their init fork; that init fork is what ResetUnloggedRelations()
+# copies to the main fork on promotion. If checksum-enable on the primary
+# does not WAL-log init-fork rewrites, the standby keeps a stale init fork
+# and the post-promotion main fork fails verification on every read.
+#
+
+# Use a btree index so the init fork is non-trivial (one metapage)
+$node_primary->safe_psql(
+ 'postgres', q[
+ CREATE UNLOGGED TABLE unlogged_tbl (id int PRIMARY KEY,
+ payload text);
+ INSERT INTO unlogged_tbl
+ SELECT g, repeat('x', 100) FROM generate_series(1, 1000) g;
+ CREATE INDEX unlogged_tbl_payload_idx ON unlogged_tbl (payload);
+]);
+$node_primary->wait_for_catchup($node_standby, 'replay',
+ $node_primary->lsn('insert'));
+
+# Re-enable data checksums and wait for both nodes to converge to "on"
+enable_data_checksums($node_primary, wait => 'on');
+wait_for_checksum_state($node_standby, 'on');
+$node_primary->wait_for_catchup($node_standby, 'replay',
+ $node_primary->lsn('insert'));
+
+# Promote the standby and verify the unlogged relation is still usable.
+# Without the init-fork WAL fix, every read of the index would fail with
+# "page verification failed, calculated checksum X but expected 0".
$node_primary->stop;
+$node_standby->promote;
+
+$result =
+ $node_standby->safe_psql('postgres', 'SELECT count(*) FROM unlogged_tbl;');
+is($result, '0',
+ 'unlogged table readable on promoted standby (truncated as expected)');
+
+$node_standby->safe_psql('postgres',
+ "INSERT INTO unlogged_tbl SELECT g, repeat('y',100) FROM generate_series(1,100) g;"
+);
+$result = $node_standby->safe_psql('postgres',
+ 'SET enable_seqscan = off; SELECT id FROM unlogged_tbl WHERE id = 50;');
+is($result, '50', 'indexed lookup on promoted standby returns expected row');
+
+$node_standby->safe_psql('postgres', 'CREATE EXTENSION IF NOT EXISTS amcheck;');
+$node_standby->safe_psql('postgres',
+ "SELECT bt_index_check('unlogged_tbl_pkey'::regclass);");
+$node_standby->safe_psql('postgres',
+ "SELECT bt_index_check('unlogged_tbl_payload_idx'::regclass);");
+
+$node_standby->stop;
+
+# Final sanity sweep of the logs for any checksum failures
+my $log = PostgreSQL::Test::Utils::slurp_file($node_primary->logfile, 0);
+unlike(
+ $log,
+ qr/page verification failed,.+\d$/m,
+ "no checksum validation errors in primary log");
+$log = PostgreSQL::Test::Utils::slurp_file($node_standby->logfile, 0);
+unlike(
+ $log,
+ qr/page verification failed,.+\d$/m,
+ "no checksum validation errors in standby log");
+
done_testing();
--
2.39.3 (Apple Git-146)