Re: [PATCH] Fix pg_rewind false positives caused by shutdown-only WAL
From: BharatDB <bharatdbpg@gmail.com>
To: Robert Haas <robertmhaas@gmail.com>
Cc: Srinath Reddy Sadipiralla <srinath2133@gmail.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-10-22T12:59:25Z
Lists: pgsql-hackers
Attachments
- 0001-pg_rewind-Skip-false-positive-rewind-on-benign-shutd.patch (text/x-patch)

On Wed, Oct 1, 2025 at 6:10 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Sep 30, 2025 at 1:24 PM Srinath Reddy Sadipiralla
> <srinath2133@gmail.com> wrote:
> > Can you please once confirm this, did you mean that this is not even an
> > actual problem to fix or only this patch's logic which I provided does not
> > make sense?, because i am trying out come up with another patch based on
> > your inputs regarding considering controlfile changes , ignoring
> > RUNNING_XACTS records, and to use XLogRecGetRmid test.
>
> Well, the patch's idea is that we can ignore certain WAL records when
> deciding whether pg_rewind is needed. But I do not think we can do
> that, because (1) those WAL records might do important things like
> update the control file and (2) the server will not be OK with
> ignoring those WAL records even if pg_rewind decides that they are not
> important. If you have a plan for working around those two issues,
> please say what your plan is. I don't personally see how it would be
> possible to work around those issues, but of course somebody else
> might have a good idea that has not occurred to me.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com

Hi all,

Following up on the previous mail, I'd like to submit a small patch for
pg_rewind that addresses false-positive rewinds.

The patch covers the following:

- Control file changes: detects benign shutdown checkpoint differences in
  the control file and prevents unnecessary rewinds.
- Other WAL records (RUNNING_XACTS): ensures that only meaningful WAL
  differences trigger a rewind, ignoring records that do not affect server
  consistency.
- Server-side consistency: maintains data integrity while skipping rewinds
  for harmless control file changes.
- RMID verification: examines WAL records by their resource manager ID
  (RMID) to avoid misinterpreting benign differences.

I added a simple check in pg_rewind to detect these benign control file
differences.

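
To make the RMID-based check a bit more concrete, here is a minimal Python sketch of the classification idea. It is only an illustration and not the code in the attached patch: the function names is_benign_record and only_benign_records are hypothetical, and the numeric constants reflect my reading of the PostgreSQL headers, so please treat them as assumptions and check them against the tree.

```python
# Illustrative constants. The values are assumed to match PostgreSQL's
# headers; verify them against the source tree before relying on them.
RM_XLOG_ID = 0                   # resource manager ID for XLOG records
RM_STANDBY_ID = 8                # resource manager ID for standby records
XLOG_CHECKPOINT_SHUTDOWN = 0x00  # info code of a shutdown checkpoint
XLOG_RUNNING_XACTS = 0x10        # info code of a RUNNING_XACTS record
XLR_INFO_MASK = 0x0F             # low info bits reserved for WAL machinery


def is_benign_record(rmid, info):
    """Hypothetical helper: True for a shutdown checkpoint or RUNNING_XACTS."""
    # Drop the low bits so only the resource-manager-specific code remains.
    rm_info = info & ~XLR_INFO_MASK & 0xFF
    if rmid == RM_XLOG_ID:
        return rm_info == XLOG_CHECKPOINT_SHUTDOWN
    if rmid == RM_STANDBY_ID:
        return rm_info == XLOG_RUNNING_XACTS
    # Any other resource manager is treated as a meaningful change.
    return False


def only_benign_records(records):
    """Hypothetical helper: True if every (rmid, info) pair is benign."""
    return all(is_benign_record(rmid, info) for rmid, info in records)
```

With these definitions, a list containing just a shutdown checkpoint and a RUNNING_XACTS record is classified as benign, while adding a record from any other resource manager makes the check fail.
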
When such a difference is detected, pg_rewind skips the rewind and prints a
log message, so no unnecessary changes are applied. This is implemented in
the function control_diff_is_benign(), which is integrated into the
last-checkpoint detection logic.

I tested the logic with a small test script that automatically:

- initializes a primary cluster and inserts some test data,
- creates a standby using pg_basebackup,
- promotes the standby to primary,
- injects a benign control file change using pg_resetwal,
- runs pg_rewind and verifies that no rewind happens, and
- confirms that the data remains consistent between the clusters.

After testing, I got the following output (test_pg_rewind_fix.sh):

=== 🧮 Test Setup ===
PRIMARY PORT: 50584
STANDBY PORT: 51636
BASE DIR: /home/deepshikha/pg_rewind_test
Cleaning workspace...
Initializing primary cluster...
initdb: warning: enabling "trust" authentication for local connections
initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb.
waiting for server to start.... done
server started
Old primary WAL position: 0/01BA4B80
Creating standby cluster via pg_basebackup...
31374/31374 kB (100%), 1/1 tablespace
waiting for server to start...... done
server started
waiting for server to promote.... done
server promoted
New promoted primary WAL position: 0/03000130
Stopping old primary...
waiting for server to shut down.... done
server stopped
=== WAL Summary ===
Old primary WAL: 0/01BA4B80
New primary WAL: 0/03000130
Injecting benign control file change (pg_resetwal)...
Benign difference introduced.
Running pg_rewind to test fix...
--- pg_rewind output ---
pg_rewind: servers diverged at WAL location 0/03000000 on timeline 1
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000003": No such file or directory
pg_rewind: benign shutdown checkpoint difference detected, skipping rewind
pg_rewind: rewinding from last common checkpoint at 0/000006F0 on timeline 2796767292
pg_rewind: error: could not open file "/home/deepshikha/pg_rewind_test/primary/pg_wal/000000010000000000000000": No such file or directory
pg_rewind: error: could not read WAL record at 0/000006F0
------------------------
waiting for server to start.... done
server started
=== Data check ===
Primary data:
 id |  data
----+---------
  1 | primary
  2 | wal1
  3 | wal2
(3 rows)

New primary data:
 id |  data
----+---------
  1 | primary
  2 | wal1
  3 | wal2
(3 rows)

PASS: Benign control file difference correctly detected. No false-positive rewind.
=== Test completed ===

Note the final PASS line. This confirms that the patch works as expected,
preventing unnecessary rewinds when the only difference between the old
primary and the new primary is a benign shutdown checkpoint change.

I kindly request a review of the patch; please let me know if any
additional details are needed.

Thank you.

Regards,
Soumya