Re: Timeline switching with partial WAL records can break replica recovery
Alena Vinter <dlaaren8@gmail.com>
From: D Laaren <dlaaren8@gmail.com>
To: pgsql-hackers@lists.postgresql.org
Date: 2025-06-17T11:59:14Z
Lists: pgsql-hackers
Attachments
- how_replicas_enter_indefinite_loop_1.jpg (image/jpeg)
- how_replicas_enter_indefinite_loop_2.jpg (image/jpeg)
I've done more research and identified that replicas enter an indefinite loop in the 'XLogReadPage' function. The loop works as follows:

0. Timeline N contains a partially written record with LSN = targetRecPtr;
1. In 'XLogReadPage' we attempt to read the next page, which has to contain the rest of the unfinished record;
2. In 'WaitForWALToBecomeAvailable' walrcv is requested to fetch records starting from LSN = targetRecPtr on timeline N + 1;
3. Walrcv retrieves data up to the end of the page containing the end of timeline N + 1;
4. Then, in 'WaitForWALToBecomeAvailable', the replica switches to the XLOG_FROM_ARCHIVE state, and the function returns true;
5. Execution continues in 'XLogReadPage';
6. The page at addr = targetPagePtr is checked for validity, but we get an 'invalid magic number' error because walrcv hasn't retrieved this page;
7. Execution jumps to the 'next_record_is_invalid' label;
8. Since we are in standby mode, the process retries from the beginning.

See the attachments for a more colorful illustration this time =)

From my point of view, the first solution, which I described in my previous message, still seems like a good choice.

I've also found the current solution in commit [1]. With all due respect, it seems to treat the symptom rather than the underlying issue.

[1] https://github.com/postgres/postgres/commit/6cf1647d87e7cd423d71525a8759b75c4e4a47ec
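
To make the control flow in steps 0-8 easier to follow, here is a small toy model of the retry loop. It is only a sketch under stated assumptions: the type 'Standby' and the functions 'wait_for_wal', 'page_is_valid' and 'read_attempts' are hypothetical names invented for illustration and are not PostgreSQL code; the 8192-byte page size and the retry cap are also assumptions (the real loop has no cap, which is the problem).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define PAGE_SIZE 8192u            /* assumed WAL page size */

typedef enum { FROM_STREAM, FROM_ARCHIVE } WalSource;

typedef struct {
    uint64_t  fetched_upto;        /* end of the WAL the standby holds locally */
    WalSource source;              /* where the standby will look next */
} Standby;

/* Steps 2-4: the receiver delivers data only up to fetch_limit, the source
 * flips to the archive, and the wait reports success anyway. */
static bool wait_for_wal(Standby *s, uint64_t fetch_limit)
{
    s->fetched_upto = fetch_limit;
    s->source = FROM_ARCHIVE;
    return true;
}

/* Step 6: a page that was never fully fetched fails the header check
 * (the 'invalid magic number' case). */
static bool page_is_valid(const Standby *s, uint64_t page_ptr)
{
    return page_ptr + PAGE_SIZE <= s->fetched_upto;
}

/* Steps 1-8: returns the attempt on which the target page was read, or -1
 * if max_retries attempts all failed. The cap exists only so that the
 * model terminates. */
static int read_attempts(uint64_t target_page_ptr, uint64_t fetch_limit,
                         int max_retries)
{
    Standby s = {0, FROM_STREAM};

    for (int attempt = 1; attempt <= max_retries; attempt++) {
        if (!wait_for_wal(&s, fetch_limit))
            return -1;
        if (page_is_valid(&s, target_page_ptr))
            return attempt;
        /* Steps 7-8: next_record_is_invalid; in standby mode we start over. */
        s.source = FROM_STREAM;
    }
    return -1;
}
```

In this model, read_attempts(PAGE_SIZE, PAGE_SIZE, 5) returns -1: every pass fetches the same range, the target page never becomes valid, and only the artificial cap stops the retries. read_attempts(PAGE_SIZE, 2 * PAGE_SIZE, 5) returns 1, which only shows that the loop ends once the target page is actually available; it does not represent any particular fix.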