Change checkpoint‑record‑missing PANIC to FATAL

Nitin Jadhav <nitinjadhavpostgres@gmail.com> — 2025-12-16T10:55:37Z
Hi,

While working on [1], we discussed whether the redo-record-missing error
should be a PANIC or a FATAL. We concluded that FATAL is more appropriate:
it achieves the intended behavior and is consistent with the backup_label
path, which already reports FATAL in the same scenario.

However, when the checkpoint record is missing, the behavior remains
inconsistent: without a backup_label, we currently raise a PANIC; with a
backup_label, the same code path reports a FATAL. Since we have already
changed the redo-record-missing case to FATAL in 15f68ce
<https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=15f68cebdcec>,
it seems reasonable to align the checkpoint-record-missing case as well.
The existing PANIC dates back to an era before online backups and archive
recovery existed, when external manipulation of WAL was not expected and
such conditions were treated as internal faults. Now that those features
exist, it is much more realistic for WAL segments to go missing due to operational
issues, and such cases are often recoverable. So switching this to FATAL
appears appropriate.
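
To make the inconsistency concrete, here is a rough standalone model of the
severity choice described above. It is only a sketch, not the actual
PostgreSQL code or the patch: the enum and both function names are made up
for illustration, and the real change would simply adjust the error level
passed to ereport().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two error levels under discussion. */
typedef enum
{
    SEV_FATAL,
    SEV_PANIC
} Severity;

/*
 * Current behavior when the checkpoint record is missing:
 * the level depends on whether a backup_label is present.
 */
static Severity
current_severity(bool have_backup_label)
{
    return have_backup_label ? SEV_FATAL : SEV_PANIC;
}

/*
 * Proposed behavior: report FATAL in both cases, matching what was
 * already done for the redo-record-missing case.
 */
static Severity
proposed_severity(bool have_backup_label)
{
    (void) have_backup_label;   /* no longer affects the level */
    return SEV_FATAL;
}
```

Only the no-backup_label case changes; the backup_label case already
reports FATAL today.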

Please share your thoughts.

I am happy to share a patch including a TAP test to cover this behavior
once we agree to proceed.

[1]:
https://www.postgresql.org/message-id/flat/CAMm1aWaaJi2w49c0RiaDBfhdCL6ztbr9m%3DdaGqiOuVdizYWYaA%40mail.gmail.com

Best Regards,
Nitin Jadhav
Azure Database for PostgreSQL
Microsoft