Re: Requiring recovery.signal or standby.signal when recovering with a backup_label
David Zhang <david.zhang@highgo.ca>
From: David Zhang <david.zhang@highgo.ca>
To: Michael Paquier <michael@paquier.xyz>
Cc: Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2023-07-19T18:21:17Z
Lists: pgsql-hackers
Commits
- Delay recovery mode LOG after reading backup_label and/or checkpoint record (dc5bd3889437, 17.0, landed)
- Mention standby.signal in FATALs for checkpoint record missing at recovery (1ffdc03c21ae, 17.0, landed)
- XLOG file archiving and point-in-time recovery. There are still some (66ec2db72840, 8.0.0, cited)

On 2023-07-16 6:27 p.m., Michael Paquier wrote:
>
> Delete a backup_label from a fresh base backup can easily lead to data
> corruption, as the startup process would pick up as LSN to start
> recovery from the control file rather than the backup_label file.
> This would happen if a checkpoint updates the redo LSN in the control
> file while a backup happens and the control file is copied after the
> checkpoint, for instance. If one wishes to deploy a new primary from
> a base backup, recovery.signal is the way to go, making sure that the
> new primary is bumped into a new timeline once recovery finishes, on
> top of making sure that the startup process starts recovery from a
> position where the cluster would be able to achieve a consistent
> state.

Thanks a lot for sharing this information.

>
> How would you rewrite that? I am not sure how many details we want to
> put here in terms of differences between recovery.signal and
> standby.signal, still we surely should mention these are the two
> possible choices.

Honestly, I can't convince myself to mention the backup_label here too.
But, I can share some information regarding my testing of the patch and
the corresponding results.

To assess the impact of the patch, I executed the following commands for
before and after,

    pg_basebackup -h localhost -p 5432 -U david -D pg_backup1
    pg_ctl -D pg_backup1 -l /tmp/logfile start

Before the patch, there were no issues encountered when starting an
independent Primary server.

However, after applying the patch, I observed the following behavior
when starting from the base backup:

1) simply start server from a base backup

    FATAL: could not find recovery.signal or standby.signal when recovering with backup_label
    HINT: If you are restoring from a backup, touch "/media/david/disk1/pg_backup1/recovery.signal" or "/media/david/disk1/pg_backup1/standby.signal" and add required recovery options.

2) touch a recovery.signal file and then try to start the server, the
following error was encountered:

    FATAL: must specify restore_command when standby mode is not enabled

3) touch a standby.signal file, then the server successfully started,
however, it operates in standby mode, whereas the intended behavior was
for it to function as a primary server.

Best regards,
David
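
The three outcomes reported above can be read as a small decision table. The following Python sketch is only a toy model of that reported behavior with the patch applied. It is not PostgreSQL's startup code. The names `DataDir` and `startup_outcome` are hypothetical, the "normal startup" branch is an assumption for the case with no backup_label, and the recovery.signal success branch paraphrases the quoted description rather than a tested result.

```python
from dataclasses import dataclass


@dataclass
class DataDir:
    """Hypothetical summary of what is present in a data directory."""
    backup_label: bool = False
    recovery_signal: bool = False
    standby_signal: bool = False
    restore_command: str = ""


def startup_outcome(d: DataDir) -> str:
    """Model the patched startup behavior as reported in this message."""
    if d.standby_signal:
        # Case 3: the server starts, but it runs as a standby.
        return "started in standby mode"
    if d.recovery_signal:
        if not d.restore_command:
            # Case 2: recovery.signal alone is not enough.
            return ("FATAL: must specify restore_command when standby "
                    "mode is not enabled")
        # Per the quoted text: recover, then move to a new timeline.
        return "archive recovery, then primary on a new timeline"
    if d.backup_label:
        # Case 1: backup_label without any signal file is rejected.
        return ("FATAL: could not find recovery.signal or standby.signal "
                "when recovering with backup_label")
    # Assumption: no backup_label and no signal file means a plain start.
    return "normal startup"


if __name__ == "__main__":
    for case in (
        DataDir(backup_label=True),
        DataDir(backup_label=True, recovery_signal=True),
        DataDir(backup_label=True, standby_signal=True),
    ):
        print(case, "->", startup_outcome(case))
```

In this model, none of the three tested combinations yields an independent primary directly from the base backup, which is the gap the message points out.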