Thread

  1. Enforcing that all WAL has been replayed after restoring from backup

    Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> — 2011-08-09T09:00:00Z

    Currently, if you take a backup with "pg_basebackup -x" (which means it 
    will include all the WAL required to restore within the backup dump), 
    and hit Ctrl-C while the WAL is being streamed, you end up with a data 
    directory that you can start postmaster from, even though the backup was 
    not complete. So what appears to be a valid backup - it starts up fine - 
    can actually be corrupt.
    
    I put in a check against that back in March, but it had to be reverted 
    because it broke crash recovery when the system crashed while a 
    pg_start_backup() based backup was in progress:
    
    http://archives.postgresql.org/message-id/4DA58686.1050501@enterprisedb.com
    
    Here's a patch to add it back in a more fine-grained fashion. The patch 
    adds an extra line to backup_label, indicating whether the backup was 
    taken with pg_start/stop_backup(), or by streaming (= pg_basebackup). 
    For a backup taken with pg_start_backup(), the behavior is kept the same 
    as it has been - if the end-of-backup record is not reached during crash 
    recovery, the database starts up anyway. But for a streamed backup, you 
    get an error at startup.
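    
    A rough sketch of that decision, not the patch itself: the label line 
    format ("BACKUP METHOD: ..."), the enum, and both function names are 
    assumptions made up for illustration. It only shows how the backup 
    method read from backup_label could gate startup when the 
    end-of-backup record was never reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* All names here are hypothetical; none are taken from the patch. */
typedef enum
{
    BACKUP_METHOD_UNKNOWN,          /* line missing or unrecognized */
    BACKUP_METHOD_PG_START_BACKUP,  /* pg_start/stop_backup() */
    BACKUP_METHOD_STREAMED          /* pg_basebackup */
} BackupMethod;

/* Parse an assumed "BACKUP METHOD: <word>" line from backup_label. */
static BackupMethod
parse_backup_method(const char *line)
{
    char word[32];

    assert(line != NULL);
    if (sscanf(line, "BACKUP METHOD: %31s", word) != 1)
        return BACKUP_METHOD_UNKNOWN;
    if (strcmp(word, "streamed") == 0)
        return BACKUP_METHOD_STREAMED;
    if (strcmp(word, "pg_start_backup") == 0)
        return BACKUP_METHOD_PG_START_BACKUP;
    return BACKUP_METHOD_UNKNOWN;
}

/*
 * Decide whether startup may proceed once WAL replay has finished.
 * Only a streamed backup that never reached its end-of-backup record
 * is refused; an unknown method is treated leniently, like
 * pg_start_backup(), so the old behavior is kept.
 */
static bool
startup_allowed(BackupMethod method, bool reached_end_of_backup)
{
    if (reached_end_of_backup)
        return true;
    return method != BACKUP_METHOD_STREAMED;
}
```

    With these definitions, startup_allowed(BACKUP_METHOD_STREAMED, false) 
    is the only combination that refuses to start.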
    
    I think this is a nice additional safeguard to have, making streamed 
    backups more robust. I'd like to add this to 9.1, but it requires an 
    extra field to be added to the control file, so it would force an 
    initdb. It's probably not worth that. Or, we could sneak the extra 
    boolean field into some currently unused pad space in the ControlFile 
    struct.
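    
    To illustrate the pad-space idea only: the two structs below are 
    hypothetical stand-ins, not the real control file layout, and the 
    field names are invented. The sketch shows a boolean placed in 
    alignment padding without changing the struct's size or the offsets 
    of the existing fields. Whether old binaries left that padding zeroed, 
    so the new field reads as false, is a separate assumption that would 
    need checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the existing layout. */
typedef struct
{
    uint32_t state;
    /* 4 bytes of padding here: the next field is 8-byte aligned */
    _Alignas(8) uint64_t checkpoint;
} OldLayout;

/* Same layout with a boolean tucked into the padding. */
typedef struct
{
    uint32_t state;
    bool     backup_end_required;   /* hypothetical new field */
    _Alignas(8) uint64_t checkpoint;
} NewLayout;

/* Compile-time check that the overall size did not change. */
static_assert(sizeof(OldLayout) == sizeof(NewLayout),
              "new field must fit in existing padding");

/* Runtime form of the same check, also comparing field offsets. */
static bool
layout_compatible(void)
{
    return sizeof(OldLayout) == sizeof(NewLayout)
        && offsetof(OldLayout, state) == offsetof(NewLayout, state)
        && offsetof(OldLayout, checkpoint) == offsetof(NewLayout, checkpoint);
}
```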
    
    -- 
       Heikki Linnakangas
       EnterpriseDB   http://www.enterprisedb.com