Thread

  1. Re: pg_waldump: support decoding of WAL inside tarfile

    Jakub Wartak <jakub.wartak@enterprisedb.com> — 2025-11-19T08:20:14Z

    On Mon, Nov 17, 2025 at 5:51 AM Amul Sul <sulamul@gmail.com> wrote:
    >
    > On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <sulamul@gmail.com> wrote:
    > >
    > > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <robertmhaas@gmail.com> wrote:
    > > >
    > > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <sulamul@gmail.com> wrote:
    > > > [....]
    > > Kindly have a look at the attached version. Thank you !
    > >
    >
    > Attached is the rebased version against the latest master head (e76defbcf09).
    
    Hi Amul, thanks for working on this. I haven't looked at the source
    code in depth (I trust Robert's eyes much more than mine on this
    one); I only skimmed it a little:
    
    1. As stated earlier, get_tmp_walseg_path() is still vulnerable (it
    uses a predictable path in $TMPDIR that an attacker could exploit).
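
    For illustration only, here is a minimal sketch of the usual way to
    avoid predictable temporary paths: create a private directory with a
    random name, then create each file inside it with O_EXCL so that a
    pre-planted file or symlink makes the open fail. This is Python, not
    the patch's C code, and create_private_spool_dir(),
    create_segment_file() and remove_spool_dir() are hypothetical names.
    In C the same idea is normally expressed with mkdtemp(3) or mkstemp(3).

```python
import os
import shutil
import tempfile


def create_private_spool_dir(prefix="waldump_"):
    """Create a new directory with an unpredictable name.

    tempfile.mkdtemp() picks a random suffix and creates the directory
    so that only the creating user can access it.
    """
    return tempfile.mkdtemp(prefix=prefix)


def create_segment_file(spool_dir, segment_name):
    """Create a file for one WAL segment inside the private directory.

    O_EXCL makes the call fail if the path already exists, so an
    existing file or symlink is never silently reused.
    Returns (path, file descriptor).
    """
    path = os.path.join(spool_dir, segment_name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    return path, fd


def remove_spool_dir(spool_dir):
    """Remove the private directory and everything in it."""
    shutil.rmtree(spool_dir)
```

    A caller would create the directory once, create one file per
    extracted segment, and remove the directory on exit.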
    
    2. On the usability front:
    
    a. If you run `pg_waldump --path pg_wal.tar -s 0/31000000`, it dumps
    a lot of WAL records and then ends with:
    pg_waldump: error: could not find file "000000010000000000000034" in archive
    
    However, with `pg_waldump --path pg_wal.tar -s 0/31000000
    --stats=record` (not passing '-e'), it simply bails out with the
    following error, without printing any stats:
    pg_waldump: error: could not find file "000000010000000000000034" in archive
    
    IMHO, it could still print the stats if it managed to read at least
    one WAL record.
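
    The suggested behavior could be sketched as follows: keep counting
    while records arrive, remember the error instead of exiting
    immediately, then print whatever was collected before reporting the
    failure. This is a hypothetical Python illustration, not pg_waldump's
    actual control flow; MissingSegmentError, collect_stats() and
    report() are made-up names.

```python
from collections import Counter


class MissingSegmentError(Exception):
    """Hypothetical error raised when a WAL segment is not in the archive."""


def collect_stats(records):
    """Count records per type until the input ends or a segment is missing.

    `records` is any iterable of record-type names. Returns (stats, error),
    where error is None on a clean end of input.
    """
    stats = Counter()
    error = None
    try:
        for record_type in records:
            stats[record_type] += 1
    except MissingSegmentError as exc:
        error = exc
    return stats, error


def report(stats, error):
    """Print partial stats if any record was read, then return an exit code."""
    if sum(stats.values()) > 0:
        for record_type, count in sorted(stats.items()):
            print(f"{record_type}: {count}")
    if error is not None:
        print(f"error: {error}")
        return 1
    return 0
```

    With this shape the exit code still signals the failure, so callers
    such as pg_verifybackup would not be affected by the extra output.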
    
    3. The most critical issue for me was the initial lack of error
    pass-through from pg_waldump (when used with WAL files in a tar) to
    pg_verifybackup. Now it works fine, so thanks for this:
    
    a. pg_waldump now detects missing WAL files in the requested range
    and returns a proper exit code (good):
    $ /usr/pgsql19/bin/pg_waldump --path pg_wal.tar -s 0/31005F70 -e 0/343D2650 -q
    pg_waldump: error: could not find file "000000010000000000000034" in archive
    $ echo $?
    1
    $
    
    b. pg_verifybackup now also complains properly about a WAL file missing from the tar:
    
    $ tar --delete -f pg_wal.tar 000000010000000000000032 # simulate loss of file
    $ tar -tf pg_wal.tar
    000000010000000000000031
    archive_status/000000010000000000000031.done
    archive_status/000000010000000000000032.done
    000000010000000000000033
    $ grep Start-LSN backup_manifest
    { "Timeline": 1, "Start-LSN": "0/31005F70", "End-LSN": "0/333D2650" }
    $ /usr/pgsql19/bin/pg_verifybackup -P /tmp/basebackup/
    791372/791372 kB (100%) verified
    pg_waldump: error: could not find file "000000010000000000000032" in archive
    pg_verifybackup: error: WAL parsing failed for timeline 1
    $ echo $?
    1
    $
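
    As a side note, the segment names in the error messages above follow
    directly from the LSNs. The sketch below assumes the default 16 MB
    WAL segment size (not stated in this thread, but consistent with the
    file names shown); parse_lsn(), wal_file_name(), segments_needed()
    and missing_segments() are hypothetical helper names, written in
    Python for illustration.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default segment size


def parse_lsn(text):
    """Convert an LSN such as '0/31005F70' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(timeline, segno, segment_size=WAL_SEGMENT_SIZE):
    """Build the 24-hex-digit WAL file name for a segment number."""
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


def segments_needed(timeline, start, end, segment_size=WAL_SEGMENT_SIZE):
    """List the WAL files covering the LSN range [start, end)."""
    first = parse_lsn(start) // segment_size
    last = (parse_lsn(end) - 1) // segment_size
    return [wal_file_name(timeline, n, segment_size) for n in range(first, last + 1)]


def missing_segments(needed, archive_members):
    """Return the needed WAL files that are absent from a tar listing."""
    present = set(archive_members)
    return [name for name in needed if name not in present]
```

    For the manifest range 0/31005F70 to 0/333D2650 on timeline 1 this
    yields segments ...31, ...32 and ...33, and comparing that with the
    `tar -tf` listing above leaves 000000010000000000000032 as the
    missing file. Likewise, an end LSN of 0/343D2650 falls into segment
    000000010000000000000034.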
    
    -J.