Thread

  1. autovacuum launcher crash: assert in pgstat_count_io_op (IOOP_EXTEND on pg_database's VM)

    Ewan Young <kdbase.hack@gmail.com> — 2026-05-31T04:36:45Z

    Hi hackers,
    
    I was stress-testing master (commit e2b35735b00, assertions enabled) with a
    workload that does a lot of DDL/DML, including creating and dropping
    databases in a tight loop, and the autovacuum launcher kept crashing on
    me -- every 15-40 minutes or so once it was under load:
    
      TRAP: failed Assert("pgstat_tracks_io_op(MyBackendType, io_object,
            io_context, io_op)"), File: "pgstat_io.c", Line: 74
      LOG:  autovacuum launcher process (PID ...) was terminated by signal 6:
            Aborted
    
    The postmaster recovers fine, but it just starts another launcher that hits
    the exact same assert, so it never really gets out of the loop.
    
    The short version: the launcher is in get_database_list(), doing its seqscan
    of pg_database, and on-access pruning kicks in during the scan. Since
    b46e1e54d07 ("Allow on-access pruning to set pages all-visible"),
    heap_page_prune_opt() pins the visibility map unconditionally once it
    decides to prune -- before it ever checks rel_read_only.
    visibilitymap_pin() isn't read-only though: if the VM page isn't there yet
    it extends the fork, and pg_database has no VM fork, so we end up doing an
    actual relation extend (IOOP_EXTEND) from the launcher.
    pgstat_tracks_io_op() says the launcher must never do an EXTEND, hence the
    assertion.
    
    What surprised me is that the launcher's catalog scan isn't even flagged
    read-only (table_beginscan_catalog doesn't set SO_HINT_REL_READ_ONLY),
    so it never actually intends to set the VM -- it just pins/extends it
    anyway.
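    
    To make the ordering concrete, here is a toy model of the path described
    above. It is not PostgreSQL code: every toy_* / Toy* name is made up for
    illustration, and the "check first" variant is only there for contrast,
    not as a proposed patch. It assumes the launcher is the only backend type
    that is barred from extending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the real PostgreSQL definitions. */
typedef enum { TOY_REGULAR_BACKEND, TOY_AUTOVAC_LAUNCHER } ToyBackendType;

typedef struct
{
    int vm_nblocks;   /* current length of the toy VM fork, in blocks */
    int extends_done; /* number of times the toy VM fork was extended */
} ToyRelation;

/* Models the rule from the report: the launcher must never do an extend. */
bool
toy_tracks_extend(ToyBackendType bt)
{
    return bt != TOY_AUTOVAC_LAUNCHER;
}

/*
 * Models pinning VM block 0: if the fork is empty it has to be extended
 * first. Returns false where an assert-enabled build would TRAP.
 */
bool
toy_vm_pin(ToyRelation *rel, ToyBackendType bt)
{
    assert(rel != NULL);
    if (rel->vm_nblocks == 0)
    {
        if (!toy_tracks_extend(bt))
            return false;       /* extend attempted by a barred backend */
        rel->vm_nblocks = 1;
        rel->extends_done++;
    }
    return true;
}

/* Ordering as described in the report: pin first, rel_read_only later. */
bool
toy_prune_pin_first(ToyRelation *rel, ToyBackendType bt, bool rel_read_only)
{
    if (!toy_vm_pin(rel, bt))
        return false;
    (void) rel_read_only;       /* only consulted after the pin */
    return true;
}

/* Contrast (hypothetical): consult rel_read_only before touching the VM. */
bool
toy_prune_check_first(ToyRelation *rel, ToyBackendType bt, bool rel_read_only)
{
    if (rel_read_only && !toy_vm_pin(rel, bt))
        return false;
    return true;
}
```

    With an empty toy VM fork and rel_read_only = false,
    toy_prune_pin_first() fails for TOY_AUTOVAC_LAUNCHER, while
    toy_prune_check_first() never reaches the pin and so never extends.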
    
    Here are the relevant frames:
      #3  ExceptionalCondition ("pgstat_tracks_io_op(...)", "pgstat_io.c", 74)
              at assert.c:65
      #4  pgstat_count_io_op (io_object=IOOBJECT_RELATION,
              io_context=IOCONTEXT_NORMAL, io_op=IOOP_EXTEND, cnt=1, bytes=8192)
              at pgstat_io.c:74
      #5  pgstat_count_io_op_time (...) at pgstat_io.c:160
              at bufmgr.c:3030
      #7  ExtendBufferedRelCommon (... fork=VISIBILITYMAP_FORKNUM ...)
              at bufmgr.c:2774
      #8  ExtendBufferedRelTo (... fork=VISIBILITYMAP_FORKNUM, extend_to=1 ...)
              at bufmgr.c:1099
      #9  vm_extend (vm_nblocks=1, ...) at visibilitymap.c:614
      #10 vm_readbuf (blkno=0, extend=true) at visibilitymap.c:572
      #11 visibilitymap_pin (...) at visibilitymap.c:216
      #12 heap_page_prune_opt (..., rel_read_only=...) at pruneheap.c:339
      #13 heap_prepare_pagescan (...) at heapam.c:638
      #14 heapgettup_pagemode (... ForwardScanDirection ...) at heapam.c:1113
      #15 heap_getnext (...) at heapam.c:1454
      #16 get_database_list () at autovacuum.c:1856
      #17 do_start_worker () at autovacuum.c:1172
      #19 launch_worker (...) at autovacuum.c:1355
      #20 AutoVacLauncherMain (...) at autovacuum.c:780
      #21 postmaster_child_launch (child_type=B_AUTOVAC_LAUNCHER, ...)
              at launch_backend.c:268
      #22 StartChildProcess (type=B_AUTOVAC_LAUNCHER) at postmaster.c:4030
      #23 LaunchMissingBackgroundProcesses () at postmaster.c:3375
      #24 ServerLoop () at postmaster.c:1743
      #25 PostmasterMain (...) at postmaster.c:1415
      #26 main (...) at main.c:231
    
    I haven't been able to boil this down to a clean standalone repro yet -- it
    seems to need the launcher to hit get_database_list() at the moment a
    pg_database page is prunable and the VM fork still has to grow -- but the
    path looks pretty clear from the stack.
    
    Regards,
    Ewan