Thread

  1. Add mode column to pg_stat_progress_vacuum

    Shinya Kato <shinya11.kato@gmail.com> — 2025-08-14T11:12:55Z

    Hi hackers,
    
    I would like to propose a patch that enhances the
    pg_stat_progress_vacuum view by adding a mode column. The patch is
    attached.
    
    Although an anti-wraparound VACUUM can already be identified from the
    "(to prevent wraparound)" text in the process title or from specific
    log entries, it would be much more convenient for monitoring to have
    this status shown directly in the pg_stat_progress_vacuum view. This
    would let DBAs see the urgency of a vacuum at a glance, without
    checking logs or process titles separately.
    
    This patch introduces a mode column to provide this visibility. The
    possible values are:
    - normal: A standard, user-initiated VACUUM or a regular autovacuum run.
    - anti-wraparound: An autovacuum run launched specifically to prevent
      transaction ID wraparound.
    - failsafe: A vacuum that has entered failsafe mode to prevent
      imminent transaction ID wraparound.
    
    This will allow administrators to better understand the context and
    urgency of vacuum operations, which is crucial for monitoring and
    troubleshooting.
    
    Design Considerations:
    When defining the scope of the anti-wraparound mode, I considered
    including manual commands like VACUUM (FREEZE) or VACUUM
    (DISABLE_PAGE_SKIPPING). However, I decided against this to keep the
    meaning of the mode clear and simple. These options can be used for
    various purposes, and overloading the anti-wraparound mode with too
    many meanings could become confusing. Therefore, the current
    implementation limits this mode to autovacuum runs that are explicitly
    launched for wraparound prevention.
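    
    The rule above can be sketched as a small standalone C function. This
    is only an illustration, not the code in the attached patch: the enum,
    the function names, and the three boolean inputs are hypothetical, and
    letting failsafe take precedence over anti-wraparound is an assumption
    of this sketch.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not taken from the patch. */
typedef enum VacuumProgressMode
{
    VACUUM_MODE_NORMAL,
    VACUUM_MODE_ANTI_WRAPAROUND,
    VACUUM_MODE_FAILSAFE
} VacuumProgressMode;

/*
 * Pick the mode to report for a running vacuum.
 *
 * Assumption: failsafe wins over everything else, because it can be
 * entered by any vacuum once it is running. Anti-wraparound is reported
 * only for autovacuum runs launched for wraparound prevention, so a
 * manual VACUUM (FREEZE) stays "normal".
 */
static VacuumProgressMode
classify_vacuum_mode(bool is_autovacuum, bool is_wraparound,
                     bool failsafe_active)
{
    if (failsafe_active)
        return VACUUM_MODE_FAILSAFE;
    if (is_autovacuum && is_wraparound)
        return VACUUM_MODE_ANTI_WRAPAROUND;
    return VACUUM_MODE_NORMAL;
}

/* Map a mode to the text values listed for the proposed column. */
static const char *
vacuum_mode_name(VacuumProgressMode mode)
{
    switch (mode)
    {
        case VACUUM_MODE_ANTI_WRAPAROUND:
            return "anti-wraparound";
        case VACUUM_MODE_FAILSAFE:
            return "failsafe";
        case VACUUM_MODE_NORMAL:
            break;
    }
    return "normal";
}
```

    Under this sketch, a manual VACUUM (FREEZE) is reported as normal
    unless it enters failsafe mode while running.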
    
    Regarding Testing:
    I was able to manually verify the failsafe mode's behavior by using
    the existing test script at
    src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl. This script
    successfully triggered the failsafe condition and the view reported
    the correct mode. However, I found this test to be somewhat flaky in
    my environment, so I did not add a failsafe test based on it to the
    patch, to avoid introducing a potentially unstable test into the tree.
    
    Thoughts?
    
    --
    Best regards,
    Shinya Kato
    NTT OSS Center