Thread

  1. [PATCH] Add sampling statistics to autoanalyze log output

    Tatsuya Kawata <kawatatatsuya0913@gmail.com> — 2025-12-06T17:39:45Z

    Hi,
    
    I would like to propose a patch to add sampling statistics to autoanalyze
    log output, addressing an inconsistency between ANALYZE VERBOSE and
    autoanalyze logging.
    
    ## Problem
    
    Currently, ANALYZE VERBOSE displays sampling statistics, but autoanalyze
    does not log this information, which makes it harder to diagnose issues
    with automatic statistics collection.
    
    Example (current behavior):
    - ANALYZE VERBOSE: Shows "INFO:  "pg_class": scanned 14 of 14 pages,
    containing 434 live rows and 11 dead rows; 434 rows in sample, 434
    estimated total rows."
    - autoanalyze: No sampling information
    
    ## Solution
    
    This patch unifies the logging output by moving the sampling statistics
    message out of acquire_sample_rows() and into do_analyze_rel()'s
    instrumentation section. With this change, both ANALYZE VERBOSE and
    autoanalyze report the same sampling information in a single
    consolidated log message.
    
    Key changes:
    1. Updated AcquireSampleRowsFunc typedef to include 4 new output parameters
    2. Modified acquire_sample_rows() and acquire_inherited_sample_rows() to
    populate these parameters
    3. Added sampling statistics output in do_analyze_rel()
    4. Updated postgres_fdw and file_fdw implementations
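    
    As a rough illustration of key changes 1 to 3, here is a standalone
    sketch. It is not the patch itself and does not use PostgreSQL headers.
    The callback type SketchAcquireFunc, the names of the four added
    parameters, and the helper sketch_report() are all hypothetical; the
    arguments that the real AcquireSampleRowsFunc takes besides these
    counters are omitted. The hard-coded numbers are the pg_class values
    from the example above.
    
    ```c
    #include <stdio.h>
    #include <stddef.h>
    
    /*
     * Simplified stand-in for the sampling callback. The last four pointer
     * arguments model the new output parameters; their names are assumptions.
     */
    typedef int (*SketchAcquireFunc) (int targrows,
                                      double *totalrows,
                                      double *totaldeadrows,
                                      unsigned *totalpages,
                                      unsigned *scannedpages,
                                      double *liverows,
                                      double *deadrows);
    
    /* Fake sampler that reports the pg_class numbers from the example. */
    static int
    sketch_acquire(int targrows,
                   double *totalrows, double *totaldeadrows,
                   unsigned *totalpages, unsigned *scannedpages,
                   double *liverows, double *deadrows)
    {
        (void) targrows;            /* unused in this toy sampler */
        *totalpages = 14;
        *scannedpages = 14;
        *liverows = 434;
        *deadrows = 11;
        *totalrows = 434;
        *totaldeadrows = 11;
        return 434;                 /* number of rows placed in the sample */
    }
    
    /* Caller side: run the sampler, then format one consolidated message. */
    static int
    sketch_report(SketchAcquireFunc acquire, int targrows,
                  char *buf, size_t buflen)
    {
        double      totalrows = 0, totaldeadrows = 0;
        double      liverows = 0, deadrows = 0;
        unsigned    totalpages = 0, scannedpages = 0;
        int         numrows;
    
        numrows = acquire(targrows, &totalrows, &totaldeadrows,
                          &totalpages, &scannedpages, &liverows, &deadrows);
    
        return snprintf(buf, buflen,
                        "sampling: scanned %u of %u pages, "
                        "containing %.0f live rows and %.0f dead rows; "
                        "%d rows in sample, %.0f estimated total rows",
                        scannedpages, totalpages, liverows, deadrows,
                        numrows, totalrows);
    }
    ```
    
    The point of the sketch is that the sampler only fills in counters, and
    the caller decides when and how to log them, so the same message can be
    emitted for both ANALYZE VERBOSE and autoanalyze.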
    
    ## Example Output
    
    After the patch (for both ANALYZE VERBOSE and autoanalyze):
    sampling: scanned 14 of 14 pages, containing 434 live rows and 11 dead
    rows; 434 rows in sample, 434 estimated total rows
    
    For inherited tables, statistics are accumulated across all children.
    
    ## Design Question
    
    For inherited tables, the current patch shows only the accumulated total.
    An alternative would be to show per-child statistics followed by the
    total. I wanted to align with do_analyze_rel()'s structure so that
    autoanalyze (autovacuum) logging is properly supported, but I haven't
    found a clean way to preserve per-child output within that structure.
    If there is a better approach that achieves both goals, I would
    appreciate any advice or suggestions.
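    
    To make the per-child alternative concrete, the following standalone
    sketch shows the kind of output it would produce: one line per child,
    then the accumulated total. It only illustrates the format and does not
    address how this would fit into do_analyze_rel(). The struct, the
    function name, and the message wording are hypothetical, and the child
    table names and numbers in the usage note are invented.
    
    ```c
    #include <stdio.h>
    #include <stddef.h>
    
    /* Hypothetical per-relation sampling counters. */
    typedef struct SketchChildStats
    {
        const char *relname;
        unsigned    scanned_pages;
        unsigned    total_pages;
        double      live_rows;
        double      dead_rows;
    } SketchChildStats;
    
    /*
     * Write one line per child followed by a "total" line into buf.
     * Returns the number of characters written; a line that does not fit
     * is dropped and formatting stops there.
     */
    static size_t
    sketch_format_children(const SketchChildStats *kids, size_t n,
                           char *buf, size_t buflen)
    {
        SketchChildStats total = {"total", 0, 0, 0.0, 0.0};
        size_t      used = 0;
    
        if (buflen == 0)
            return 0;
        buf[0] = '\0';
    
        for (size_t i = 0; i <= n; i++)
        {
            const SketchChildStats *s = (i < n) ? &kids[i] : &total;
            int         w;
    
            if (i < n)
            {
                /* accumulate this child's counters into the total */
                total.scanned_pages += s->scanned_pages;
                total.total_pages += s->total_pages;
                total.live_rows += s->live_rows;
                total.dead_rows += s->dead_rows;
            }
    
            w = snprintf(buf + used, buflen - used,
                         "%s: scanned %u of %u pages, "
                         "containing %.0f live rows and %.0f dead rows\n",
                         s->relname, s->scanned_pages, s->total_pages,
                         s->live_rows, s->dead_rows);
            if (w < 0 || (size_t) w >= buflen - used)
            {
                buf[used] = '\0';   /* drop the partial line */
                break;
            }
            used += (size_t) w;
        }
        return used;
    }
    ```
    
    With two invented children, child_a (10 of 10 pages, 300 live, 5 dead)
    and child_b (4 of 4 pages, 134 live, 6 dead), this yields two per-child
    lines and a total line reporting 14 of 14 pages, 434 live rows and 11
    dead rows.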
    
    I look forward to your feedback!
    
    Regards,