Thread

  1. Re: analyze-in-stages post upgrade questions

    Mircea Cadariu <cadariu.mircea@gmail.com> — 2025-08-06T04:01:18Z

    Hi,
    
    On 30/07/2025 12:49, Fujii Masao wrote:
    > I've started reviewing the patch since it's marked as ready for committer.
    Thanks!
    > Overall, I like the change. But I have one question: should this be treated as
    > a bug fix that we back-patch to supported branches, or is it more of
    > an improvement that should only go into master?
    I reckon it might make sense to back-patch it to previous versions, as 
    users might not upgrade always to the latest version.
    >           Only calculate statistics for use by the optimizer (no vacuum).
    > +        If that option is specified, <command>vacuumdb</command> will also
    > +        process partitioned tables.  Without that option, only the partitions
    > +        will be considered, unless a partitioned table is explicitly specified
    > +        with the <option>--table</option> option.
    >
    > This wording seems a bit out of place in the --analyze-only section,
    > since it also describes the default behavior of vacuumdb without that option.
    > Wouldn't it make more sense to move that explanation in the --table section?
    >
    > For example, we could add something like:
    >
    > ------------------
    > If no tables are specified with the --table option, vacuumdb will
    > clean all regular tables and materialized views in the connected
    > database. If --analyze-only or --analyze-in-stages is also specified,
    > it will analyze all regular tables, partitioned tables, and
    > materialized views (but not foreign tables).
    > ------------------
    Yes, agreed.
    > + /*
    > + * VACUUMing partitioned tables would be unreasonably expensive, since
    > + * that entails processing the partitions twice (once as part of the
    > + * partitioned table, once as tables in their own right) for no
    > + * benefit. But if we only ANALYZE, collecting statistics for
    > + * partitioned tables is worth the effort.
    > + */
    >
    > This is probably true. But isn't the main reason more about aligning with
    > the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb
    > docs says, "There is no effective difference between vacuuming and analyzing
    > databases via this utility and via other methods for accessing the server.",
    > so its default target objects should match: VACUUM skips partitioned tables
    > by default, while ANALYZE includes them. If that's the case, maybe the comment
    > should reflect that instead.
    I see what you mean. From that perspective, I wonder if we even need a 
    comment there at all.
    > + qr/statement:\s+ANALYZE\s+public\.parent_table/s,
    > + '--analyze_only updates statistics for partitioned tables');
    >
    > A plain space might be sufficient instead of \s+.
    > Also, I don't think the backslash before ".parent_table" is necessary.
    
    Good catch! Indeed let's simplify that to contain strictly only what's 
    necessary.
    
    Kind regards,
    
    Mircea Cadariu