Re: analyze-in-stages post upgrade questions
Mircea Cadariu <cadariu.mircea@gmail.com>
From: Mircea Cadariu <cadariu.mircea@gmail.com>
To: Fujii Masao <masao.fujii@gmail.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>,
"Zechman, Derek S" <Derek.S.Zechman@snapon.com>,
Adrian Klaver <adrian.klaver@aklaver.com>, pgsql-hackers@lists.postgresql.org
Date: 2025-08-06T04:01:18Z
Lists: pgsql-hackers
Hi, On 30/07/2025 12:49, Fujii Masao wrote: > I've started reviewing the patch since it's marked as ready for committer. Thanks! > Overall, I like the change. But I have one question: should this be treated as > a bug fix that we back-patch to supported branches, or is it more of > an improvement that should only go into master? I reckon it might make sense to back-patch it to previous versions, as users might not upgrade always to the latest version. > Only calculate statistics for use by the optimizer (no vacuum). > + If that option is specified, <command>vacuumdb</command> will also > + process partitioned tables. Without that option, only the partitions > + will be considered, unless a partitioned table is explicitly specified > + with the <option>--table</option> option. > > This wording seems a bit out of place in the --analyze-only section, > since it also describes the default behavior of vacuumdb without that option. > Wouldn't it make more sense to move that explanation in the --table section? > > For example, we could add something like: > > ------------------ > If no tables are specified with the --table option, vacuumdb will > clean all regular tables and materialized views in the connected > database. If --analyze-only or --analyze-in-stages is also specified, > it will analyze all regular tables, partitioned tables, and > materialized views (but not foreign tables). > ------------------ Yes, agreed. > + /* > + * VACUUMing partitioned tables would be unreasonably expensive, since > + * that entails processing the partitions twice (once as part of the > + * partitioned table, once as tables in their own right) for no > + * benefit. But if we only ANALYZE, collecting statistics for > + * partitioned tables is worth the effort. > + */ > > This is probably true. But isn't the main reason more about aligning with > the behavior of the underlying VACUUM and ANALYZE commands? As the vacuumdb > docs says, "There is no effective difference between vacuuming and analyzing > databases via this utility and via other methods for accessing the server.", > so its default target objects should match: VACUUM skips partitioned tables > by default, while ANALYZE includes them. If that's the case, maybe the comment > should reflect that instead. I see what you mean. From that perspective, I wonder if we even need a comment there at all. > + qr/statement:\s+ANALYZE\s+public\.parent_table/s, > + '--analyze_only updates statistics for partitioned tables'); > > A plain space might be sufficient instead of \s+. > Also, I don't think the backslash before ".parent_table" is necessary. Good catch! Indeed let's simplify that to contain strictly only what's necessary. Kind regards, Mircea Cadariu