Thread

  1. Re: POC: Parallel processing of indexes in autovacuum

    Daniil Davydov <3danissimo@gmail.com> — 2025-08-18T08:30:49Z

    Hi,
    Thank you very much for your comments!
    In this letter I'll answer both of your recent letters.
    
    On Fri, Aug 8, 2025 at 6:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
    >
    > Thank you for updating the patch. Here are some review comments.
    >
    > +        /* Release all launched (i.e. reserved) parallel autovacuum workers. */
    > +        if (AmAutoVacuumWorkerProcess())
    > +                AutoVacuumReleaseParallelWorkers(nlaunched_workers);
    > +
    >
    > We release the reserved worker in parallel_vacuum_end(). However,
    > parallel_vacuum_end() is called only once at the end of vacuum. I
    > think we need to release the reserved worker after index vacuuming or
    > cleanup, otherwise we would end up holding the reserved workers until
    > the end of vacuum even if we invoke index vacuuming multiple times.
    >
    
    Yep, you are right. It was easy to miss because typically the autovacuum
    takes only one cycle to process a table. Since both index vacuum and
    index cleanup uses the parallel_vacuum_process_all_indexes function,
    I think that both releasing and reserving should be placed there.
    
    > ---
    > +void
    > +assign_autovacuum_max_parallel_workers(int newval, void *extra)
    > +{
    > +        autovacuum_max_parallel_workers = Min(newval, max_worker_processes);
    > +}
    >
    > I don't think we need the assign hook for this GUC parameter. We can
    > internally cap the maximum value by max_worker_processes like other
    > GUC parameters such as max_parallel_maintenance_workers and
    > max_parallel_workers.
    
    Ok, I get it - we don't want to give a configuration error for no serious
    reason. Actually, we already internally capping
    autovacuum_max_parallel_workers by max_worker_processes (inside
    parallel_vacuum_compute_workers function). This is the same behavior
    as max_parallel_maintenance_workers got.
    
    I'll get rid of the assign hook and add one more capping inside autovacuum
    shmem initialization : Since max_worker_processes is PGC_POSTMASTER
    parameter, av_freeParallelWorkers must not exceed its value.
    
    >
    > ---+        /* Refresh autovacuum_max_parallel_workers paremeter */
    > +        CHECK_FOR_INTERRUPTS();
    > +        if (ConfigReloadPending)
    > +        {
    > +                ConfigReloadPending = false;
    > +                ProcessConfigFile(PGC_SIGHUP);
    > +        }
    > +
    > +        LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
    > +
    > +        /*
    > +         * If autovacuum_max_parallel_workers parameter was reduced during
    > +         * parallel autovacuum execution, we must cap available
    > workers number by
    > +         * its new value.
    > +         */
    > +        AutoVacuumShmem->av_freeParallelWorkers =
    > +                Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
    > +                        autovacuum_max_parallel_workers);
    > +
    > +        LWLockRelease(AutovacuumLock);
    >
    > I think another race condition could occur; suppose
    > autovacuum_max_parallel_workers is set to '5' and one autovacuum
    > worker reserved 5 workers, meaning that
    > AutoVacuumShmem->av_freeParallelWorkers is 0. Then, the user changes
    > autovacuum_max_parallel_workers to 3 and reloads the conf file right
    > after the autovacuum worker checks the interruption. The launcher
    > processes calls adjust_free_parallel_workers() but
    > av_freeParallelWorkers remains 0, and the autovacuum worker increments
    > it by 5 as its autovacuum_max_parallel_workers value is still 5.
    >
    
    I think this problem can be solved if we put AutovacuumLock acquiring
    before processing the config file, but I understand that this is a bad way.
    
    > I think that we can have the autovacuum_max_parallel_workers value on
    > shmem, and only the launcher process can modify its value if the GUC
    > is changed. Autovacuum workers simply increase or decrease the
    > av_freeParallelWorkers within the range of 0 and the
    > autovacuum_max_parallel_workers value on shmem. When changing
    > autovacuum_max_parallel_workers and av_freeParallelWorkers values on
    > shmem, the launcher process calculates the number of workers reserved
    > at that time and calculate the new av_freeParallelWorkers value by
    > subtracting the new autovacuum_max_parallel_workers by the number of
    > reserved workers.
    >
    
    Good idea, I agree. Replacing the GUC parameter with the variable in shmem
    leaves the current logic of free workers management unchanged. Essentially,
    this  is the same solution as I described above, but we are holding lock not
    during  config reloading, but during a simple value check. It makes much
    more sense.
    
    > ---
    > +AutoVacuumReserveParallelWorkers(int nworkers)
    > +{
    > +   int         can_launch;
    >
    > How about renaming it to 'nreserved' or something? can_launch looks
    > like it's a boolean variable to indicate whether the process can
    > launch workers.
    >
    
    There are no objections.
    
    On Fri, Aug 15, 2025 at 3:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
    >
    > While testing the patch, I found there are other two problems:
    >
    > 1. when an autovacuum worker who reserved workers fails with an error,
    > the reserved workers are not released. I think we need to ensure that
    > all reserved workers are surely released at the end of vacuum even
    > with an error.
    >
    
    Agree. I'll add a try/catch block to the parallel_vacuum_process_all_indexes
    (the only place where we are reserving workers).
    
    > 2. when an autovacuum worker (not parallel vacuum worker) who uses
    > parallel vacuum gets SIGHUP, it errors out with the error message
    > "parameter "max_stack_depth" cannot be set during a parallel
    > operation". Autovacuum checks the configuration file reload in
    > vacuum_delay_point(), and while reloading the configuration file, it
    > attempts to set max_stack_depth in
    > InitializeGUCOptionsFromEnvironment() (which is called by
    > ProcessConfigFileInternal()). However, it cannot change
    > max_stack_depth since the worker is in parallel mode but
    > max_stack_depth doesn't have GUC_ALLOW_IN_PARALLEL flag. This doesn't
    > happen in regular backends who are using parallel queries because they
    > check the configuration file reload at the end of each SQL command.
    >
    
    Hm, this is a really serious problem. I see only two ways to solve it (both are
    not really good) :
    1)
    Do not allow processing of the config file during parallel autovacuum
    execution.
    2)
    Teach the autovacuum to enter parallel mode only during the index vacuum/cleanup
    phase. I'm a bit wary about it, because the design says that we should
    be in parallel
    mode during the whole parallel operation. But actually, if we can make
    sure that all
    launched workers are exited, I don't see reasons, why can't we just
    exit parallel mode
    at the end of parallel_vacuum_process_all_indexes.
    
    What do you think about it? By now, I haven't made any changes related
    to this problem.
    
    Again, thank you for the review. Please, see v10 patches (only 0001
    has been changed) :
    1) Reserve and release workers only inside parallel_vacuum_process_all_indexes.
    2) Add try/catch block to the parallel_vacuum_process_all_indexes, so we can
    release workers even after an error. This required adding a static
    variable to account
    for the total number of reserved workers (av_nworkers_reserved).
    3) Cap autovacuum_max_parallel_workers by max_worker_processes only inside
    autovacuum code. Assign hook has been removed.
    4) Use shmem value for determining the maximum number of parallel autovacuum
    workers (eliminate race condition between launcher and leader process).
    
    --
    Best regards,
    Daniil Davydov