Thread

  1. Re: Adding REPACK [concurrently]

    Alvaro Herrera <alvherre@alvh.no-ip.org> — 2025-10-10T14:11:32Z

    Hello,
    
    Here's patch v24.  I was hoping to push this today, but I think there
    were too many changes from v23 for that.  Here's what I did:
    
    - pg_stat_progress_cluster is no longer a view on top of the low-level
      pg_stat_get_progress_info() function.  Instead, it's a view on top of
      pg_stat_progress_repack.  The only change it applies on top of that
      one is change the command from REPACK to one of VACUUM FULL or
      CLUSTER, depending on whether an index is being used or not.  This
      should keep the behavior identical to previous versions.
      Alternatively we could just hide rows where the command is REPACK, but
      I don't think that would be any better.  This way, we maintain
      compatibility with tools reading pg_stat_progress_cluster.  Maybe this
      is useless and we should just drop the view, not sure, we can discuss
      separately.
    
    - pg_stat_progress_repack itself now shows the command.  Also I got rid
      of the separate enum values for the command, and instead used the
      values from the parse node (RepackCommand); this removes about a dozen
      lines of C code.  To forestall potentially bogus usage of value 0, I
      made the enum start from 1.
    
    - I noticed that you can do "CLUSTER pg_class ON some_index" and it will
      happily modify pg_index.indisclustered, which is a bit weird
      considering that allow_system_table_mods is off -- if you later try
      ALTER TABLE .. SET WITHOUT CLUSTER, it won't let you.  I think this is
      bogus and we should change it so that CLUSTER refuses to change the
      clustered index on a system catalog, unless allow_system_table_mods is
      on.  However, that would be a change from longstanding behavior which
      is specifically tested for in regression tests, so I didn't do it.
      We can discuss such a change separately.  But I did make REPACK refuse
      to do that, because we don't need to propagate bogus historical
      behavior.  So REPACK will fail if you try to change the indisclustered
      index, but it will work fine if you repack based on the same index as
      before, or repack with no index.
    
    - pg_repackdb: if you try with a non-superuser without specifying a
      table name, it will fail as soon as it hits the first catalog table or
      whatever with "ERROR: cannot lock this table".  This is sorta fine for
      vacuumdb, but only because VACUUM itself will instead say "WARNING:
      cannot lock table XYZ, skipping", so it's not an error and vacuumdb
      keeps running.  IMO this is bogus: vacuumdb should not try to process
      tables that it doesn't have privileges to.  However, not wanting to
      change longstanding behavior, I left that alone.  For pg_repackdb, I
      added a condition in the WHERE clause there to only fetch tables that
      the current user has MAINTAIN privilege over.  Then you can do a
      "pg_repackdb -U foobar" and it will nicely process the tables that
      that user is allowed to process.  We can discuss changing the vacuumdb
      behavior separately.
    
    - Added some additional tests for pg_repackdb and REPACK.
    
    - Updated the docs.
    
    -- 
    Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/