Re: Adding REPACK [concurrently]

From: Alvaro Herrera <alvherre@alvh.no-ip.org>
To: Amit Kapila <amit.kapila16@gmail.com>
Cc: Antonin Houska <ah@cybertec.at>, Mihail Nikalayeu <mihailnikalayeu@gmail.com>, Andres Freund <andres@anarazel.de>, Srinath Reddy Sadipiralla <srinath2133@gmail.com>, Matthias van de Meent <boekewurm+postgres@gmail.com>, Pg Hackers <pgsql-hackers@lists.postgresql.org>, Robert Treat <rob@xzilla.net>
Date: 2026-05-13T16:58:14Z
Lists: pgsql-hackers

Hello Amit,

On 2026-May-13, Amit Kapila wrote:

> So now the question is where do we go from here. I am not confident
> that the current code to achieve db-specific snapshots in logical
> decoding is the best possible solution both because of the drawbacks
> (like we won't be able to enable this on standby) and inefficiencies
> pointed out by me in this and previous emails in this work.

This is a fair question.  I don't think we have time to go much further
on this aspect before beta 1, so we either accept this patch, fix the
inefficiencies you pointed out and keep db-specific snapshots, or we
revert db-specific snapshots and go back to the standard snapshot-taking
technique for REPACK in 19 and see what we can improve for 20.

Now, the worst consequence of reverting db-specific snapshots is that
you will only be able to run REPACK in a single database at a time
(because any subsequent REPACK will have to wait until the first one
finishes before being able to get its snapshot).  In most normal cases
this is probably not a big deal.  But if you have a multitenant system,
and you want your users to be able to run REPACK on their tables, you
may be a bit screwed.  So I hesitate to just go and revert it without
offering those people any alternative.

(It's also possible that being unable to run more than one REPACK at a
time is not so big a deal.  After all, it's supposed to be an infrequent
operation.  And users probably don't or shouldn't have multi-terabyte
tables in multitenant databases anyway.)

I'm not sure I understand your point about the standby.  I mean, you can't
run REPACK on a standby anyway, so I don't see this as a very
problematic restriction.  Do you have other reasons for wanting a
db-specific snapshot in a standby?

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/