Thread

  1. A concurrent VACUUM FULL?

    Erik Nordström <erik@timescale.com> — 2025-06-30T09:49:01Z

    Hi hackers,
    
    I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is
    possible to do a concurrent version of it using a multi-transactional
    approach similar to concurrent reindexing and partition detach.
    
    The idea would be to hold weaker locks in TX1 when doing the heap rewrite
    (essentially allow reads but prevent writes), and then do the actual heap
    swap in a second TX2 transaction.
    
    Having experimented a bit with this approach, I found that reindexing is an
    issue because that happens after the new heap has been swapped in. The
    current reindex during a heap swap effectively blocks reads so if one
    starts a new transaction after swapping heaps, it will block reads for a
    long time.
    
    This made me think about two ways to handle this:
    
    1. Rebuild indexes on the temporary heap in TX1 and then swap in the new
    indexes along with the new heap in TX2.
    
    2. Do a concurrent index rebuild after the heap swap.
    
    Of the two approaches above, (2) seems easiest to implement, but the
    downside is that indexes would be invalid while indexes are rebuilt.
    Therefore, (1) seems to be the more desirable one because all the heavy
    lifting would be done in TX1 on the temporary heap.
    
    Does anyone have a sense of whether approach (1) is feasible or whether
    there are any major blockers?
    
    Is this worth pursuing at all or am I missing something?
    
    Best regards,
    
    Erik
    -- 
    Database Architect, Timescale
    
  2. Re: A concurrent VACUUM FULL?

    Antonin Houska <ah@cybertec.at> — 2025-06-30T10:03:42Z

    Erik Nordström <erik@timescale.com> wrote:
    
    > Hi hackers,
    > 
    > I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a
    > multi-transactional approach similar to concurrent reindexing and partition detach.
    > 
    > The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the
    > actual heap swap in a second TX2 transaction.
    
    Patch [1] is in the queue that allows both reads and writes. (An exclusive
    lock is acquired here for the swaps, but that should be held for very short
    time.)
    
    -- 
    Antonin Houska
    Web: https://www.cybertec-postgresql.com
    
    [1] https://commitfest.postgresql.org/patch/5117/
    
    
    
    
  3. Re: A concurrent VACUUM FULL?

    Erik Nordström <erik@timescale.com> — 2025-06-30T11:29:49Z

    On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
    
    > Erik Nordström <erik@timescale.com> wrote:
    >
    > > Hi hackers,
    > >
    > > I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is
    > possible to do a concurrent version of it using a
    > > multi-transactional approach similar to concurrent reindexing and
    > partition detach.
    > >
    > > The idea would be to hold weaker locks in TX1 when doing the heap
    > rewrite (essentially allow reads but prevent writes), and then do the
    > > actual heap swap in a second TX2 transaction.
    >
    > Patch [1] is in the queue that allows both reads and writes. (An exclusive
    > lock is acquired here for the swaps, but that should be held for very short
    > time.)
    >
    >
    That sounds great. Do you know if there's anything I can do to help?
    
    - Erik
    
    
    > --
    > Antonin Houska
    > Web: https://www.cybertec-postgresql.com
    >
    > [1] https://commitfest.postgresql.org/patch/5117/
    >
    
  4. Re: A concurrent VACUUM FULL?

    Álvaro Herrera <alvherre@kurilemu.de> — 2025-06-30T11:46:18Z

    On 2025-Jun-30, Erik Nordström wrote:
    
    > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
    
    > > Patch [1] is in the queue that allows both reads and writes. (An exclusive
    > > lock is acquired here for the swaps, but that should be held for very short
    > > time.)
    >
    > That sounds great. Do you know if there's anything I can do to help?
    
    It would be very valuable if you can review the code, test it under the
    weirdest conditions you can imagine or just under normal conditions,
    proof-read the documentation, try to see if anything is missing that
    should be there, and so on.  Everything that you would expect from a new
    feature released as part of the next Postgres release.  Any
    problems/crashes/ abnormalities that you report before the patch is
    included in Postgres, is one less issue that we'll have to deal with
    after the release.
    
    -- 
    Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
    "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
    
    
    
    
  5. Re: A concurrent VACUUM FULL?

    Erik Nordström <erik@timescale.com> — 2025-06-30T11:55:46Z

    On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
    
    > On 2025-Jun-30, Erik Nordström wrote:
    >
    > > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
    >
    > > > Patch [1] is in the queue that allows both reads and writes. (An
    > exclusive
    > > > lock is acquired here for the swaps, but that should be held for very
    > short
    > > > time.)
    > >
    > > That sounds great. Do you know if there's anything I can do to help?
    >
    > It would be very valuable if you can review the code, test it under the
    > weirdest conditions you can imagine or just under normal conditions,
    > proof-read the documentation, try to see if anything is missing that
    > should be there, and so on.  Everything that you would expect from a new
    > feature released as part of the next Postgres release.  Any
    > problems/crashes/ abnormalities that you report before the patch is
    > included in Postgres, is one less issue that we'll have to deal with
    > after the release.
    >
    >
    I'll do my best to test the feature. One question I have, though, is why
    not start with supporting concurrent reads but not writes? That would
    already be a win and make the patch simpler.
    
    Best,
    
    - Erik
    
    
    > --
    > Álvaro Herrera         PostgreSQL Developer  —
    > https://www.EnterpriseDB.com/
    > "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
    >
    
  6. Re: A concurrent VACUUM FULL?

    Antonin Houska <ah@cybertec.at> — 2025-06-30T12:03:38Z

    Erik Nordström <erik@timescale.com> wrote:
    
    > On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
    > 
    >  On 2025-Jun-30, Erik Nordström wrote:
    > 
    >  > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at> wrote:
    > 
    >  > > Patch [1] is in the queue that allows both reads and writes. (An exclusive
    >  > > lock is acquired here for the swaps, but that should be held for very short
    >  > > time.)
    >  >
    >  > That sounds great. Do you know if there's anything I can do to help?
    > 
    >  It would be very valuable if you can review the code, test it under the
    >  weirdest conditions you can imagine or just under normal conditions,
    >  proof-read the documentation, try to see if anything is missing that
    >  should be there, and so on.  Everything that you would expect from a new
    >  feature released as part of the next Postgres release.  Any
    >  problems/crashes/ abnormalities that you report before the patch is
    >  included in Postgres, is one less issue that we'll have to deal with
    >  after the release.
    > 
    > I'll do my best to test the feature.
    
    Thanks. I've noticed that the patch set needs rebase. I'll try to prepare a
    new version today.
    
    > One question I have, though, is why not start with supporting concurrent reads but not writes? That would
    > already be a win and make the patch simpler.
    
    It occurred to me at some point too, but I think it would be rather a
    different implementation. So if we were to support both read-only and
    read-write modes, the amount of code would be even higher.
    
    -- 
    Antonin Houska
    Web: https://www.cybertec-postgresql.com
    
    
    
    
  7. Re: A concurrent VACUUM FULL?

    wenhui qiu <qiuwenhuifx@gmail.com> — 2025-06-30T13:46:32Z

    HI Erik Nordström
         In online production environments, blocking writes is generally
    unacceptable in most cases. The only acceptable approach is to allow
    concurrent read/write operations, with brief locks permitted only during
    the final steps of the process. We can see pg-osc's implementation (
    https://github.com/shayonj/pg-osc) for a non-blocking approach to VACUUM
    FULL operations."
    
    <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
    无病毒。www.avg.com
    <http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
    <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
    
    On Mon, Jun 30, 2025 at 8:03 PM Antonin Houska <ah@cybertec.at> wrote:
    
    > Erik Nordström <erik@timescale.com> wrote:
    >
    > > On Mon, Jun 30, 2025 at 1:46 PM Álvaro Herrera <alvherre@kurilemu.de>
    > wrote:
    > >
    > >  On 2025-Jun-30, Erik Nordström wrote:
    > >
    > >  > On Mon, Jun 30, 2025 at 12:03 PM Antonin Houska <ah@cybertec.at>
    > wrote:
    > >
    > >  > > Patch [1] is in the queue that allows both reads and writes. (An
    > exclusive
    > >  > > lock is acquired here for the swaps, but that should be held for
    > very short
    > >  > > time.)
    > >  >
    > >  > That sounds great. Do you know if there's anything I can do to help?
    > >
    > >  It would be very valuable if you can review the code, test it under the
    > >  weirdest conditions you can imagine or just under normal conditions,
    > >  proof-read the documentation, try to see if anything is missing that
    > >  should be there, and so on.  Everything that you would expect from a new
    > >  feature released as part of the next Postgres release.  Any
    > >  problems/crashes/ abnormalities that you report before the patch is
    > >  included in Postgres, is one less issue that we'll have to deal with
    > >  after the release.
    > >
    > > I'll do my best to test the feature.
    >
    > Thanks. I've noticed that the patch set needs rebase. I'll try to prepare a
    > new version today.
    >
    > > One question I have, though, is why not start with supporting concurrent
    > reads but not writes? That would
    > > already be a win and make the patch simpler.
    >
    > It occurred to me at some point too, but I think it would be rather a
    > different implementation. So if we were to support both read-only and
    > read-write modes, the amount of code would be even higher.
    >
    > --
    > Antonin Houska
    > Web: https://www.cybertec-postgresql.com
    >
    >
    >
    
  8. Re: A concurrent VACUUM FULL?

    DINESH NAIR <dinesh_nair@iitmpravartak.net> — 2025-06-30T15:23:13Z

    Hi Eric,
    
    Currently the first suggested approach "Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2."  sound good.
    
    It would be great if we are able to perform concurrent reads and writes.
    In OLTP environments will it lead to slowing of the queries or query performance issues !!!!
    
    
    Thanks
    
    Dinesh Nair
    
    
    ________________________________
    From: Erik Nordström <erik@timescale.com>
    Sent: Monday, June 30, 2025 3:19 PM
    To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
    Subject: A concurrent VACUUM FULL?
    
    You don't often get email from erik@timescale.com. Learn why this is important<https://aka.ms/LearnAboutSenderIdentification>
    Caution: This email was sent from an external source. Please verify the sender’s identity before clicking links or opening attachments.
    Hi hackers,
    
    I've been looking at the code for CLUSTER/VACUUM FULL, and whether it is possible to do a concurrent version of it using a multi-transactional approach similar to concurrent reindexing and partition detach.
    
    The idea would be to hold weaker locks in TX1 when doing the heap rewrite (essentially allow reads but prevent writes), and then do the actual heap swap in a second TX2 transaction.
    
    Having experimented a bit with this approach, I found that reindexing is an issue because that happens after the new heap has been swapped in. The current reindex during a heap swap effectively blocks reads so if one starts a new transaction after swapping heaps, it will block reads for a long time.
    
    This made me think about two ways to handle this:
    
    1. Rebuild indexes on the temporary heap in TX1 and then swap in the new indexes along with the new heap in TX2.
    
    2. Do a concurrent index rebuild after the heap swap.
    
    Of the two approaches above, (2) seems easiest to implement, but the downside is that indexes would be invalid while indexes are rebuilt. Therefore, (1) seems to be the more desirable one because all the heavy lifting would be done in TX1 on the temporary heap.
    
    Does anyone have a sense of whether approach (1) is feasible or whether there are any major blockers?
    
    Is this worth pursuing at all or am I missing something?
    
    Best regards,
    
    Erik
    --
    Database Architect, Timescale
    
  9. Re: A concurrent VACUUM FULL?

    Álvaro Herrera <alvherre@kurilemu.de> — 2025-06-30T16:37:53Z

    On 2025-Jun-30, DINESH  NAIR wrote:
    
    > In OLTP environments will it lead to slowing of the queries or query
    > performance issues !!!!
    
    Sure, to some extent, but ideally you wouldn't use it in a recurring
    fashion but only as an emergency solution out of a really serious bloat
    problem (so it's not something you should have impacting your production
    in a recurring fashion); also, performance should improve for the system
    overall, comparing to the state before compacting the table.
    
    I suggest you try pg_squeeze (a single run of it in a table, not
    scheduled runs) and report back how the system performs for you in the
    period when it is executing.  I expect that the impact of REPACK is
    going to be largely the same as that of pg_squeeze, because the
    implementation is very similar.
    
    -- 
    Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
    Y una voz del caos me habló y me dijo
    "Sonríe y sé feliz, podría ser peor".
    Y sonreí. Y fui feliz.
    Y fue peor.