Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Matthias van de Meent <boekewurm+postgres@gmail.com>
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
To: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Cc: Sergey Sargsyan <sergey.sargsyan.2001@gmail.com>, Álvaro Herrera <alvherre@kurilemu.de>, Andres Freund <andres@anarazel.de>, Michael Paquier <michael@paquier.xyz>, PostgreSQL Hackers <pgsql-hackers@postgresql.org>, Andrey Borodin <amborodin86@gmail.com>, Melanie Plageman <melanieplageman@gmail.com>
Date: 2025-11-28T16:57:55Z
Lists: pgsql-hackers
On Fri, 28 Nov 2025 at 15:50, Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:
>
> Hello!
>
> On Thu, Nov 27, 2025 at 9:07 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > While it might not break, and might not hold back other tables'
> > visibility horizons, it'll still hold back pruning on the table we're
> > acting on, and that's likely one which already had bloat issues if
> > you're running RIC (or REPACK).
>
> Yes, a good point about REPACK, agreed.
>
> BTW, what is about using the same reset snapshot technique for REPACK also?
>
> I thought it is impossible, but what if we:
>
> * while reading the heap we "remember" our current page position into
> shared memory
> * preserve all xmin/max/cid into newly created repacked table (we need
> it for MVCC-safe approach anyway)
> * in logical decoding layer - we check TID of our tuple and looking at
> "current page" we may correctly decide what to do with at apply phase:
>
> - if it in "non-yet read pages" - ignore (we will read it later) - but
> signal scan to ensure it will reset snapshot before that page
> (reset_before = min(reset_before, tid))
> - if it in "already read pages" - remember the apply operation (with
> exact target xmin/xmax and resulting xmin/xmax)

Yes, exactly - keep track of which snapshot was used for which part of
the table, and all updates that add/remove tuples from the scanned
range after that snapshot are considered inserts/deletes, similar to
how it'd work if LR had a filter on `ctid BETWEEN '(0, 0)' AND
'(end-of-snapshot-scan)'` which then gets updated every so often.

I'm a bit worried, though, that LR may lose updates due to commit
order differences between WAL and PGPROC. I don't know how that's
handled in logical decoding, and can't find much literature about it
in the repo either.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
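
For illustration only, the decision rule discussed above (compare each decoded change's TID against the scan position, queue changes that touch already-read pages, and ask the scan to reset its snapshot before it reaches a not-yet-read page that changed) could be sketched roughly as below. This is a toy Python model, not PostgreSQL code: `ScanState`, `on_change`, `advance`, and the `pending` list are hypothetical names, block numbers stand in for full TIDs when tracking `reset_before`, and xmin/xmax bookkeeping, locking, and commit ordering are all left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Tid = Tuple[int, int]  # (block number, offset), in the spirit of a ctid


@dataclass
class ScanState:
    """Toy stand-in for the shared-memory scan position (hypothetical)."""
    next_block: int = 0                 # blocks below this were already read
    reset_before: Optional[int] = None  # lowest unread block that changed
    pending: List[Tuple[str, Tid]] = field(default_factory=list)

    def in_scanned_range(self, tid: Tid) -> bool:
        # Same idea as a filter on ctid between (0, 0) and the scan position.
        return tid[0] < self.next_block

    def on_change(self, old_tid: Optional[Tid], new_tid: Optional[Tid]) -> None:
        """Classify one decoded change.

        old_tid is None for an INSERT, new_tid is None for a DELETE; an
        UPDATE carries both and is treated as a delete plus an insert.
        """
        for kind, tid in (("delete", old_tid), ("insert", new_tid)):
            if tid is None:
                continue
            if self.in_scanned_range(tid):
                # Already-read page: remember the operation for the apply phase.
                self.pending.append((kind, tid))
            else:
                # Not-yet-read page: ignore the change itself, but record
                # reset_before = min(reset_before, block of tid).
                block = tid[0]
                if self.reset_before is None or block < self.reset_before:
                    self.reset_before = block

    def advance(self) -> bool:
        """Read the next block; return True if the snapshot had to be
        reset before reading it."""
        must_reset = (self.reset_before is not None
                      and self.reset_before <= self.next_block)
        if must_reset:
            self.reset_before = None  # a fresh snapshot sees those changes
        self.next_block += 1
        return must_reset
```

In this model an UPDATE that moves a tuple from an already-read page to an unread one is queued as a delete only; the new version is picked up by the scan itself once the snapshot has been reset. Whether decoding observes changes in an order that keeps this safe is exactly the open question about commit order raised above, and the sketch does not address it.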