Thread
Commits
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Add options to control whether VACUUM runs vac_update_datfrozenxid.
- a46a7011b271 16.0 landed
-
Use catalog query to discover tables to process in vacuumdb
- e0c2933a767c 12.0 cited
-
BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
PG Bug reporting form <noreply@postgresql.org> — 2022-12-13T10:57:39Z
The following bug has been logged on the website: Bug reference: 17717 Logged by: Gunnar L Email address: postgresql@taljaren.se PostgreSQL version: 15.0 Operating system: Ubuntu Linux Description: We have observed a significant slowdown in vacuumdb performance between different versions of postgresql. And possibly also a memory issue. We run a specific data model, where each customer has its own schema with its own set of tables. Each database server hosts 16 databases, each containing around 250 customer schemas. Due to postgres creating a new file for each database object, we end up with around 5 million files on each database server. This may or may not be related to the issue we're seeing (new algorithms with new time complexity?) We upgraded from postgresql 9.5 to postgresql 13, and noticed a significant slowdown in how vacuumdb performs. Before, we could run a vacuumdb -a -z each night, taking around 2 hours to complete. After the upgrade, we see a constant 100% CPU utilization during the vacuumdb process (almost no I/O activity), and vacuumdb cannot complete within a reasonable time. We're able to vacuum about 3-4 databases each night. We are able to recreate this issue, using a simple bash script to generate a similar setup. From local testing, here are our findings: Concerning speed: * Version 9.5, 10, 11 are fast (9.5 slower than 10 and 11) * Version 12, 13, 14 are very, very slow * Version 15 is faster (a lot faster than 12,13,14) but not nearly as fast as 10 or 11. Concerning memory usage: * Version 15 is using a lot more shared memory OR it might not be releasing it properly after vacuuming a db. These are the timings for vacuuming the 16 dbs. Version Seconds Completed ------------------------------ 9.5 412 16/16 10 178 16/16 11 166 16/16 12 8319 1/16 or 2/16 (manually aborted) 13 18853 3/16 or 4/16 (manually aborted) 14 16857 3/16 or 4/16 (manually aborted) 15 617 1/16 (crashed!) 15 4158 6/16 (crashed! --shm-size=256mb) 15 9500 16/16 (--shm-size=4096mb) The timing of the only successful run for postgres 15 is somewhat flaky, since the machine was suspended for about 1-1.5 hours so 9500 is only an estimate, but the first run (1 db completed in 10 minutes) gives that it is faster than 12-14 but slower than 10 and 11 (3 minutes to complete everything) The following describes our setup This is the script (called setup.sh) we’re using to populate the databases (we give a port number as parameter) ##### start of setup.sh export PGPASSWORD=mysecretpassword PORT=$1 echo ""> tables_$PORT.sql for schema in `seq -w 1 250`; do echo "create schema schema$schema;" >> tables_$PORT.sql for table in `seq -w 1 500`; do echo "create table schema$schema.table$table (id int);" >> tables_$PORT.sql done done echo "Setting up db: 01" createdb -h localhost -U postgres -p $PORT db01 psql -q -h localhost -U postgres -p $PORT db01 -f tables_$PORT.sql # This seems to be the fastest way to create the databases for db in `seq -w 2 16`; do echo "Setting up db: $db" createdb -h localhost -U postgres -p $PORT --template db01 db$db done ####### end of setup.sh To execute a test for a particular postgres version (in this example PG 9.5), we run the following. It will setup PG 9.5 on port 15432. docker run --rm --name pg95 -e POSTGRES_PASSWORD=mysecretpassword -p 15432:5432 -d postgres:9.5 ./setup.sh 15432 date; time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"; date (The date commands are added to keep track of when tasks were started). Here are complete set of commands and output and comments (We use different ports for different versions of PG) date; time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"; date (The date commands since it takes some time to run) time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 6m52,070s user 0m0,048s sys 0m0,029s time docker exec -it pg10 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 2m58,354s user 0m0,043s sys 0m0,013s time docker exec -it pg11 bash -c "vacuumdb -a -z -U postgres" vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 2m46,181s user 0m0,047s sys 0m0,012s date; time docker exec -it pg12 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 18:57:43 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" ^CCancel request sent vacuumdb: error: vacuuming of table "schema241.table177" in database "db02" failed: ERROR: canceling statement due to user request real 138m39,600s user 0m0,177s sys 0m0,418s lör 10 dec 2022 21:16:22 CET date;time docker exec -it pg13 bash -c "vacuumdb -a -z -U postgres" lör 10 dec 2022 07:22:32 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: vacuuming database "db03" vacuumdb: vacuuming database "db04" ^CCancel request sent real 314m13,172s user 0m0,551s sys 0m0,663s lör 10 dec 2022 12:37:03 CET date;time docker exec -it pg14 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 14:15:37 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: vacuuming database "db03" vacuumdb: vacuuming database "db04" ^CCancel request sent real 280m57,172s user 0m0,586s sys 0m0,559s lör 10 dec 2022 18:56:34 CET date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date lör 10 dec 2022 12:50:25 CET vacuumdb: vacuuming database "db01" vacuumdb: vacuuming database "db02" vacuumdb: error: processing of database "db02" failed: ERROR: could not resize shared memory segment "/PostgreSQL.2952321776" to 27894720 bytes: No space left on device real 10m17,913s user 0m0,030s sys 0m0,049s lör 10 dec 2022 13:00:43 CET # it was faster, but we need to extend shared memory to make it work docker run --rm --name pg15 --shm-size=256mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15 date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date mån 12 dec 2022 08:56:17 CET vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db07" vacuumdb: error: processing of database "db07" failed: ERROR: could not resize shared memory segment "/PostgreSQL.1003084622" to 27894720 bytes: No space left on device real 69m18,345s user 0m0,217s sys 0m0,086s mån 12 dec 2022 10:05:36 CET docker run --rm --name pg15 --shm-size=4096mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15 date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date mån 12 dec 2022 11:16:11 CET vacuumdb: vacuuming database "db01" …<snip>... vacuumdb: vacuuming database "db16" vacuumdb: vacuuming database "postgres" vacuumdb: vacuuming database "template1" real 232m46,168s user 0m0,227s sys 0m0,467s mån 12 dec 2022 15:08:57 CET Here is the hardware that was used AMD Ryzen 7 PRO 5850U with Radeon Graphics 8 Cores, 16 threads $ free total used free shared buff/cache available Mem: 28562376 5549716 752624 1088488 22260036 21499752 Swap: 999420 325792 673628 Disk: NVMe device, Samsung SSD 980 1TB -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-15T18:56:30Z
PG Bug reporting form <noreply@postgresql.org> writes: > We run a specific data model, where each customer has its own schema with > its own set of tables. Each database server hosts 16 databases, each > containing around 250 customer schemas. Due to postgres creating a new file > for each database object, we end up with around 5 million files on each > database server. This may or may not be related to the issue we're seeing > (new algorithms with new time complexity?) > We upgraded from postgresql 9.5 to postgresql 13, and noticed a significant > slowdown in how vacuumdb performs. Before, we could run a vacuumdb -a -z > each night, taking around 2 hours to complete. After the upgrade, we see a > constant 100% CPU utilization during the vacuumdb process (almost no I/O > activity), and vacuumdb cannot complete within a reasonable time. We're able > to vacuum about 3-4 databases each night. I poked into this a little bit. On HEAD, watching things with "perf" identifies vac_update_datfrozenxid() as the main time sink. It's not hard to see why: that does a seqscan of pg_class, and it's invoked at the end of each vacuum() call. So if you try to vacuum each table in the DB separately, you're going to end up spending O(N^2) time in often-useless rescans of pg_class. This isn't a huge problem in ordinary-sized DBs, but with 125000 small tables in the DB it becomes the dominant cost. > Concerning speed: > * Version 9.5, 10, 11 are fast (9.5 slower than 10 and 11) > * Version 12, 13, 14 are very, very slow > * Version 15 is faster (a lot faster than 12,13,14) but not nearly as fast > as 10 or 11. The reason for the v12 performance change is that up through v11, "vacuumdb -a -z" would just issue "VACUUM (ANALYZE);" in each DB. So vac_update_datfrozenxid only ran once. Beginning in v12 (commit e0c2933a7), vacuumdb issues a separate VACUUM command for each targeted table, which causes the problem. I'm not sure why there's a performance delta from 14 to 15. It doesn't look like vacuumdb itself had any material changes, so we must have done something different on the backend side. This may indicate that there's another O(N^2) behavior that we got rid of in v15. Anyway, that change isn't bad, so I did not poke into it too much. Conclusions: * As a short-term fix, you could try using vacuumdb from v11 with the newer servers. Or just do "psql -c 'vacuum analyze'" and not bother with vacuumdb at all. (On HEAD, with this example database, 'vacuum analyze' takes about 7 seconds per DB for me, versus ~10 minutes using vacuumdb.) * To fix vacuumdb properly, it might be enough to get it to batch VACUUMs, say by naming up to 1000 tables per command instead of just one. I'm not sure how that would interact with its parallelization logic, though. It's not really solving the O(N^2) issue either, just pushing it further out. * A better idea, though sadly not very back-patchable, could be to expose a VACUUM option to control whether it runs vac_update_datfrozenxid, so that vacuumdb can do that just once at the end. Considering that vac_update_datfrozenxid requires an exclusive lock, the current behavior is poison for parallel vacuuming quite aside from the O(N^2) issue. This might tie into some work Peter G. has been pursuing, too. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Peter Geoghegan <pg@bowt.ie> — 2022-12-15T20:06:57Z
On Thu, Dec 15, 2022 at 10:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > * A better idea, though sadly not very back-patchable, could > be to expose a VACUUM option to control whether it runs > vac_update_datfrozenxid, so that vacuumdb can do that just > once at the end. Considering that vac_update_datfrozenxid > requires an exclusive lock, the current behavior is poison for > parallel vacuuming quite aside from the O(N^2) issue. This > might tie into some work Peter G. has been pursuing, too. That sounds like a good idea to me. But do we actually need a VACUUM option for this? I wonder if we could get away with having the VACUUM command never call vac_update_datfrozenxid(), except when run in single-user mode. It would be nice to make pg_xact/clog truncation autovacuum's responsibility. Autovacuum already does things differently to the VACUUM command, and for reasons that seem related to this complaint about vacuumdb. Besides, autovacuum is already on the hook to call vac_update_datfrozenxid() for the benefit of databases that haven't actually been vacuumed, per the do_autovacuum() comments right above its vac_update_datfrozenxid() call. -- Peter Geoghegan
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-15T21:57:16Z
Peter Geoghegan <pg@bowt.ie> writes: > On Thu, Dec 15, 2022 at 10:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> * A better idea, though sadly not very back-patchable, could >> be to expose a VACUUM option to control whether it runs >> vac_update_datfrozenxid, so that vacuumdb can do that just >> once at the end. Considering that vac_update_datfrozenxid >> requires an exclusive lock, the current behavior is poison for >> parallel vacuuming quite aside from the O(N^2) issue. This >> might tie into some work Peter G. has been pursuing, too. > That sounds like a good idea to me. But do we actually need a VACUUM > option for this? I wonder if we could get away with having the VACUUM > command never call vac_update_datfrozenxid(), except when run in > single-user mode. It would be nice to make pg_xact/clog truncation > autovacuum's responsibility. I could get behind manual VACUUM not invoking vac_update_datfrozenxid by default, perhaps. But if it can never call it, then that is a fairly important bit of housekeeping that is unreachable except by autovacuum. No doubt the people who turn off autovacuum are benighted, but they're still out there. Could we get somewhere by saying that manual VACUUM calls vac_update_datfrozenxid only if it's a full-DB vacuum (ie, no table was specified)? That would fix the problem at hand. However, it'd mean (since v12) that a vacuumdb run never calls vac_update_datfrozenxid at all, which would result in horrible problems for any poor sods who think that a cronjob running "vacuumdb -a" is an adequate substitute for autovacuum. Or maybe we could modify things so that "autovacuum = off" doesn't prevent occasional cycles of vac_update_datfrozenxid-and-nothing-else? In the end I feel like a manual way to call vac_update_datfrozenxid would be the least magical way of running this. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Peter Geoghegan <pg@bowt.ie> — 2022-12-16T04:39:54Z
On Thu, Dec 15, 2022 at 1:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I could get behind manual VACUUM not invoking vac_update_datfrozenxid > by default, perhaps. But if it can never call it, then that is a > fairly important bit of housekeeping that is unreachable except by > autovacuum. No doubt the people who turn off autovacuum are benighted, > but they're still out there. I wouldn't mind adding another option for this to VACUUM. We already have a couple of VACUUM options that are only really needed as escape hatches, or perhaps as testing tools used by individual Postgres hackers. Another one doesn't seem too bad. The VACUUM command should eventually become totally niche, so I'm not too concerned about going overboard here. > Could we get somewhere by saying that manual VACUUM calls > vac_update_datfrozenxid only if it's a full-DB vacuum (ie, no table > was specified)? That would fix the problem at hand. That definitely seems reasonable. > Or maybe we could modify things so that "autovacuum = off" doesn't prevent > occasional cycles of vac_update_datfrozenxid-and-nothing-else? That's what I was thinking of. It seems like a more natural approach to me, at least offhand. I have to imagine that the vast majority of individual calls to vac_update_datfrozenxid have just about zero chance of updating datfrozenxid or datminmxid as things stand. There is bound to be some number of completely static tables in every database (maybe just system catalogs). Those static tables are bound to be the tables that hold back datfrozenxid/datminmxid approximately all the time. To me this suggests that vac_update_datfrozenxid should fully own the fact that it's supposed to be called out of band, possibly only in autovacuum. Separately, I wonder if it would make sense to invent a new fast-path for the VACUUM command that is designed to inexpensively determine that it cannot possibly matter if vac_update_datfrozenxid is never called, given the specifics (the details of the target rel and its TOAST rel). -- Peter Geoghegan
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-16T14:49:09Z
Peter Geoghegan <pg@bowt.ie> writes: > On Thu, Dec 15, 2022 at 1:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Or maybe we could modify things so that "autovacuum = off" doesn't prevent >> occasional cycles of vac_update_datfrozenxid-and-nothing-else? > That's what I was thinking of. It seems like a more natural approach > to me, at least offhand. Seems worth looking into. But I suppose the launch frequency would have to be more often than the current behavior for autovacuum = off, so it would complicate the logic in that area. > I have to imagine that the vast majority of individual calls to > vac_update_datfrozenxid have just about zero chance of updating > datfrozenxid or datminmxid as things stand. That is a really good point. How about teaching VACUUM to track the oldest original relfrozenxid and relminmxid among the table(s) it processed, and skip vac_update_datfrozenxid unless at least one of those matches the database's values? For extra credit, also skip if we didn't successfully advance the source rel's value. This might lead to a fix that solves the OP's problem while not changing anything fundamental, which would make it reasonable to back-patch. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Peter Geoghegan <pg@bowt.ie> — 2022-12-16T18:47:07Z
On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I have to imagine that the vast majority of individual calls to > > vac_update_datfrozenxid have just about zero chance of updating > > datfrozenxid or datminmxid as things stand. > > That is a really good point. How about teaching VACUUM to track > the oldest original relfrozenxid and relminmxid among the table(s) > it processed, and skip vac_update_datfrozenxid unless at least one > of those matches the database's values? For extra credit, also > skip if we didn't successfully advance the source rel's value. Hmm. I think that that would probably work. It would certainly work on 15+, because there tends to be "natural diversity" among the relfrozenxid values seen for each table, due to the "track oldest extant XID" work; we no longer see many tables that all have the same relfrozenxid, that advance in lockstep. But even that factor probably doesn't matter, since we only need one "laggard relfrozenxid" static table for the scheme to work and work well. That is probably a safe bet on all versions, though I'd have to check to be sure. > This might lead to a fix that solves the OP's problem while not > changing anything fundamental, which would make it reasonable > to back-patch. That's a big plus. This is a nasty regression. I wouldn't call it a must-fix, but it's bad enough to be worth fixing if we can come up with a reasonably non-invasive approach. -- Peter Geoghegan
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-16T20:33:33Z
Peter Geoghegan <pg@bowt.ie> writes: > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That is a really good point. How about teaching VACUUM to track >> the oldest original relfrozenxid and relminmxid among the table(s) >> it processed, and skip vac_update_datfrozenxid unless at least one >> of those matches the database's values? For extra credit, also >> skip if we didn't successfully advance the source rel's value. > Hmm. I think that that would probably work. > It would certainly work on 15+, because there tends to be "natural > diversity" among the relfrozenxid values seen for each table, due to > the "track oldest extant XID" work; we no longer see many tables that > all have the same relfrozenxid, that advance in lockstep. But even > that factor probably doesn't matter, since we only need one "laggard > relfrozenxid" static table for the scheme to work and work well. That > is probably a safe bet on all versions, though I'd have to check to be > sure. Oh, I see your point: if a whole lot of tables have the same relfrozenxid and it matches datfrozenxid, this won't help. Still, we can hope that that's an uncommon situation. If we postulate somebody trying to use scheduled "vacuumdb -z" in place of autovacuum, they shouldn't really have that situation. Successively vacuuming many tables should normally result in the tables' relfrozenxids not being all the same, unless they were unlucky enough to have a very long-running transaction holding back the global xmin horizon the whole time. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Michael Paquier <michael@paquier.xyz> — 2022-12-18T02:21:47Z
On Thu, Dec 15, 2022 at 01:56:30PM -0500, Tom Lane wrote: > * To fix vacuumdb properly, it might be enough to get it to > batch VACUUMs, say by naming up to 1000 tables per command > instead of just one. I'm not sure how that would interact > with its parallelization logic, though. It's not really > solving the O(N^2) issue either, just pushing it further out. I have been thinking about this part, and using a hardcoded rule for the batches would be tricky. The list of relations returned by the scan of pg_class are ordered by relpages, so depending on the distribution of the sizes (few tables with a large size and a lot of table with small sizes, exponential distribution of table sizes), we may finish with more downsides than upsides in some cases, even if we use a linear rule based on the number of relations, or even if we distribute the relations across the slots in a round robin fashion for example. In order to control all that, rather than a hardcoded rule, could it be as simple as introducing an option like vacuumdb --batch=N defaulting to 1 to let users control the number of relations grouped in a single command with a round robin distribution for each slot? -- Michael
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Christophe Pettus <xof@thebuild.com> — 2022-12-18T02:23:27Z
> In order to control all that, rather than a hardcoded rule, could it > be as simple as introducing an option like vacuumdb --batch=N > defaulting to 1 to let users control the number of relations grouped > in a single command with a round robin distribution for each slot? My first reaction to that is: Is it possible to explain to a DBA what N should be for a particular cluster?
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-18T23:55:00Z
On Thu, Dec 15, 2022 at 08:39:54PM -0800, Peter Geoghegan wrote: > On Thu, Dec 15, 2022 at 1:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I could get behind manual VACUUM not invoking vac_update_datfrozenxid >> by default, perhaps. But if it can never call it, then that is a >> fairly important bit of housekeeping that is unreachable except by >> autovacuum. No doubt the people who turn off autovacuum are benighted, >> but they're still out there. > > I wouldn't mind adding another option for this to VACUUM. We already > have a couple of VACUUM options that are only really needed as escape > hatches, or perhaps as testing tools used by individual Postgres > hackers. Another one doesn't seem too bad. The VACUUM command should > eventually become totally niche, so I'm not too concerned about going > overboard here. Perhaps there could also be an update-datfrozenxid function that vacuumdb calls when finished with a database. Even if vacuum becomes smarter about calling vac_update_datfrozenxid, this might still be worth doing. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Michael Paquier <michael@paquier.xyz> — 2022-12-19T03:21:26Z
On Sat, Dec 17, 2022 at 06:23:27PM -0800, Christophe Pettus wrote: > My first reaction to that is: Is it possible to explain to a DBA > what N should be for a particular cluster? Assuming that we can come up with a rather straight-forward still portable rule for the distribution of the relations across of the slots like something I mentioned above (which is not the best thing depending on the sizes and the number of tables), that would be quite tricky IMO. -- Michael
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-28T20:13:23Z
[ redirecting to -hackers because patch attached ] Peter Geoghegan <pg@bowt.ie> writes: > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That is a really good point. How about teaching VACUUM to track >> the oldest original relfrozenxid and relminmxid among the table(s) >> it processed, and skip vac_update_datfrozenxid unless at least one >> of those matches the database's values? For extra credit, also >> skip if we didn't successfully advance the source rel's value. > Hmm. I think that that would probably work. I poked into that idea some more and concluded that getting VACUUM to manage it behind the user's back is not going to work very reliably. The key problem is explained by this existing comment in autovacuum.c: * Even if we didn't vacuum anything, it may still be important to do * this, because one indirect effect of vac_update_datfrozenxid() is to * update ShmemVariableCache->xidVacLimit. That might need to be done * even if we haven't vacuumed anything, because relations with older * relfrozenxid values or other databases with older datfrozenxid values * might have been dropped, allowing xidVacLimit to advance. That is, if the table that's holding back datfrozenxid gets dropped between VACUUM runs, VACUUM would never think that it might have advanced the global minimum. I'm forced to the conclusion that we have to expose some VACUUM options if we want this to work well. Attached is a draft patch that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options (name bikeshedding welcome) and teaches vacuumdb to use them. Light testing says that this is a win: even on the regression database, which isn't all that big, I see a drop in vacuumdb's runtime from ~260 ms to ~175 ms. Of course this is a case where VACUUM doesn't really have anything to do, so it's a best-case scenario ... but still, I was expecting the effect to be barely above noise with this many tables, yet it's a good bit more. regards, tom lane -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-28T21:12:53Z
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > I'm forced to the conclusion that we have to expose some VACUUM > options if we want this to work well. Attached is a draft patch > that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options > (name bikeshedding welcome) and teaches vacuumdb to use them. This is the conclusion I arrived at, too. In fact, I was just about to post a similar patch set. I'm attaching it here anyway, but I'm fine with proceeding with your version. I think the main difference between your patch and mine is that I've exposed vac_update_datfrozenxid() via a function instead of a VACUUM option. IMHO that feels a little more natural, but I can't say I feel too strongly about it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-28T21:20:19Z
Nathan Bossart <nathandbossart@gmail.com> writes: > I think the main difference between your patch and mine is that I've > exposed vac_update_datfrozenxid() via a function instead of a VACUUM > option. IMHO that feels a little more natural, but I can't say I feel too > strongly about it. I thought about that but it seems fairly unsafe, because that means that vac_update_datfrozenxid is executing inside a user-controlled transaction. I don't think it will hurt us if the user does a ROLLBACK afterward --- but if he sits on the open transaction, that would be bad, if only because we're still holding the LockDatabaseFrozenIds lock which will block other VACUUMs. There might be more hazards besides that; certainly no one has ever tried to run vac_update_datfrozenxid that way before. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-28T21:21:50Z
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > + /* If we used SKIP_DATABASE_STATS, mop up with ONLY_DATABASE_STATS */ > + if (vacopts->skip_database_stats && stage == ANALYZE_NO_STAGE && !failed) > + { > + executeCommand(conn, "VACUUM (ONLY_DATABASE_STATS);", echo); > + } When I looked at this, I thought it would be better to send the command through the parallel slot machinery so that failures would use the same code path as the rest of the VACUUM commands. However, you also need to adjust ParallelSlotsWaitCompletion() to mark the slots as idle so that the slot array can be reused after it is called. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-28T21:23:21Z
On Wed, Dec 28, 2022 at 04:20:19PM -0500, Tom Lane wrote: > Nathan Bossart <nathandbossart@gmail.com> writes: >> I think the main difference between your patch and mine is that I've >> exposed vac_update_datfrozenxid() via a function instead of a VACUUM >> option. IMHO that feels a little more natural, but I can't say I feel too >> strongly about it. > > I thought about that but it seems fairly unsafe, because that means > that vac_update_datfrozenxid is executing inside a user-controlled > transaction. I don't think it will hurt us if the user does a > ROLLBACK afterward --- but if he sits on the open transaction, > that would be bad, if only because we're still holding the > LockDatabaseFrozenIds lock which will block other VACUUMs. > There might be more hazards besides that; certainly no one has ever > tried to run vac_update_datfrozenxid that way before. That's a good point. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-28T22:05:39Z
Nathan Bossart <nathandbossart@gmail.com> writes: > On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: >> + executeCommand(conn, "VACUUM (ONLY_DATABASE_STATS);", echo); > When I looked at this, I thought it would be better to send the command > through the parallel slot machinery so that failures would use the same > code path as the rest of the VACUUM commands. However, you also need to > adjust ParallelSlotsWaitCompletion() to mark the slots as idle so that the > slot array can be reused after it is called. Hm. I was just copying the way commands are issued further up in the same function. But I think you're right: once we've done ParallelSlotsAdoptConn(sa, conn); it's probably not entirely kosher to use the conn directly. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Justin Pryzby <pryzby@telsasoft.com> — 2022-12-29T01:23:54Z
On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: > Peter Geoghegan <pg@bowt.ie> writes: > > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> That is a really good point. How about teaching VACUUM to track > >> the oldest original relfrozenxid and relminmxid among the table(s) > >> it processed, and skip vac_update_datfrozenxid unless at least one > >> of those matches the database's values? For extra credit, also > >> skip if we didn't successfully advance the source rel's value. > > > Hmm. I think that that would probably work. > > I'm forced to the conclusion that we have to expose some VACUUM > options if we want this to work well. Attached is a draft patch > that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options > (name bikeshedding welcome) and teaches vacuumdb to use them. I was surprised to hear that this added *two* options. I assumed it would look like: VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) -- Justin -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Justin Pryzby <pryzby@telsasoft.com> — 2022-12-29T01:29:10Z
On Sun, Dec 18, 2022 at 11:21:47AM +0900, Michael Paquier wrote: > On Thu, Dec 15, 2022 at 01:56:30PM -0500, Tom Lane wrote: > > * To fix vacuumdb properly, it might be enough to get it to > > batch VACUUMs, say by naming up to 1000 tables per command > > instead of just one. I'm not sure how that would interact > > with its parallelization logic, though. It's not really > > solving the O(N^2) issue either, just pushing it further out. > > I have been thinking about this part, and using a hardcoded rule for > the batches would be tricky. The list of relations returned by the > scan of pg_class are ordered by relpages, so depending on the > distribution of the sizes (few tables with a large size and a lot of > table with small sizes, exponential distribution of table sizes), we > may finish with more downsides than upsides in some cases, even if we > use a linear rule based on the number of relations, or even if we > distribute the relations across the slots in a round robin fashion for > example. I've always found it weird that it uses "ORDER BY relpages". I'd prefer if it could ORDER BY age(relfrozenxid) or GREATEST(age(relfrozenxid), age(relminmxid)), at least if you specify one of the --min-*age parms. Or something less hardcoded and unconfigurable. -- Justin
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-29T02:17:54Z
Justin Pryzby <pryzby@telsasoft.com> writes: > On Wed, Dec 28, 2022 at 03:13:23PM -0500, Tom Lane wrote: >> I'm forced to the conclusion that we have to expose some VACUUM >> options if we want this to work well. Attached is a draft patch >> that invents SKIP_DATABASE_STATS and ONLY_DATABASE_STATS options >> (name bikeshedding welcome) and teaches vacuumdb to use them. > I assumed it would look like: > VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) Meh. We could do it like that, but I think options that look like booleans but aren't are messy. regards, tom lane -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Andres Freund <andres@anarazel.de> — 2022-12-29T03:03:29Z
Hi, On 2022-12-28 15:13:23 -0500, Tom Lane wrote: > [ redirecting to -hackers because patch attached ] > > Peter Geoghegan <pg@bowt.ie> writes: > > On Fri, Dec 16, 2022 at 6:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> That is a really good point. How about teaching VACUUM to track > >> the oldest original relfrozenxid and relminmxid among the table(s) > >> it processed, and skip vac_update_datfrozenxid unless at least one > >> of those matches the database's values? For extra credit, also > >> skip if we didn't successfully advance the source rel's value. > > > Hmm. I think that that would probably work. > > I poked into that idea some more and concluded that getting VACUUM to > manage it behind the user's back is not going to work very reliably. > The key problem is explained by this existing comment in autovacuum.c: > > * Even if we didn't vacuum anything, it may still be important to do > * this, because one indirect effect of vac_update_datfrozenxid() is to > * update ShmemVariableCache->xidVacLimit. That might need to be done > * even if we haven't vacuumed anything, because relations with older > * relfrozenxid values or other databases with older datfrozenxid values > * might have been dropped, allowing xidVacLimit to advance. > > That is, if the table that's holding back datfrozenxid gets dropped > between VACUUM runs, VACUUM would never think that it might have > advanced the global minimum. I wonder if a less aggressive version of this idea might still work. Perhaps we could use ShmemVariableCache->latestCompletedXid or ShmemVariableCache->nextXid to skip at least some updates? Obviously this isn't going to help if there's a lot of concurrent activity, but the case of just running vacuumdb -a might be substantially improved. Separately I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a bit. I e.g. see a noticable speedup bypassing systable_getnext() and using heap_getnext(). It's really too bad that we want to check for "in the future" xids, otherwise we could use a ScanKey to filter at a lower level. Greetings, Andres Freund
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-29T17:22:58Z
I wrote: > Justin Pryzby <pryzby@telsasoft.com> writes: >> I assumed it would look like: >> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) > Meh. We could do it like that, but I think options that look like > booleans but aren't are messy. Note that I'm not necessarily objecting to there being just one option, only to its values being a superset-of-boolean. Perhaps this'd work: VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) regards, tom lane -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-29T18:26:57Z
On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >> Justin Pryzby <pryzby@telsasoft.com> writes: >>> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) > VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a little better since it states the action like most of the other options, but I think both choices are sufficiently clear. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-29T18:45:49Z
Nathan Bossart <nathandbossart@gmail.com> writes: > +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a > little better since it states the action like most of the other options, > but I think both choices are sufficiently clear. Consistency of VACUUM's options seems like a lost cause already :-(. Between them, DISABLE_PAGE_SKIPPING, SKIP_LOCKED, and PROCESS_TOAST cover just about the entire set of possibilities for describing a boolean option --- we couldn't even manage to be consistent about whether ON or OFF is the default, let alone where the verb is. And it's hard to argue that FULL, VERBOSE, or PARALLEL is a verb. regards, tom lane
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-29T18:52:14Z
On Wed, Dec 28, 2022 at 07:03:29PM -0800, Andres Freund wrote: > Separately I wonder if it's worth micro-optimizing vac_update_datfrozenxid() a > bit. I e.g. see a noticable speedup bypassing systable_getnext() and using > heap_getnext(). It's really too bad that we want to check for "in the future" > xids, otherwise we could use a ScanKey to filter at a lower level. Another thing I'm exploring is looking up the datfrozenxid/datminmxid before starting the pg_class scan so that the scan can be stopped early if it sees that we cannot possibly advance the values. The overwrite-corrupt-values logic might make this a little more complicated, but I think it'd be sufficient to force the pg_class scan to complete if we ever see a value "in the future." Overwriting the corrupt value might be delayed, but it would eventually happen once the table ages advance. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
-
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2022-12-29T20:29:15Z
Nathan Bossart <nathandbossart@gmail.com> writes: > On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >> Justin Pryzby <pryzby@telsasoft.com> writes: >>> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) >>>> VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) > +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a > little better since it states the action like most of the other options, > but I think both choices are sufficiently clear. I tried to make a patch along these lines, and soon hit a stumbling block: ONLY is a fully-reserved SQL keyword. I don't think this syntax is attractive enough to justify requiring people to double-quote the option, so we are back to square one. Anybody have a different suggestion? regards, tom lane -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Nathan Bossart <nathandbossart@gmail.com> — 2022-12-29T21:37:19Z
On Thu, Dec 29, 2022 at 03:29:15PM -0500, Tom Lane wrote: > Nathan Bossart <nathandbossart@gmail.com> writes: >> On Thu, Dec 29, 2022 at 12:22:58PM -0500, Tom Lane wrote: >>> Justin Pryzby <pryzby@telsasoft.com> writes: >>>> VACUUM (UPDATE_DATABASE_STATS {yes,no,only}) >>>>> VACUUM (DATABASE_STATS {UPDATE,SKIP,ONLY}) > >> +1 for only introducing one option. IMHO UPDATE_DATABASE_STATS fits a >> little better since it states the action like most of the other options, >> but I think both choices are sufficiently clear. > > I tried to make a patch along these lines, and soon hit a stumbling > block: ONLY is a fully-reserved SQL keyword. I don't think this > syntax is attractive enough to justify requiring people to > double-quote the option, so we are back to square one. Anybody > have a different suggestion? Hm. I thought about using PreventInTransactionBlock() for the function, but that probably won't work for a few reasons. AFAICT we'd need to restrict it to only be callable via "SELECT update_database_stats()", which feels a bit unnatural. There was some discussion elsewhere [0] about adding a PROCESS_MAIN_RELATION option or expanding PROCESS_TOAST to simplify vacuuming the TOAST table directly. If such an option existed, you could call VACUUM (PROCESS_MAIN_RELATION FALSE, PROCESS_TOAST FALSE, UPDATE_DATABASE_STATES TRUE) pg_class; to achieve roughly what we need. I'll admit this is hacky, though. So, adding both SKIP_DATABASE_STATS and ONLY_DATABASE_STATS might be the best bet. [0] https://postgr.es/m/20221215191246.GA252861%40nathanxps13 -- Nathan Bossart Amazon Web Services: https://aws.amazon.com -
Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-06T17:53:07Z
Nathan Bossart <nathandbossart@gmail.com> writes: > On Thu, Dec 29, 2022 at 03:29:15PM -0500, Tom Lane wrote: >> I tried to make a patch along these lines, and soon hit a stumbling >> block: ONLY is a fully-reserved SQL keyword. I don't think this >> syntax is attractive enough to justify requiring people to >> double-quote the option, so we are back to square one. Anybody >> have a different suggestion? > ... adding both SKIP_DATABASE_STATS and ONLY_DATABASE_STATS might be the > best bet. Nobody has proposed a different bikeshed color, so I'm going to proceed with that syntax. I'll incorporate the parallel-machinery fix from your patch and push to HEAD only (since it's hard to argue this isn't a new feature). This needn't foreclose pursuing the various ideas about making vac_update_datfrozenxid faster; but none of those would eliminate the fundamental O(N^2) issue AFAICS. regards, tom lane