Re: POC: make mxidoff 64 bits
From: Heikki Linnakangas <hlinnaka@iki.fi>
To: Maxim Orlov <orlovmg@gmail.com>
Cc: wenhui qiu <qiuwenhuifx@gmail.com>,
Alexander Korotkov <aekorotkov@gmail.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>,
Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-11-12T13:00:02Z
Lists: pgsql-hackers
Commits
- Fix partial read handling in pg_upgrade's multixact conversion
  - ac94ce8194e5 19 (unreleased) landed
- Increase timeout in multixid_conversion upgrade test
  - bd43940b02b2 19 (unreleased) landed
- Improve sanity checks on multixid members length
  - ecb553ae8211 19 (unreleased) landed
- Clarify comment on multixid offset wraparound check
  - 170361d7b869 14.21 landed
  - b0b52b7123ae 15.16 landed
  - 7d42e2367c6b 16.12 landed
  - cd1a887fe9bf 17.8 landed
  - 3fbad030a24d 18.2 landed
  - 366dcdaf5779 19 (unreleased) landed
- Never store 0 as the nextMXact
  - 87a350e1f284 19 (unreleased) landed
- Add runtime checks for bogus multixact offsets
  - d4b7bde4183b 19 (unreleased) landed
- Widen MultiXactOffset to 64 bits
  - bd8d9c9bdfa0 19 (unreleased) landed
- Move pg_multixact SLRU page format definitions to a separate header
  - bb3b1c4f6462 19 (unreleased) landed
- Convert confusing macros in multixact.c to static inline functions
  - 0099b9408e8c 17.0 landed
- Index SLRUs by 64-bit integers rather than by 32-bit integers
  - 4ed8f0913bfd 17.0 cited
- Cope with possible failure of the oldest MultiXact to exist.
  - b6a3444fa635 9.4.4 cited
Attachments
- consume-mxids.patch.txt (text/plain)
- v24-0001-Move-pg_multixact-SLRU-page-format-definitions-t.patch (text/x-patch) patch v24-0001
- v24-0002-Use-64-bit-multixact-offsets.patch (text/x-patch) patch v24-0002
- v24-0003-Add-pg_upgrade-for-64-bit-multixact-offsets.patch (text/x-patch) patch v24-0003
- v24-0004-Remove-oldestOffset-oldestOffsetKnown-from-multi.patch (text/x-patch) patch v24-0004
- v24-0005-TEST-bump-catversion.patch (text/x-patch) patch v24-0005
- v24-0006-TEST-Add-test-for-64-bit-mxoff-in-pg_resetwal.patch (text/x-patch) patch v24-0006
- v24-0007-TEST-Add-test-for-wraparound-of-next-new-multi-i.patch (text/x-patch) patch v24-0007
- v24-0008-TEST-Add-test-for-64-bit-mxoff-in-pg_upgrade.patch (text/x-patch) patch v24-0008
On 07/11/2025 18:03, Maxim Orlov wrote:
> I tried finding out how long it would take to convert a big number of
> segments. Unfortunately, I only have access to a very old machine right
> now. It took me 7 hours to generate this much data on my old
> Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 16 Gb of RAM.
>
> Here are my rough measurements:
>
> HDD
> $ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ time pg_upgrade
> ...
> real 4m59.459s
> user 0m19.974s
> sys 0m13.640s
>
> SSD
> $ sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
> $ time pg_upgrade
> ...
> real 4m52.958s
> user 0m19.826s
> sys 0m13.624s
>
> I aim to get access to more modern stuff and check it all out there.

Thanks, I also did some perf testing on my laptop. I wrote a little
helper function to consume multixids, and used it to create a v17
cluster with 100 million multixids. See attached
consume-mxids.patch.txt. I then ran pg_upgrade on that, and measured
how long the pg_multixact conversion part of pg_upgrade took. It took
about 1.2 s on my laptop. Extrapolating from that, converting 1 billion
multixids would take 12 s. These were very simple multixacts with just
one member each, though; realistic multixacts with more members would
presumably take a little longer.

In any case, I think we're in an acceptable ballpark here.

There's some very low-hanging fruit though: Profiling with 'linux-perf'
suggested that a lot of CPU time was spent simply on the function call
overhead of GetOldMultiXactIdSingleMember, SlruReadSwitchPage,
RecordNewMultiXact, SlruWriteSwitchPage for each multixact. I added an
inlined fast path to SlruReadSwitchPage and SlruWriteSwitchPage to
eliminate the function call overhead of those in the common case that
no page switch is needed. With that, the 100 million mxid test case I
used went from 1.2 s to 0.9 s. We could optimize this further but I
think this is good enough.

Some other changes since patch set v23:

- Rebased. I committed the wraparound bug fixes.

- I added an SlruFileName() helper function to slru_io.c, and support
  for reading SLRUs with long_segment_names==true. It's not needed
  currently, but it seemed like a weird omission. AllocSlruRead()
  actually left 'long_segment_names' uninitialized which is
  error-prone. We could've just documented it, but it seems just as
  easy to support it.

- I split the multixact_internal.h header in a separate commit, to make
  it more clear what changes are related to 64-bit offsets.

I kept all the new test cases for now. We need to decide which ones are
worth keeping, and polish and speed up the ones we decide to keep.

I'm getting one failure from the pg_upgrade/008_mxoff test:

> [14:43:38.422](0.530s) not ok 26 - dump outputs from original and restored regression databases match
> [14:43:38.422](0.000s) # Failed test 'dump outputs from original and restored regression databases match'
> # at /home/heikki/git-sandbox/postgresql/src/test/perl/PostgreSQL/Test/Utils.pm line 801.
> [14:43:38.422](0.000s) # got: '1'
> # expected: '0'
> === diff of /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/oldnode_6_dump.sql_adjusted and /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/newnode_6_dump.sql_adjusted
> === stdout ===
> --- /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/oldnode_6_dump.sql_adjusted 2025-11-12 14:43:38.030399957 +0200
> +++ /home/heikki/git-sandbox/postgresql/build/testrun/pg_upgrade/008_mxoff/data/tmp_test_AC6A/newnode_6_dump.sql_adjusted 2025-11-12 14:43:38.314399819 +0200
> @@ -2,8 +2,8 @@
>  -- PostgreSQL database dump
>  --
>  \restrict test
> --- Dumped from database version 17.6
> --- Dumped by pg_dump version 17.6
> +-- Dumped from database version 19devel
> +-- Dumped by pg_dump version 19devel
>  SET statement_timeout = 0;
>  SET lock_timeout = 0;
>  SET idle_in_transaction_session_timeout = 0;=== stderr ===
> === EOF ===
> [14:43:38.425](0.004s) # >>> case #6

I ran the test with:

(rm -rf build/testrun/ build/tmp_install/;
 olddump=/tmp/olddump-regress.sql
 oldinstall=/home/heikki/pgsql.17stable/
 meson test -C build --suite setup --suite pg_upgrade)

- Heikki
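
For readers who have not opened the patch, the inlined fast path
mentioned above for SlruReadSwitchPage can be illustrated roughly as
follows. This is only a sketch of the general technique, not the code
in the patch: the DemoSlruReader struct, the demo_* function names, and
the in-memory "storage" standing in for a segment file are all
hypothetical. The idea is that the page-number check is inlined into
the caller, and the out-of-line function is called only when a
different page is actually needed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8192
#define DEMO_NUM_PAGES 4

/* Hypothetical stand-in for an SLRU reader; not the struct in the patch. */
typedef struct DemoSlruReader
{
    const char *storage;     /* pretend segment file, DEMO_NUM_PAGES pages */
    int64_t     cur_pageno;  /* page currently held in buf, or -1 if none */
    int         slow_calls;  /* counts slow-path invocations, for illustration */
    char        buf[DEMO_PAGE_SIZE];
} DemoSlruReader;

static void
demo_reader_init(DemoSlruReader *r, const char *storage)
{
    r->storage = storage;
    r->cur_pageno = -1;
    r->slow_calls = 0;
}

/* Slow path: kept out of line, does the actual "I/O" to load a new page. */
static char *
demo_read_switch_page_slow(DemoSlruReader *r, int64_t pageno)
{
    assert(pageno >= 0 && pageno < DEMO_NUM_PAGES);
    memcpy(r->buf, r->storage + pageno * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
    r->cur_pageno = pageno;
    r->slow_calls++;
    return r->buf;
}

/*
 * Fast path: the comparison is inlined at the call site, so the common
 * case (same page as last time) costs one branch and no function call.
 */
static inline char *
demo_read_switch_page(DemoSlruReader *r, int64_t pageno)
{
    if (r->cur_pageno == pageno)
        return r->buf;
    return demo_read_switch_page_slow(r, pageno);
}
```

A caller that looks up consecutive multixids mostly stays on the same
page, so in this sketch nearly every call would return from the inlined
check without reaching demo_read_switch_page_slow().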