Re: POC: make mxidoff 64 bits
From: Heikki Linnakangas <hlinnaka@iki.fi>
To: Maxim Orlov <orlovmg@gmail.com>
Cc: wenhui qiu <qiuwenhuifx@gmail.com>,
Alexander Korotkov <aekorotkov@gmail.com>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>,
Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-10-30T09:10:43Z
Lists: pgsql-hackers
Commits
- Fix partial read handling in pg_upgrade's multixact conversion
  - ac94ce8194e5 19 (unreleased) landed
- Increase timeout in multixid_conversion upgrade test
  - bd43940b02b2 19 (unreleased) landed
- Improve sanity checks on multixid members length
  - ecb553ae8211 19 (unreleased) landed
- Clarify comment on multixid offset wraparound check
  - 170361d7b869 14.21 landed
  - b0b52b7123ae 15.16 landed
  - 7d42e2367c6b 16.12 landed
  - cd1a887fe9bf 17.8 landed
  - 3fbad030a24d 18.2 landed
  - 366dcdaf5779 19 (unreleased) landed
- Never store 0 as the nextMXact
  - 87a350e1f284 19 (unreleased) landed
- Add runtime checks for bogus multixact offsets
  - d4b7bde4183b 19 (unreleased) landed
- Widen MultiXactOffset to 64 bits
  - bd8d9c9bdfa0 19 (unreleased) landed
- Move pg_multixact SLRU page format definitions to a separate header
  - bb3b1c4f6462 19 (unreleased) landed
- Convert confusing macros in multixact.c to static inline functions
  - 0099b9408e8c 17.0 landed
- Index SLRUs by 64-bit integers rather than by 32-bit integers
  - 4ed8f0913bfd 17.0 cited
- Cope with possible failure of the oldest MultiXact to exist.
  - b6a3444fa635 9.4.4 cited

On 30/10/2025 08:13, Maxim Orlov wrote:
> On Tue, 28 Oct 2025 at 17:17, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
>     On 27/10/2025 17:54, Maxim Orlov wrote:
>
>     If backend C looks up multixid 101 in between steps 3 and 4, it would
>     read the offset incorrectly, because 'base' isn't set yet.
>
> Hmm, maybe I miss something? We set page base on first write of any
> offset on the page, not only the first one. In other words, there
> should never be a case when we read an offset without a previously
> defined page base. Correct me if I'm wrong:
> 1. Backend A assigned mxact=100, offset=1000.
> 2. Backend B assigned mxact=101, offset=1010.
> 3. Backend B calls RecordNewMultiXact()/MXOffsetWrite() and
>    set page base=1010, offset plus 0^0x80000000 bit while
>    holding lock on the page.
> 4. Backend C looks up for the mxact=101 by calling MXOffsetRead()
>    and should get exactly what he's looking for:
>    base (1010) + offset (0) minus 0x80000000 bit.
> 5. Backend A calls RecordNewMultiXact() and sets his offset using
>    existing base from step 3.

Oh I see, the 'base' is not necessarily the base offset of the first
multixact on the page, it's the base offset of the first multixid that
is written to the page. And the (short) offsets can be negative. That's
a frighteningly clever encoding scheme. One upshot of that is that WAL
redo might construct the page with a different 'base'. I guess that
works, but it scares me. Could we come up with a more deterministic scheme?

- Heikki
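
To make the concern concrete, here is a minimal sketch of a per-page base encoding of the kind discussed above. It is not the code from the patch: the SketchPage struct, the function names (sketch_write, sketch_read, sketch_orders_agree), the four-entry page and the 31-bit signed delta layout are all assumptions made for illustration. The only details taken from the thread are that the base is set by whichever offset is written to the page first, that the short offsets can be negative, and that the 0x80000000 bit marks a slot as written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_PAGE 4                  /* tiny, for illustration only */
#define SET_BIT    UINT32_C(0x80000000)     /* slot has been written */
#define DELTA_MASK UINT32_C(0x7FFFFFFF)     /* 31-bit signed delta (assumed) */

/* Hypothetical page layout: one 64-bit base plus 32-bit short offsets. */
typedef struct SketchPage
{
    bool        base_valid;
    uint64_t    base;
    uint32_t    slots[ENTRIES_PER_PAGE];
} SketchPage;

/* Store 'offset' in 'slot'; the first write to the page defines the base. */
static void
sketch_write(SketchPage *page, int slot, uint64_t offset)
{
    int64_t     delta;

    if (!page->base_valid)
    {
        page->base = offset;
        page->base_valid = true;
    }
    delta = (int64_t) (offset - page->base);
    /* The delta must fit in 31 signed bits in this sketch. */
    assert(delta >= -(INT64_C(1) << 30) && delta < (INT64_C(1) << 30));
    page->slots[slot] = ((uint32_t) delta & DELTA_MASK) | SET_BIT;
}

/* Read 'slot'; returns false if the slot was never written. */
static bool
sketch_read(const SketchPage *page, int slot, uint64_t *offset)
{
    uint32_t    raw = page->slots[slot];
    int64_t     delta;

    if ((raw & SET_BIT) == 0)
        return false;
    delta = (int64_t) (raw & DELTA_MASK);
    if (delta & (INT64_C(1) << 30))         /* sign-extend from bit 30 */
        delta -= (INT64_C(1) << 31);
    *offset = page->base + (uint64_t) delta;
    return true;
}

/*
 * Write the same two offsets in two different orders, as original
 * execution and WAL redo might. The decoded values agree, but the
 * stored base (and hence the page bytes) do not.
 */
static bool
sketch_orders_agree(void)
{
    SketchPage  live, redo;
    uint64_t    a, b;
    bool        same_values;

    memset(&live, 0, sizeof(live));
    memset(&redo, 0, sizeof(redo));

    sketch_write(&live, 1, 1010);           /* backend B gets there first */
    sketch_write(&live, 0, 1000);           /* backend A: delta is -10 */

    sketch_write(&redo, 0, 1000);           /* replay in the other order */
    sketch_write(&redo, 1, 1010);

    same_values = sketch_read(&live, 0, &a) && sketch_read(&redo, 0, &b) && a == b;
    same_values = same_values &&
        sketch_read(&live, 1, &a) && sketch_read(&redo, 1, &b) && a == b;
    return same_values && live.base != redo.base;
}
```

In this sketch both write orders decode to the same offsets, which is the "I guess that works" part, while the stored base differs between the two pages, which is the part that makes the scheme non-deterministic at the byte level.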