Re: POC: make mxidoff 64 bits
Heikki Linnakangas <hlinnaka@iki.fi>
From: Heikki Linnakangas <hlinnaka@iki.fi>
To: Maxim Orlov <orlovmg@gmail.com>
Cc: Alvaro Herrera <alvherre@alvh.no-ip.org>,
Alexander Korotkov <aekorotkov@gmail.com>, wenhui qiu
<qiuwenhuifx@gmail.com>,
Postgres hackers <pgsql-hackers@lists.postgresql.org>,
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Date: 2025-12-04T10:39:43Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Fix partial read handling in pg_upgrade's multixact conversion
- ac94ce8194e5 19 (unreleased) landed
-
Increase timeout in multixid_conversion upgrade test
- bd43940b02b2 19 (unreleased) landed
-
Improve sanity checks on multixid members length
- ecb553ae8211 19 (unreleased) landed
-
Clarify comment on multixid offset wraparound check
- 170361d7b869 14.21 landed
- b0b52b7123ae 15.16 landed
- 7d42e2367c6b 16.12 landed
- cd1a887fe9bf 17.8 landed
- 3fbad030a24d 18.2 landed
- 366dcdaf5779 19 (unreleased) landed
-
Never store 0 as the nextMXact
- 87a350e1f284 19 (unreleased) landed
-
Add runtime checks for bogus multixact offsets
- d4b7bde4183b 19 (unreleased) landed
-
Widen MultiXactOffset to 64 bits
- bd8d9c9bdfa0 19 (unreleased) landed
-
Move pg_multixact SLRU page format definitions to a separate header
- bb3b1c4f6462 19 (unreleased) landed
-
Convert confusing macros in multixact.c to static inline functions
- 0099b9408e8c 17.0 landed
-
Index SLRUs by 64-bit integers rather than by 32-bit integers
- 4ed8f0913bfd 17.0 cited
-
Cope with possible failure of the oldest MultiXact to exist.
- b6a3444fa635 9.4.4 cited
On 26/11/2025 17:50, Heikki Linnakangas wrote: > On 26/11/2025 17:23, Maxim Orlov wrote: >> On Tue, 25 Nov 2025 at 13:07, Heikki Linnakangas <hlinnaka@iki.fi >> <mailto:hlinnaka@iki.fi>> wrote: >>> GetOldMultiXactIdSingleMember() currently asserts that the offset is >>> never zero, but it should try to do something sensible in that case >>> instead of just failing. >> >> Correct me if I'm wrong, but we added the assertion that offsets are >> never 0, based on the idea that case #2 will never take place during an >> update. If this isn't the case, this assertion could be removed. >> The rest of the function appears to work correctly. >> >> I even think that, as an experiment, we could randomly reset some of the >> offsets to zero and nothing would happen, except that some data would >> be lost. > > +1 > >> The most sensible thing we can do is give the user a warning, right? >> Something like, "During the update, we encountered some weird offset >> that shouldn't have been there, but there's nothing we can do about it, >> just take note." > > Yep, makes sense. I read through the SLRU reading codepath, looking for all the things that could go wrong (not sure I got them all): 1. An SLRU file does not exist 2. An SLRU file is too short, i.e. a page does not exist 3. The offset in 'offsets' page is 0 4. The offset in 'offsets' page looks invalid, i.e. it's greater than nextOffset or smaller than oldestOffset. 5. The offset is out of order compared to its neighbors 6. The multixid has no members 7. The multixid has an invalid (0) member 8. A multixid has more than one updating member Some of those situations are theoretically are possible if there was a crash. We don't follow the WAL-before-data rule for these SLRUs. Instead, we piggyback on the WAL-before-data of the heap page that would reference the multixid. In other words, we rely on the fact that if a multixid write is missed or torn because of a crash, that multixid will not be referenced from anywhwere and will never be read. However, that doesn't hold for pg_upgrade. pg_upgrade will try to read all the multixids. So we need to make the multixact reading code tolerant of the situations that could be present after a crash. I think the right philosophy here is that we try to read all the old multixids, and do our best to interpret them the same way that the old server would. For those situations that can legitimately be present if the old server crashed at some point, be silent. For cases that should not happen, even if there was a crash, print a warning. For example, I think an SLRU file should never be missing (1) or truncated (2). But the zero offset (3), and (6) can happen. Perhaps we should check that all the files exist and have the correct sizes in the pre-check stage, and abort the upgrade early if anything is missing. That would be pretty cheap to check. - Heikki