Re: POC: make mxidoff 64 bits

Heikki Linnakangas <hlinnaka@iki.fi>

From: Heikki Linnakangas <hlinnaka@iki.fi>
To: Maxim Orlov <orlovmg@gmail.com>, wenhui qiu <qiuwenhuifx@gmail.com>
Cc: Alexander Korotkov <aekorotkov@gmail.com>, Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-04-01T18:25:16Z
Lists: pgsql-hackers

Commits

  1. Fix partial read handling in pg_upgrade's multixact conversion

  2. Increase timeout in multixid_conversion upgrade test

  3. Improve sanity checks on multixid members length

  4. Clarify comment on multixid offset wraparound check

  5. Never store 0 as the nextMXact

  6. Add runtime checks for bogus multixact offsets

  7. Widen MultiXactOffset to 64 bits

  8. Move pg_multixact SLRU page format definitions to a separate header

  9. Convert confusing macros in multixact.c to static inline functions

  10. Index SLRUs by 64-bit integers rather than by 32-bit integers

  11. Cope with possible failure of the oldest MultiXact to exist.

On 07/03/2025 13:30, Maxim Orlov wrote:
> Here is a rebase, v14.

Thanks! I did some manual testing of this. I created a little helper 
function to consume multixids, to test the autovacuum behavior, and 
found one issue:

If you consume a lot of multixid members space, by creating lots of
multixids with a huge number of members in each, you can end up with a
very bloated members SLRU, and autovacuum is in no hurry to clean it up.
Here's what I did:

1. Installed attached test module
2. Ran "select consume_multixids(10000, 100000);" many times
3. Ran:

$ du -h data/pg_multixact/members/
26G	data/pg_multixact/members/

When I run "vacuum freeze; select * from pg_database;", I can see that 
'datminmxid' for the current database is advanced. However, autovacuum 
is in no hurry to vacuum 'template0' and 'template1', so 
pg_multixact/members/ does not get truncated. Eventually, when 
autovacuum_multixact_freeze_max_age is reached, it presumably will, but 
you will run out of disk space before that.

There is this check for members size at the end of SetOffsetVacuumLimit():

> 
> 	/*
> 	 * Do we need autovacuum?	If we're not sure, assume yes.
> 	 */
> 	return !oldestOffsetKnown ||
> 		(nextOffset - oldestOffset > MULTIXACT_MEMBER_AUTOVAC_THRESHOLD);

And the caller (SetMultiXactIdLimit()) will in fact signal the
autovacuum launcher after "vacuum freeze" because of that. But the
autovacuum launcher will look at the datminmxid / relminmxid values, see
that they are well within autovacuum_multixact_freeze_max_age, and do
nothing.

This is a very extreme case, but clearly the code to signal the
autovacuum launcher, and the freeze age cutoff that autovacuum then
uses, are not in sync.
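
To make the mismatch concrete, here is a minimal standalone sketch of the
two decisions side by side. It is not the actual PostgreSQL code: the
function names, the SKETCH_* constants and their values are hypothetical
stand-ins, and the launcher side is reduced to a plain age comparison.
The first function mirrors the members-size check quoted above; the
second looks only at how many multixids have been consumed since
datminmxid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in values, not the real PostgreSQL constants. */
#define SKETCH_MEMBER_AUTOVAC_THRESHOLD UINT64_C(2000000000)
#define SKETCH_MXID_FREEZE_MAX_AGE      UINT32_C(400000000)

/*
 * Signalling side: mirrors the check at the end of SetOffsetVacuumLimit().
 * It depends only on how much member space is in use.
 */
static bool
sketch_should_signal_launcher(bool oldest_offset_known,
                              uint64_t next_offset, uint64_t oldest_offset)
{
    /* Assumption: 64-bit offsets never wrap, so next is never behind oldest. */
    assert(!oldest_offset_known || next_offset >= oldest_offset);

    return !oldest_offset_known ||
        (next_offset - oldest_offset > SKETCH_MEMBER_AUTOVAC_THRESHOLD);
}

/*
 * Launcher side (simplified model): a database is force-vacuumed only when
 * its datminmxid is older than the freeze max age. Member space usage plays
 * no role here.
 */
static bool
sketch_launcher_forces_vacuum(uint32_t next_mxid, uint32_t datminmxid)
{
    return (uint32_t) (next_mxid - datminmxid) > SKETCH_MXID_FREEZE_MAX_AGE;
}
```

With a few multixids that each have a huge number of members, the first
function returns true while the second returns false: the launcher is
woken up, finds nothing old enough, and goes back to sleep.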

This patch removed MultiXactMemberFreezeThreshold(), per my suggestion,
but we threw the baby out with the bathwater. We discussed that in this
thread, but didn't come up with any solution. ISTM we still need
something like MultiXactMemberFreezeThreshold() to trigger autovacuum
freezing if the members have grown too large.
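
As a rough illustration of what "something like" could mean, here is a
minimal standalone sketch that lowers the effective freeze age as member
usage grows. It is loosely modeled on the idea behind the removed
function, not a proposal for the patch: the function name, the two
SKETCH_* thresholds and their values are all hypothetical, and where the
thresholds should sit with 64-bit offsets is exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical thresholds on the number of live members; not from the patch. */
#define SKETCH_MEMBER_SAFE_THRESHOLD   UINT64_C(2000000000)
#define SKETCH_MEMBER_DANGER_THRESHOLD UINT64_C(4000000000)

/*
 * Return the multixid freeze age autovacuum should use, given the number of
 * live multixacts, the number of live members, and the configured
 * autovacuum_multixact_freeze_max_age.
 */
static uint32_t
sketch_member_freeze_threshold(uint32_t multixacts, uint64_t members,
                               uint32_t freeze_max_age)
{
    double      fraction;
    uint32_t    victims;
    uint32_t    result;

    assert(SKETCH_MEMBER_DANGER_THRESHOLD > SKETCH_MEMBER_SAFE_THRESHOLD);

    /* Low member usage: keep the configured behaviour. */
    if (members <= SKETCH_MEMBER_SAFE_THRESHOLD)
        return freeze_max_age;

    /* How far past the safe threshold are we, relative to the danger one? */
    fraction = (double) (members - SKETCH_MEMBER_SAFE_THRESHOLD) /
        (double) (SKETCH_MEMBER_DANGER_THRESHOLD - SKETCH_MEMBER_SAFE_THRESHOLD);

    /* At or past the danger threshold: freeze as aggressively as possible. */
    if (fraction >= 1.0)
        return 0;

    /* Try to get rid of a proportional share of the live multixacts. */
    victims = (uint32_t) (multixacts * fraction);
    result = multixacts - victims;

    /* Never be less aggressive than the configured setting. */
    return result < freeze_max_age ? result : freeze_max_age;
}
```

In the scenario above, with few multixids but an enormous number of
members, this returns 0, so every database including 'template0' and
'template1' would qualify for a freezing autovacuum right away.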

-- 
Heikki Linnakangas
Neon (https://neon.tech)