Re: Add 64-bit XIDs into PostgreSQL 15

Chris Travers <chris.travers@gmail.com>

From: Chris Travers <chris.travers@gmail.com>
To: pgsql-hackers@lists.postgresql.org
Cc: Fedor Sigaev <teodor@sigaev.ru>, Alexander Korotkov <aekorotkov@gmail.com>, Aleksander Alekseev <afiskon@gmail.com>, Konstantin Knizhnik <knizhnik@garret.ru>, Nikita Glukhov <n.gluhov@postgrespro.ru>, Yura Sokolov <y.sokolov@postgrespro.ru>, Maxim Orlov <orlovmg@gmail.com>, Pavel Borisov <pashkin.elfe@gmail.com>
Date: 2022-11-21T07:58:00Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Add SLRU tests for 64-bit page case

  2. Make use FullTransactionId in 2PC filenames

  3. Use larger segment file names for pg_notify

  4. Index SLRUs by 64-bit integers rather than by 32-bit integers

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

I have a very serious concern about the current patch set. as someone who has faced transaction id wraparound in the past.

I can start by saying I think it would be helpful (if the other issues are approached reasonably) to have 64-bit xids, but there is an important piece of context in reventing xid wraparounds that seems missing from this patch unless I missed something.

XID wraparound is a symptom, not an underlying problem.  It usually occurs when autovacuum or other vacuum strategies have unexpected stalls and therefore fail to work as expected.  Shifting to 64-bit XIDs dramatically changes the sorts of problems that these stalls are likely to pose to operational teams.  -- you can find you are running out of storage rather than facing an imminent database shutdown.  Worse, this patch delays the problem until some (possibly far later!) time, when vacuum will take far longer to finish, and options for resolving the problem are diminished.  As a result I am concerned that merely changing xids from 32-bit to 64-bit will lead to a smaller number of far more serious outages.

What would make a big difference from my perspective would be to combine this with an inverse system for warning that there is a problem, allowing the administrator to throw warnings about xids since last vacuum, with a configurable threshold.  We could have this at two billion by default as that would pose operational warnings not much later than we have now.

Otherwise I can imagine cases where instead of 30 hours to vacuum a table, it takes 300 hours on a database that is short on space.  And I would not want to be facing such a situation.

The new status of this patch is: Waiting on Author