Re: Add 64-bit XIDs into PostgreSQL 15

Robert Haas <robertmhaas@gmail.com>

From: Robert Haas <robertmhaas@gmail.com>

To: Peter Geoghegan <pg@bowt.ie>

Cc: Fujii Masao <masao.fujii@oss.nttdata.com>, Maxim Orlov <orlovmg@gmail.com>, pgsql-hackers <pgsql-hackers@postgresql.org>

Date: 2022-01-07T21:12:43Z

Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →

Add SLRU tests for 64-bit page case
- a60b8a58f435 17.0 landed
Make use FullTransactionId in 2PC filenames
- 5a1dfde8334b 17.0 landed
Use larger segment file names for pg_notify
- 2cdf131c46e6 17.0 landed
Index SLRUs by 64-bit integers rather than by 32-bit integers
- 4ed8f0913bfd 17.0 landed

On Thu, Jan 6, 2022 at 3:45 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jan 4, 2022 at 9:40 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > Could you tell me what happens if new tuple with XID larger than xid_base + 0xFFFFFFFF is inserted into the page? Such new tuple is not allowed to be inserted into that page?
>
> I fear that this patch will have many bugs along these lines. Example:
> Why is it okay that convert_page() may have to defragment a heap page,
> without holding a cleanup lock? That will subtly break code that holds
> a pin on the buffer, when a tuple slot contains a C pointer to a
> HeapTuple in shared memory (though only if we get unlucky).

Yeah. I think it's possible that some approach along the lines of what
is proposed here can work, but quality of implementation is a big
issue. This stuff is not easy to get right. Another thing that I'm
wondering about is the "double xmax" representation. That requires
some way of distinguishing when that representation is in use. I'd be
curious to know where we found the bits for that -- the tuple header
isn't exactly replete with extra bit space.

Also, if we have an epoch of some sort that is included in new page
headers but not old ones, that adds branches to code that might
sometimes be quite hot. I don't know how much of a problem that is,
but it seems worth worrying about.

For all of that, I don't particularly agree with Jim Finnerty's idea
that we ought to solve the problem by forcing sufficient space to
exist in the page pre-upgrade. There are some advantages to such
approaches, but they make it really hard to roll out changes. You have
to roll out the enabling change first, wait until everyone is running
a release that supports it, and only then release the technology that
requires the additional page space. Since we don't put new features
into back-branches -- and an on-disk format change would be a poor
place to start -- that would make rolling something like this out take
many years. I think we'll be much happier putting all the complexity
in the new release.

-- 
Robert Haas
EDB: http://www.enterprisedb.com