Re: Add 64-bit XIDs into PostgreSQL 15
From: Yura Sokolov <y.sokolov@postgrespro.ru>
To: Evgeny Voropaev <evorop.wiki@gmail.com>, Maxim Orlov <orlovmg@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Date: 2025-06-11T13:12:39Z
Lists: pgsql-hackers
Commits
- Add SLRU tests for 64-bit page case (a60b8a58f435, 17.0, landed)
- Make use FullTransactionId in 2PC filenames (5a1dfde8334b, 17.0, landed)
- Use larger segment file names for pg_notify (2cdf131c46e6, 17.0, landed)
- Index SLRUs by 64-bit integers rather than by 32-bit integers (4ed8f0913bfd, 17.0, landed)

11.06.2025 09:00, Evgeny Voropaev wrote:
> 2) About repairing fragmentation.
>
> The original approach implemented in PG18 assumes that fragmentation
> occurs during every `prune_freeze` operation. It happens because the
> logic of the "redo"-function `heap_xlog_prune_freeze` assumes that
> fragmentation has to be done by `heap_page_prune_execute`.
> Attempting to
> omit fragmentation can result in page inconsistencies on the "redo"-side
> (i.e. on a secondary node, or during the recovery process on primary
> one).

No! The patch uses a flag in the WAL record to instruct the "redo" side to skip fragmentation repair as well, if needed.

> So, implementation of optional repairing of fragmentation
> conflicts with the basic assumption about "necessity of fragmentation".
> In order to prevent inconsistency xid64v62 patch invokes
> `heap_page_prune_and_freeze` with `repairFragmentation` equal to true
> from everywhere in the patch code except from
> `heap_page_prepare_for_xid` which uses `repairFragmentation=false`.
>
> So, why must we perform a `heap_page_prune_execute` without a
> fragmentation during the preparation of a page for xid?
>
> What exactly would break if we did invoke `heap_page_prune_execute` with
> `repairFragmentation=true` during performing of `heap_page_prepare_for_xid`?

Short answer:
- The `repairFragmentation` parameter was added after investigating real production issues with earlier patch versions.

Long answer:

How does SELECT work with tuples on a page? It:
- PINS the page
- takes the CONTENT LOCK in SHARED mode
- collects HeapTuples, which POINT INTO THE RAW PAGE via t_data.t_choice.t_heap
- RELEASES the content lock
- may use those HeapTuples for an indefinitely long time, relying only on the PIN of the page.

I.e. SELECT relies on the fact that, while a page is pinned, tuples on the page stay at the same positions in memory.
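
The reader-side protocol above can be sketched as a single-threaded toy model. Every name here (`ToyBuffer`, `toy_*`) is hypothetical and none of this is PostgreSQL's buffer manager code. It only illustrates the rule: a reader keeps a raw pointer into the page after dropping the content lock, so moving tuples is safe only for a backend that holds the sole pin.

```c
/* Toy, single-threaded model of the pin / content-lock rule.
 * All names are hypothetical; this is NOT PostgreSQL code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

typedef struct ToyBuffer
{
    int  pin_count;        /* how many backends currently pin the page */
    int  share_locks;      /* holders of the content lock in shared mode */
    bool exclusive_lock;   /* content lock held in exclusive mode */
    char page[TOY_PAGE_SIZE];
} ToyBuffer;

static void toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

static void toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
}

static bool toy_lock_shared(ToyBuffer *buf)
{
    if (buf->exclusive_lock)
        return false;      /* a real backend would wait here */
    buf->share_locks++;
    return true;
}

static void toy_unlock_shared(ToyBuffer *buf)
{
    assert(buf->share_locks > 0);
    buf->share_locks--;
}

/* Reader path: pin, lock shared, remember a raw pointer into the page,
 * unlock. The pin is intentionally kept: it is the only thing that
 * keeps the returned pointer valid afterwards. */
static const char *toy_reader_fetch(ToyBuffer *buf, size_t offset)
{
    const char *tuple;

    toy_pin(buf);
    if (!toy_lock_shared(buf))
    {
        toy_unpin(buf);
        return NULL;
    }
    tuple = buf->page + offset;
    toy_unlock_shared(buf);
    return tuple;
}

/* Simplified cleanup-lock check: the caller must already hold a pin,
 * and moving tuples is allowed only if that pin is the only one. */
static bool toy_conditional_lock_for_cleanup(ToyBuffer *buf)
{
    if (buf->pin_count != 1 || buf->share_locks != 0 || buf->exclusive_lock)
        return false;
    buf->exclusive_lock = true;
    return true;
}
```

In this model a would-be cleaner that pins the page while a reader still holds its pin gets `false` from `toy_conditional_lock_for_cleanup`, and succeeds only after the reader calls `toy_unpin`.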

That is why LockBufferForCleanup and ConditionalLockBufferForCleanup check that there is only a single PIN on the page: only the backend that will perform the cleanup is allowed to have the page PINNED.

UPDATE/INSERT/DELETE take the CONTENT LOCK in EXCLUSIVE mode because they may add new tuples. But they are not allowed to move tuples, because concurrent backends are allowed to read tuples from the page at exactly the same moment.

--
regards
Yura Sokolov aka funny-falcon