Thread
-
BUG #19109: catalog lookup with the wrong snapshot during logical decoding causes coredump
PG Bug reporting form <noreply@postgresql.org> — 2025-11-10T15:22:11Z
The following bug has been logged on the website: Bug reference: 19109 Logged by: haiyang li Email address: mohen.lhy@alibaba-inc.com PostgreSQL version: 18.0 Operating system: any Description: Hi all, I got some similar core-dump files when running logical decoding test in PostgreSQL 14.19 (2 times in debug mode) and PostgreSQL 18.0(1 time in release mode). All the provided information is based on 14.19 due to more information about the core-dump. The stack trace is as follows: ``` Program terminated with signal SIGSEGV, Segmentation fault. ... #2 <signal handler called> #3 0x00000000007f41c6 in nocachegetattr (tuple=0x6ddd578, attnum=3, tupleDesc=0x7fb4586e8e48) at heaptuple.c:698 #4 0x00007fb4589d579d in tuple_to_stringinfo (s=0x2fb0bb0, tupdesc=0x7fb4586e8e48, tuple=0x6ddd578, skip_nulls=false) at test_decoding.c:560 #5 0x00007fb4589d5b2d in pg_decode_change (ctx=0x2fac888, txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8) at test_decoding.c:645 #6 0x0000000000e66be6 in change_cb_wrapper (cache=0x2fbc8f8, txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8) at logical.c:1076 #7 0x0000000000e72d2a in ReorderBufferApplyChange (rb=0x2fbc8f8, txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8, streaming=false) at reorderbuffer.c:2020 #8 0x0000000000e73712 in ReorderBufferProcessTXN (rb=0x2fbc8f8, txn=0x2fea668, commit_lsn=36675124736, snapshot_now=0x2fc4580, command_id=0, streaming=false) at reorderbuffer.c:2319 #9 0x0000000000e743e8 in ReorderBufferReplay (txn=0x2fea668, rb=0x2fbc8f8, xid=10798, commit_lsn=36675124736, end_lsn=36675124784, commit_time=815706623738566, origin_id=0, origin_lsn=0) at reorderbuffer.c:2773 #10 0x0000000000e74466 in ReorderBufferCommit (rb=0x2fbc8f8, xid=10798, commit_lsn=36675124736, end_lsn=36675124784, commit_time=815706623738566, origin_id=0, origin_lsn=0) at reorderbuffer.c:2797 #11 0x0000000000e614fb in DecodeCommit (ctx=0x2fac888, buf=0x7ffd647e7ea0, parsed=0x7ffd647e7d40, xid=10798, two_phase=false) at decode.c:848 #12 0x0000000000e605b2 in DecodeXactOp (ctx=0x2fac888, buf=0x7ffd647e7ea0) at decode.c:303 #13 0x0000000000e600d1 in LogicalDecodingProcessRecord (ctx=0x2fac888, record=0x2facc48) at decode.c:144 #14 0x0000000000e69981 in pg_logical_slot_get_changes_guts (fcinfo=0x2fa0808, confirm=true, binary=false) at logicalfuncs.c:296 #15 0x0000000000e69b88 in pg_logical_slot_get_changes (fcinfo=0x2fa0808) at logicalfuncs.c:365 ... #35 0x0000000000f8df2b in exec_simple_query (query_string=0x2d732e8 "SELECT * FROM pg_logical_slot_get_changes("test_logical_decode_slot_0", NULL, NULL, "stream-changes", "on")") at postgres.c:1360 ... (gdb) f 3 #3 0x00000000007f41c6 in nocachegetattr (tuple=0x6ddd578, attnum=3, tupleDesc=0x7fb4586e8e48) at heaptuple.c:698 698 off = att_align_pointer(off, att->attalign, -1, (gdb) p tupleDesc->attrs[0] $3 = {attrelid = 35727, attname = {data = "id", '\000' <repeats 61 times>}, atttypid = 23, attstattarget = -1, attlen = 4, attnum = 1, attndims = 0, attcacheoff = 0, atttypmod = -1, attbyval = true, attalign = 105 'i', attstorage = 112 'p', attcompression = 0 '\000', attnotnull = true, atthasdef = false, atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0, attisinvisible = false} (gdb) p tupleDesc->attrs[1] $4 = {attrelid = 35727, attname = {data = "unique1", '\000' <repeats 56 times>}, atttypid = 23, attstattarget = -1, attlen = 4, attnum = 2, attndims = 0, attcacheoff = 4, atttypmod = -1, attbyval = true, attalign = 105 'i', attstorage = 112 'p', attcompression = 0 '\000', attnotnull = false, atthasdef = false, atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0, attisinvisible = false} (gdb) p tupleDesc->attrs[2] $5 = {attrelid = 35727, attname = {data = "hundred", '\000' <repeats 56 times>}, atttypid = 1700, attstattarget = -1, attlen = -1, attnum = 3, attndims = 0, attcacheoff = 8, atttypmod = -1, attbyval = false, attalign = 105 'i', attstorage = 120 'x', attcompression = 0 '\000', attnotnull = false, atthasdef = false, atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000', attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0, attisinvisible = false} (gdb) f 7 #7 0x0000000000e72d2a in ReorderBufferApplyChange (rb=0x2fbc8f8, txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8, streaming=false) at reorderbuffer.c:2020 2020 rb->apply_change(rb, txn, relation, change); (gdb) p change->lsn $16 = 36675122736 (gdb) f 8 #8 0x0000000000e73712 in ReorderBufferProcessTXN (rb=0x2fbc8f8, txn=0x2fea668, commit_lsn=36675124736, snapshot_now=0x2fc4580, command_id=0, streaming=false) at reorderbuffer.c:2319 2319 ReorderBufferApplyChange(rb, txn, relation, change, (gdb) p *snapshot_now->xip@20 $17 = {10713, 10731, 10744, 10745, 10746, 10747, 10748, 10749, 10753, 10754, 10757, 10758, 10766, 10770, 10771, 10773, 10785, 10786, 10789, 10790} (gdb) p *snapshot_now $18 = {snapshot_type = SNAPSHOT_HISTORIC_MVCC, xmin = 10713, xmax = 10791, xip = 0x2fc4658, xcnt = 20, subxip = 0x0, subxcnt = 0, suboverflowed = false, takenDuringRecovery = false, copied = false, curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 5, regd_count = 0 ... ``` After further investigation, I found that the immediate cause of the crash was that the unique1 column is actually of type numeric (atttypid = 1700), but here it was mistakenly treated as int4 (atttypid = 23). This type mismatch led to a segmentation fault when retrieving the data. (After restoring from backup to the next LSN after txn 10798 commit (8/8A01FE30), I was able to confirm that the unique1 column is indeed of type numeric.) logical slot information: -[ RECORD 1 ]-------+--------------------------- slot_name | test_logical_decode_slot_0 plugin | test_decoding slot_type | logical datoid | 33795 database | ddl_full0 temporary | f active | f active_pid | xmin | catalog_xmin | 10713 restart_lsn | 8/3C425D18 confirmed_flush_lsn | 8/7B80FBA0 wal_status | extended safe_wal_size | two_phase | f Between txn 10713 and 10798, txn 10716 and 10722 successfully executed the column type change SQL: ALTER TABLE IF EXISTS t1 ALTER COLUMN unique1 SET DATA TYPE NUMERIC; Relevant WAL records: ``` -- 10716 rmgr: Standby len (rec/tot): 42/ 42, tx: 10716, lsn: 8/4259E558, prev 8/4259C6C0, desc: LOCK xid 10716 db 33795 rel 35727 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 10716, lsn: 8/4259E588, prev 8/4259E558, desc: NEW_CID rel 1663/33795/1249; tid 84/1; cmin: 4294967295, cmax: 0, combo: 4294967295 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 10716, lsn: 8/4259E5C8, prev 8/4259E588, desc: NEW_CID rel 1663/33795/1249; tid 84/2; cmin: 0, cmax: 4294967295, combo: 4294967295 rmgr: Heap len (rec/tot): 105/ 105, tx: 10716, lsn: 8/4259E608, prev 8/4259E5C8, desc: HOT_UPDATE off 1 xmax 10716 flags 0x60 ; new off 2 xmax 0, blkref #0: rel 1663/33795/1249 blk 84 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 10716, lsn: 8/4259E678, prev 8/4259E608, desc: NEW_CID rel 1663/33795/2619; tid 6/24; cmin: 4294967295, cmax: 0, combo: 4294967295 rmgr: Heap len (rec/tot): 54/ 54, tx: 10716, lsn: 8/4259E6B8, prev 8/4259E678, desc: DELETE off 24 flags 0x00 KEYS_UPDATED , blkref #0: rel 1663/33795/2619 blk 6 ... rmgr: Transaction len (rec/tot): 4505/ 4505, tx: 10716, lsn: 8/4323DB78, prev 8/4323BCE0, desc: COMMIT 2025-11-06 01:10:08.196860 UTC; rels: base/33795/1632572 base/33795/1632575 base/33795/1632576 base/33795/1632606 base/33795/1632604 base/33795/1632603 base/33795/1632602 base/33795/1632601; inval msgs: catcache 51 catcache 50 catcache 51 catcache 50 catcache 50 catcache 51 catcache 50 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 51 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 51 catcache 50 catcache 32 catcache 7 catcache 6 catcache 7 catcache 6 catcache 51 catcache 50 catcache 76 catcache 75 catcache 76 catcache 75 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 32 catcache 51 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 76 catcache 75 catcache 76 catcache 75 catcache 51 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 59 relcache 1640003 relcache 1640003 relcache 1640004 relcache 1640000 snapshot 2608 snapshot 1214 relcache 1632575 snapshot 2608 relcache 1632576 relcache 1632575 snapshot 2608 snapshot 2608 snapshot 2608 relcache 35727 relcache 1622119 relcache 1622119 relcache 35727 relcache 1606414 relcache 1606414 relcache 35727 relcache 1545373 relcache 1545373 relcache 35727 relcache 1537952 relcache 1537952 relcache 35727 relcache 35733 relcache 35733 relcache 35727 relcache 1640000 snapshot 2608 relcache 1640000 snapshot 2608 relcache 1640003 relcache 1640004 relcache 1640004 relcache 1640003 snapshot 2608 relcache 1640003 snapshot 2608 relcache 1640000 snapshot 1214 relcache 35727 -- 10722 rmgr: Standby len (rec/tot): 42/ 42, tx: 10722, lsn: 8/4946A640, prev 8/4946A468, desc: LOCK xid 10722 db 33795 rel 35727 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 10722, lsn: 8/4946A670, prev 8/4946A640, desc: NEW_CID rel 1663/33795/1249; tid 84/2; cmin: 4294967295, cmax: 0, combo: 4294967295 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 10722, lsn: 8/4946A6B0, prev 8/4946A670, desc: NEW_CID rel 1663/33795/1249; tid 84/6; cmin: 0, cmax: 4294967295, combo: 4294967295 rmgr: Heap len (rec/tot): 76/ 76, tx: 10722, lsn: 8/4946A6F0, prev 8/4946A6B0, desc: HOT_UPDATE off 2 xmax 10722 flags 0x20 ; new off 6 xmax 0, blkref #0: rel 1663/33795/1249 blk 84 rmgr: Transaction len (rec/tot): 78/ 78, tx: 10722, lsn: 8/4946A740, prev 8/4946A6F0, desc: INVALIDATION ; inval msgs: catcache 7 catcache 6 relcache 35727 rmgr: Transaction len (rec/tot): 98/ 98, tx: 10722, lsn: 8/4946A790, prev 8/4946A740, desc: COMMIT 2025-11-06 01:10:09.687549 UTC; inval msgs: catcache 7 catcache 6 relcache 35727 -- 10798 rmgr: Heap len (rec/tot): 1211/ 1211, tx: 10798, lsn: 8/8A01F630, prev 8/8A01F560, desc: INSERT off 11 flags 0x08, blkref #0: rel 1663/33795/1677447 blk 0 rmgr: Btree len (rec/tot): 64/ 64, tx: 10798, lsn: 8/8A01FAF0, prev 8/8A01F630, desc: INSERT_LEAF off 5, blkref #0: rel 1663/33795/1695353 blk 1 rmgr: Gin len (rec/tot): 174/ 174, tx: 10798, lsn: 8/8A01FB30, prev 8/8A01FAF0, desc: UPDATE_META_PAGE , blkref #0: rel 1663/33795/1695354 blk 0, blkref #1: rel 1663/33795/1695354 blk 2 rmgr: Btree len (rec/tot): 64/ 64, tx: 10798, lsn: 8/8A01FBE0, prev 8/8A01FB30, desc: INSERT_LEAF off 5, blkref #0: rel 1663/33795/1695355 blk 1 rmgr: Hash len (rec/tot): 72/ 72, tx: 10798, lsn: 8/8A01FC20, prev 8/8A01FBE0, desc: INSERT off 6, blkref #0: rel 1663/33795/1695356 blk 2, blkref #1: rel 1663/33795/1695356 blk 0 rmgr: Gist len (rec/tot): 322/ 322, tx: 10798, lsn: 8/8A01FC68, prev 8/8A01FC20, desc: PAGE_UPDATE , blkref #0: rel 1663/33795/1695357 blk 0 rmgr: Btree len (rec/tot): 80/ 80, tx: 10798, lsn: 8/8A01FDB0, prev 8/8A01FC68, desc: INSERT_LEAF off 3, blkref #0: rel 1663/33795/1695358 blk 2 rmgr: Transaction len (rec/tot): 46/ 46, tx: 10798, lsn: 8/8A01FE00, prev 8/8A01FDB0, desc: COMMIT 2025-11-06 01:10:23.738566 UTC -- ALL RUNING_XACTS WAL RECORD rmgr: Standby len (rec/tot): 58/ 58, tx: 0, lsn: 8/3C425D18, prev 8/3C425CD8, desc: RUNNING_XACTS nextXid 10674 latestCompletedXid 10673 oldestRunningXid 10639; 2 xacts: 10639 10664 ... rmgr: Standby len (rec/tot): 62/ 62, tx: 0, lsn: 8/66737378, prev 8/66736B80, desc: RUNNING_XACTS nextXid 10742 latestCompletedXid 10741 oldestRunningXid 10713; 3 xacts: 10731 10713 10729 ... rmgr: Standby len (rec/tot): 66/ 66, tx: 0, lsn: 8/89C23500, prev 8/89C234A0, desc: RUNNING_XACTS nextXid 10798 latestCompletedXid 10797 oldestRunningXid 10767; 4 xacts: 10780 10769 10795 10767 ``` From the snapshot_now information for txn 10798, txn 10716 and 10722 are not present in xip, which appears unexpected. I have not yet figured out the root cause and will continue analyzing. If you need the complete WAL log from 8/3C425D18 to 8/8A01FE30, please let me know — I can provide it. regards, Haiyang Li