Thread

  1. BUG #19109: catalog lookup with the wrong snapshot during logical decoding causes coredump

    PG Bug reporting form <noreply@postgresql.org> — 2025-11-10T15:22:11Z

    The following bug has been logged on the website:
    
    Bug reference:      19109
    Logged by:          haiyang li
    Email address:      mohen.lhy@alibaba-inc.com
    PostgreSQL version: 18.0
    Operating system:   any
    Description:        
    
    Hi all,
    I got some similar core-dump files when running logical decoding test in
    PostgreSQL 14.19 (2
    times in debug mode) and PostgreSQL 18.0(1 time in release mode). All the
    provided information
    is based on 14.19 due to more information about the core-dump.
    
    The stack trace is as follows:
    ```
    Program terminated with signal SIGSEGV, Segmentation fault.
    ...
    #2  <signal handler called>
    #3  0x00000000007f41c6 in nocachegetattr (tuple=0x6ddd578, attnum=3,
    tupleDesc=0x7fb4586e8e48) at heaptuple.c:698
    #4  0x00007fb4589d579d in tuple_to_stringinfo (s=0x2fb0bb0,
    tupdesc=0x7fb4586e8e48, tuple=0x6ddd578, skip_nulls=false) at
    test_decoding.c:560
    #5  0x00007fb4589d5b2d in pg_decode_change (ctx=0x2fac888, txn=0x2fea668,
    relation=0x7fb4586db898, change=0x7145cf8) at test_decoding.c:645
    #6  0x0000000000e66be6 in change_cb_wrapper (cache=0x2fbc8f8, txn=0x2fea668,
    relation=0x7fb4586db898, change=0x7145cf8) at logical.c:1076
    #7  0x0000000000e72d2a in ReorderBufferApplyChange (rb=0x2fbc8f8,
    txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8, streaming=false)
    at reorderbuffer.c:2020
    #8  0x0000000000e73712 in ReorderBufferProcessTXN (rb=0x2fbc8f8,
    txn=0x2fea668, commit_lsn=36675124736, snapshot_now=0x2fc4580, command_id=0,
    streaming=false) at reorderbuffer.c:2319
    #9  0x0000000000e743e8 in ReorderBufferReplay (txn=0x2fea668, rb=0x2fbc8f8,
    xid=10798, commit_lsn=36675124736, end_lsn=36675124784,
    commit_time=815706623738566, origin_id=0, origin_lsn=0) at
    reorderbuffer.c:2773
    #10 0x0000000000e74466 in ReorderBufferCommit (rb=0x2fbc8f8, xid=10798,
    commit_lsn=36675124736, end_lsn=36675124784, commit_time=815706623738566,
    origin_id=0, origin_lsn=0) at reorderbuffer.c:2797
    #11 0x0000000000e614fb in DecodeCommit (ctx=0x2fac888, buf=0x7ffd647e7ea0,
    parsed=0x7ffd647e7d40, xid=10798, two_phase=false) at decode.c:848
    #12 0x0000000000e605b2 in DecodeXactOp (ctx=0x2fac888, buf=0x7ffd647e7ea0)
    at decode.c:303
    #13 0x0000000000e600d1 in LogicalDecodingProcessRecord (ctx=0x2fac888,
    record=0x2facc48) at decode.c:144
    #14 0x0000000000e69981 in pg_logical_slot_get_changes_guts
    (fcinfo=0x2fa0808, confirm=true, binary=false) at logicalfuncs.c:296
    #15 0x0000000000e69b88 in pg_logical_slot_get_changes (fcinfo=0x2fa0808) at
    logicalfuncs.c:365
    ...
    #35 0x0000000000f8df2b in exec_simple_query (query_string=0x2d732e8 "SELECT
    * FROM pg_logical_slot_get_changes("test_logical_decode_slot_0", NULL, NULL,
    "stream-changes", "on")") at postgres.c:1360
    ...
    
    (gdb) f 3
    #3  0x00000000007f41c6 in nocachegetattr (tuple=0x6ddd578, attnum=3,
    tupleDesc=0x7fb4586e8e48) at heaptuple.c:698
    698                                             off = att_align_pointer(off,
    att->attalign, -1,
    (gdb) p tupleDesc->attrs[0]
    $3 = {attrelid = 35727, attname = {data = "id", '\000' <repeats 61 times>},
    atttypid = 23, attstattarget = -1, attlen = 4, attnum = 1, attndims = 0,
      attcacheoff = 0, atttypmod = -1, attbyval = true, attalign = 105 'i',
    attstorage = 112 'p', attcompression = 0 '\000', attnotnull = true,
    atthasdef = false,
      atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000',
    attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0,
      attisinvisible = false}
    (gdb) p tupleDesc->attrs[1]
    $4 = {attrelid = 35727, attname = {data = "unique1", '\000' <repeats 56
    times>}, atttypid = 23, attstattarget = -1, attlen = 4, attnum = 2, attndims
    = 0,
      attcacheoff = 4, atttypmod = -1, attbyval = true, attalign = 105 'i',
    attstorage = 112 'p', attcompression = 0 '\000', attnotnull = false,
    atthasdef = false,
      atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000',
    attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0,
      attisinvisible = false}
    (gdb) p tupleDesc->attrs[2]
    $5 = {attrelid = 35727, attname = {data = "hundred", '\000' <repeats 56
    times>}, atttypid = 1700, attstattarget = -1, attlen = -1, attnum = 3,
    attndims = 0,
      attcacheoff = 8, atttypmod = -1, attbyval = false, attalign = 105 'i',
    attstorage = 120 'x', attcompression = 0 '\000', attnotnull = false,
    atthasdef = false,
      atthasmissing = false, attidentity = 0 '\000', attgenerated = 0 '\000',
    attisdropped = false, attislocal = true, attinhcount = 0, attcollation = 0,
      attisinvisible = false}
    
    (gdb) f 7
    #7  0x0000000000e72d2a in ReorderBufferApplyChange (rb=0x2fbc8f8,
    txn=0x2fea668, relation=0x7fb4586db898, change=0x7145cf8, streaming=false)
    at reorderbuffer.c:2020
    2020                    rb->apply_change(rb, txn, relation, change);
    (gdb) p change->lsn
    $16 = 36675122736
    
    (gdb) f 8
    #8  0x0000000000e73712 in ReorderBufferProcessTXN (rb=0x2fbc8f8,
    txn=0x2fea668, commit_lsn=36675124736, snapshot_now=0x2fc4580, command_id=0,
    streaming=false)
        at reorderbuffer.c:2319
    2319
    ReorderBufferApplyChange(rb, txn, relation, change,
    (gdb) p *snapshot_now->xip@20
    $17 = {10713, 10731, 10744, 10745, 10746, 10747, 10748, 10749, 10753, 10754,
    10757, 10758, 10766, 10770, 10771, 10773, 10785, 10786, 10789, 10790}
    (gdb) p *snapshot_now
    $18 = {snapshot_type = SNAPSHOT_HISTORIC_MVCC, xmin = 10713, xmax = 10791,
    xip = 0x2fc4658, xcnt = 20, subxip = 0x0, subxcnt = 0, suboverflowed =
    false,
      takenDuringRecovery = false, copied = false, curcid = 0, speculativeToken
    = 0, vistest = 0x0, active_count = 5, regd_count = 0
    ...
    ```
    
    After further investigation, I found that the immediate cause of the crash
    was that the unique1
    column is actually of type numeric (atttypid = 1700), but here it was
    mistakenly treated as
    int4 (atttypid = 23). This type mismatch led to a segmentation fault when
    retrieving the data.
    (After restoring from backup to the next LSN after txn 10798 commit
    (8/8A01FE30), I was able to
    confirm that the unique1 column is indeed of type numeric.)
    
    logical slot information:
    -[ RECORD 1 ]-------+---------------------------
    slot_name           | test_logical_decode_slot_0
    plugin              | test_decoding
    slot_type           | logical
    datoid              | 33795
    database            | ddl_full0
    temporary           | f
    active              | f
    active_pid          |
    xmin                |
    catalog_xmin        | 10713
    restart_lsn         | 8/3C425D18
    confirmed_flush_lsn | 8/7B80FBA0
    wal_status          | extended
    safe_wal_size       |
    two_phase           | f
    
    Between txn 10713 and 10798, txn 10716 and 10722 successfully executed the
    column type change SQL:
    ALTER TABLE IF EXISTS t1 ALTER COLUMN unique1 SET DATA TYPE NUMERIC;
    
    Relevant WAL records:
    ```
    -- 10716
    rmgr: Standby     len (rec/tot):     42/    42, tx:      10716, lsn:
    8/4259E558, prev 8/4259C6C0, desc: LOCK xid 10716 db 33795 rel 35727
    rmgr: Heap2       len (rec/tot):     60/    60, tx:      10716, lsn:
    8/4259E588, prev 8/4259E558, desc: NEW_CID rel 1663/33795/1249; tid 84/1;
    cmin: 4294967295, cmax: 0, combo: 4294967295
    rmgr: Heap2       len (rec/tot):     60/    60, tx:      10716, lsn:
    8/4259E5C8, prev 8/4259E588, desc: NEW_CID rel 1663/33795/1249; tid 84/2;
    cmin: 0, cmax: 4294967295, combo: 4294967295
    rmgr: Heap        len (rec/tot):    105/   105, tx:      10716, lsn:
    8/4259E608, prev 8/4259E5C8, desc: HOT_UPDATE off 1 xmax 10716 flags 0x60 ;
    new off 2 xmax 0, blkref #0: rel 1663/33795/1249 blk 84
    rmgr: Heap2       len (rec/tot):     60/    60, tx:      10716, lsn:
    8/4259E678, prev 8/4259E608, desc: NEW_CID rel 1663/33795/2619; tid 6/24;
    cmin: 4294967295, cmax: 0, combo: 4294967295
    rmgr: Heap        len (rec/tot):     54/    54, tx:      10716, lsn:
    8/4259E6B8, prev 8/4259E678, desc: DELETE off 24 flags 0x00 KEYS_UPDATED ,
    blkref #0: rel 1663/33795/2619 blk 6
    ...
    rmgr: Transaction len (rec/tot):   4505/  4505, tx:      10716, lsn:
    8/4323DB78, prev 8/4323BCE0, desc: COMMIT 2025-11-06 01:10:08.196860 UTC;
    rels: base/33795/1632572 base/33795/1632575 base/33795/1632576
    base/33795/1632606 base/33795/1632604 base/33795/1632603 base/33795/1632602
    base/33795/1632601; inval msgs: catcache 51 catcache 50 catcache 51 catcache
    50 catcache 50 catcache 51 catcache 50 catcache 50 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 51 catcache
    50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 51 catcache 50 catcache 32 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 51 catcache 50 catcache 76
    catcache 75 catcache 76 catcache 75 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 51 catcache 50 catcache 51 catcache 50 catcache 51
    catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 32 catcache
    51 catcache 50 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 76 catcache
    75 catcache 76 catcache 75 catcache 51 catcache 50 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6
    catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
    catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 59 relcache
    1640003 relcache 1640003 relcache 1640004 relcache 1640000 snapshot 2608
    snapshot 1214 relcache 1632575 snapshot 2608 relcache 1632576 relcache
    1632575 snapshot 2608 snapshot 2608 snapshot 2608 relcache 35727 relcache
    1622119 relcache 1622119 relcache 35727 relcache 1606414 relcache 1606414
    relcache 35727 relcache 1545373 relcache 1545373 relcache 35727 relcache
    1537952 relcache 1537952 relcache 35727 relcache 35733 relcache 35733
    relcache 35727 relcache 1640000 snapshot 2608 relcache 1640000 snapshot 2608
    relcache 1640003 relcache 1640004 relcache 1640004 relcache 1640003 snapshot
    2608 relcache 1640003 snapshot 2608 relcache 1640000 snapshot 1214 relcache
    35727
    
    -- 10722
    rmgr: Standby     len (rec/tot):     42/    42, tx:      10722, lsn:
    8/4946A640, prev 8/4946A468, desc: LOCK xid 10722 db 33795 rel 35727
    rmgr: Heap2       len (rec/tot):     60/    60, tx:      10722, lsn:
    8/4946A670, prev 8/4946A640, desc: NEW_CID rel 1663/33795/1249; tid 84/2;
    cmin: 4294967295, cmax: 0, combo: 4294967295
    rmgr: Heap2       len (rec/tot):     60/    60, tx:      10722, lsn:
    8/4946A6B0, prev 8/4946A670, desc: NEW_CID rel 1663/33795/1249; tid 84/6;
    cmin: 0, cmax: 4294967295, combo: 4294967295
    rmgr: Heap        len (rec/tot):     76/    76, tx:      10722, lsn:
    8/4946A6F0, prev 8/4946A6B0, desc: HOT_UPDATE off 2 xmax 10722 flags 0x20 ;
    new off 6 xmax 0, blkref #0: rel 1663/33795/1249 blk 84
    rmgr: Transaction len (rec/tot):     78/    78, tx:      10722, lsn:
    8/4946A740, prev 8/4946A6F0, desc: INVALIDATION ; inval msgs: catcache 7
    catcache 6 relcache 35727
    rmgr: Transaction len (rec/tot):     98/    98, tx:      10722, lsn:
    8/4946A790, prev 8/4946A740, desc: COMMIT 2025-11-06 01:10:09.687549 UTC;
    inval msgs: catcache 7 catcache 6 relcache 35727
    
    -- 10798
    rmgr: Heap        len (rec/tot):   1211/  1211, tx:      10798, lsn:
    8/8A01F630, prev 8/8A01F560, desc: INSERT off 11 flags 0x08, blkref #0: rel
    1663/33795/1677447 blk 0
    rmgr: Btree       len (rec/tot):     64/    64, tx:      10798, lsn:
    8/8A01FAF0, prev 8/8A01F630, desc: INSERT_LEAF off 5, blkref #0: rel
    1663/33795/1695353 blk 1
    rmgr: Gin         len (rec/tot):    174/   174, tx:      10798, lsn:
    8/8A01FB30, prev 8/8A01FAF0, desc: UPDATE_META_PAGE , blkref #0: rel
    1663/33795/1695354 blk 0, blkref #1: rel 1663/33795/1695354 blk 2
    rmgr: Btree       len (rec/tot):     64/    64, tx:      10798, lsn:
    8/8A01FBE0, prev 8/8A01FB30, desc: INSERT_LEAF off 5, blkref #0: rel
    1663/33795/1695355 blk 1
    rmgr: Hash        len (rec/tot):     72/    72, tx:      10798, lsn:
    8/8A01FC20, prev 8/8A01FBE0, desc: INSERT off 6, blkref #0: rel
    1663/33795/1695356 blk 2, blkref #1: rel 1663/33795/1695356 blk 0
    rmgr: Gist        len (rec/tot):    322/   322, tx:      10798, lsn:
    8/8A01FC68, prev 8/8A01FC20, desc: PAGE_UPDATE , blkref #0: rel
    1663/33795/1695357 blk 0
    rmgr: Btree       len (rec/tot):     80/    80, tx:      10798, lsn:
    8/8A01FDB0, prev 8/8A01FC68, desc: INSERT_LEAF off 3, blkref #0: rel
    1663/33795/1695358 blk 2
    rmgr: Transaction len (rec/tot):     46/    46, tx:      10798, lsn:
    8/8A01FE00, prev 8/8A01FDB0, desc: COMMIT 2025-11-06 01:10:23.738566 UTC
    
    -- ALL RUNING_XACTS WAL RECORD
    rmgr: Standby     len (rec/tot):     58/    58, tx:          0, lsn:
    8/3C425D18, prev 8/3C425CD8, desc: RUNNING_XACTS nextXid 10674
    latestCompletedXid 10673 oldestRunningXid 10639; 2 xacts: 10639 10664
    ...
    rmgr: Standby     len (rec/tot):     62/    62, tx:          0, lsn:
    8/66737378, prev 8/66736B80, desc: RUNNING_XACTS nextXid 10742
    latestCompletedXid 10741 oldestRunningXid 10713; 3 xacts: 10731 10713 10729
    ...
    rmgr: Standby     len (rec/tot):     66/    66, tx:          0, lsn:
    8/89C23500, prev 8/89C234A0, desc: RUNNING_XACTS nextXid 10798
    latestCompletedXid 10797 oldestRunningXid 10767; 4 xacts: 10780 10769 10795
    10767
    ```
    
    From the snapshot_now information for txn 10798, txn 10716 and 10722 are not
    present in xip,
    which appears unexpected.
    
    I have not yet figured out the root cause and will continue analyzing.
    If you need the complete WAL log from 8/3C425D18 to 8/8A01FE30, please let
    me know — I can provide it.
    
    regards,
    Haiyang Li