Thread

  1. Re: [PATCH] Release replication slot on error in SQL-callable slot functions

    shveta malik <shveta.malik@gmail.com> — 2026-05-26T04:36:24Z

    On Tue, May 26, 2026 at 12:31 AM SATYANARAYANA NARLAPURAM
    <satyanarlapuram@gmail.com> wrote:
    >
    > Hi,
    >
    > On Mon, May 25, 2026 at 2:58 AM shveta malik <shveta.malik@gmail.com> wrote:
    >>
    >> On Mon, May 25, 2026 at 12:42 PM SATYANARAYANA NARLAPURAM
    >> <satyanarlapuram@gmail.com> wrote:
    >> >
    >> > Hi
    >> >
    >> > On Fri, May 22, 2026 at 2:16 AM shveta malik <shveta.malik@gmail.com> wrote:
    >> >>
    >> >> Thanks for reporting the issue. I could reproduce the same issue with
    >> >> all these as well:
    >> >>
    >> >> pg_logical_slot_peek_changes
    >> >> pg_logical_slot_get_binary_changes
    >> >> pg_logical_slot_peek_binary_changes
    >> >
    >> >
    >> > Please find the attached v2 patch that addressed these three cases as well.
    >> >
    >>
    >> Thank You for addressuing these cases. A few comments:
    >>
    >> 1)
    >>
    >> +-- Test 2: session remains usable after the error (MyReplicationSlot cleared)
    >>
    >> It shoudl be part of 'Test 1' itself and thus should not be named as 'Test 2'
    >>
    >> 2)
    >> --------
    >> +-- Test 4: copy_replication_slot with max_replication_slots exceeded.
    >> +-- We reduce max_replication_slots artificially by filling all remaining slots.
    >> +-- Instead, trigger an error by copying to an already-existing name.
    >> +DO $$
    >> +BEGIN
    >> +    PERFORM pg_copy_logical_replication_slot('regression_slot_t3',
    >> 'regression_slot_t3');
    >> +EXCEPTION WHEN OTHERS THEN
    >> +    RAISE NOTICE 'caught: %', SQLERRM;
    >> +END;
    >> +$$;
    >> +-- The original slot must still exist and be usable
    >> +SELECT count(*) = 1 AS orig_slot_ok FROM pg_replication_slots
    >> +    WHERE slot_name = 'regression_slot_t3';
    >> -----------
    >>
    >> I don't think we can hit the Assert with above test (at-least I could
    >> not). Since creation of slot itself will fail as the slot with
    >> same-name already exists, MyReplicationSlot will never be set and thus
    >> Assert will not be hit.  A better testcase will be below which fails
    >> during LoadOutputPlugin() after slot-creation and MyReplicationSlot is
    >> set already.
    >>
    >> SELECT pg_create_logical_replication_slot('src_slot', 'test_decoding');
    >>
    >> DO $$
    >> BEGIN
    >> PERFORM pg_copy_logical_replication_slot('src_slot', 'dst_slot',
    >> false, 'nonexistent_plugin');
    >> EXCEPTION WHEN others THEN
    >> RAISE NOTICE 'caught: %', SQLERRM;
    >> END $$;
    >>
    >> SELECT count(*) FROM pg_logical_slot_get_changes('src_slot', NULL, NULL);
    >>
    >> 3)
    >> So overall these are the problematic APIs:
    >>
    >> pg_create_logical_replication_slot
    >> pg_replication_slot_advance
    >> pg_copy_logical_replication_slot
    >> pg_logical_slot_peek_binary_changes
    >> pg_logical_slot_peek_changes
    >> pg_logical_slot_get_changes
    >> pg_logical_slot_get_binary_changes
    >>
    >> First 3 are are mutually exclusive fixes fow which we have added
    >> testcases. Last 4 are addressed by fixing common function
    >> pg_logical_slot_get_changes_guts(). I think we should add a test case
    >> for at-least any one of these APIs to cover
    >> pg_logical_slot_get_changes_guts().
    >
    >
    > Thanks for reviewing. Please review the attached v3 patch.
    >
    
    A few trivial things:
    
    1)
    pg_replication_slot_advance:
    + PG_TRY();
    + {
    + /* Acquire the slot so we "own" it */
    + ReplicationSlotAcquire(NameStr(*slotname), true, true);
    + /* A slot whose restart_lsn has never been reserved cannot be advanced */
    + if (!XLogRecPtrIsValid(MyReplicationSlot->data.restart_lsn))
    
    
    We can have a blank line after ReplicationSlotAcquire for better readability.
    
    2)
    
    +SELECT 'init' FROM
    pg_create_logical_replication_slot('regression_slot_t3',
    'test_decoding', true);
    +SELECT count(*) = 1 AS slot_exists FROM pg_replication_slots
    +    WHERE slot_name = 'regression_slot_t3';
    
    The intent is not clear why are we checking existence of
    regression_slot_t3? I think we can skip it (or else add a comment if
    really needed). The success of previous
    pg_create_logical_replication_slot is enough to confirm that session
    is healthy to run other slot related queries.
    
    3)
    +SELECT pg_drop_replication_slot('regression_slot_phy');
    +
    +-- cleanup
    +SELECT pg_drop_replication_slot('regression_slot_t3');
    
    We can move drop of 'regression_slot_phy' too under '-- cleanup'
    
    ~~
    
    I have no further comments other than the trivial things mentioned above.
    
    thanks
    Shveta