Thread

  1. Re: [PATCH] Release replication slot on error in SQL-callable slot functions

    SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> — 2026-05-27T04:31:45Z

    Hi Shveta,
    
    On Tue, May 26, 2026 at 8:54 PM shveta malik <shveta.malik@gmail.com> wrote:
    
    > On Tue, May 26, 2026 at 1:41 PM SATYANARAYANA NARLAPURAM
    > <satyanarlapuram@gmail.com> wrote:
    > >
    > > Hi,
    > >
    > > On Mon, May 25, 2026 at 10:50 PM shveta malik <shveta.malik@gmail.com>
    > wrote:
    > >>
    > >> On Tue, May 26, 2026 at 10:06 AM shveta malik <shveta.malik@gmail.com>
    > wrote:
    > >> >
    > >> > On Tue, May 26, 2026 at 12:31 AM SATYANARAYANA NARLAPURAM
    > >> > <satyanarlapuram@gmail.com> wrote:
    > >> > >
    > >> > > Hi,
    > >> > >
    > >> > > On Mon, May 25, 2026 at 2:58 AM shveta malik <
    > shveta.malik@gmail.com> wrote:
    > >> > >>
    > >> > >> On Mon, May 25, 2026 at 12:42 PM SATYANARAYANA NARLAPURAM
    > >> > >> <satyanarlapuram@gmail.com> wrote:
    > >> > >> >
    > >> > >> > Hi
    > >> > >> >
    > >> > >> > On Fri, May 22, 2026 at 2:16 AM shveta malik <
    > shveta.malik@gmail.com> wrote:
    > >> > >> >>
    > >> > >> >> Thanks for reporting the issue. I could reproduce the same
    > issue with
    > >> > >> >> all these as well:
    > >> > >> >>
    > >> > >> >> pg_logical_slot_peek_changes
    > >> > >> >> pg_logical_slot_get_binary_changes
    > >> > >> >> pg_logical_slot_peek_binary_changes
    > >> > >> >
    > >> > >> >
    > >> > >> > Please find the attached v2 patch that addressed these three
    > cases as well.
    > >> > >> >
    > >> > >>
    > >> > >> Thank You for addressuing these cases. A few comments:
    > >> > >>
    > >> > >> 1)
    > >> > >>
    > >> > >> +-- Test 2: session remains usable after the error
    > (MyReplicationSlot cleared)
    > >> > >>
    > >> > >> It shoudl be part of 'Test 1' itself and thus should not be named
    > as 'Test 2'
    > >> > >>
    > >> > >> 2)
    > >> > >> --------
    > >> > >> +-- Test 4: copy_replication_slot with max_replication_slots
    > exceeded.
    > >> > >> +-- We reduce max_replication_slots artificially by filling all
    > remaining slots.
    > >> > >> +-- Instead, trigger an error by copying to an already-existing
    > name.
    > >> > >> +DO $$
    > >> > >> +BEGIN
    > >> > >> +    PERFORM pg_copy_logical_replication_slot('regression_slot_t3',
    > >> > >> 'regression_slot_t3');
    > >> > >> +EXCEPTION WHEN OTHERS THEN
    > >> > >> +    RAISE NOTICE 'caught: %', SQLERRM;
    > >> > >> +END;
    > >> > >> +$$;
    > >> > >> +-- The original slot must still exist and be usable
    > >> > >> +SELECT count(*) = 1 AS orig_slot_ok FROM pg_replication_slots
    > >> > >> +    WHERE slot_name = 'regression_slot_t3';
    > >> > >> -----------
    > >> > >>
    > >> > >> I don't think we can hit the Assert with above test (at-least I
    > could
    > >> > >> not). Since creation of slot itself will fail as the slot with
    > >> > >> same-name already exists, MyReplicationSlot will never be set and
    > thus
    > >> > >> Assert will not be hit.  A better testcase will be below which
    > fails
    > >> > >> during LoadOutputPlugin() after slot-creation and
    > MyReplicationSlot is
    > >> > >> set already.
    > >> > >>
    > >> > >> SELECT pg_create_logical_replication_slot('src_slot',
    > 'test_decoding');
    > >> > >>
    > >> > >> DO $$
    > >> > >> BEGIN
    > >> > >> PERFORM pg_copy_logical_replication_slot('src_slot', 'dst_slot',
    > >> > >> false, 'nonexistent_plugin');
    > >> > >> EXCEPTION WHEN others THEN
    > >> > >> RAISE NOTICE 'caught: %', SQLERRM;
    > >> > >> END $$;
    > >> > >>
    > >> > >> SELECT count(*) FROM pg_logical_slot_get_changes('src_slot', NULL,
    > NULL);
    > >> > >>
    > >> > >> 3)
    > >> > >> So overall these are the problematic APIs:
    > >> > >>
    > >> > >> pg_create_logical_replication_slot
    > >> > >> pg_replication_slot_advance
    > >> > >> pg_copy_logical_replication_slot
    > >> > >> pg_logical_slot_peek_binary_changes
    > >> > >> pg_logical_slot_peek_changes
    > >> > >> pg_logical_slot_get_changes
    > >> > >> pg_logical_slot_get_binary_changes
    > >> > >>
    > >> > >> First 3 are are mutually exclusive fixes fow which we have added
    > >> > >> testcases. Last 4 are addressed by fixing common function
    > >> > >> pg_logical_slot_get_changes_guts(). I think we should add a test
    > case
    > >> > >> for at-least any one of these APIs to cover
    > >> > >> pg_logical_slot_get_changes_guts().
    > >> > >
    > >> > >
    > >> > > Thanks for reviewing. Please review the attached v3 patch.
    > >> > >
    > >> >
    > >> > A few trivial things:
    > >> >
    > >> > 1)
    > >> > pg_replication_slot_advance:
    > >> > + PG_TRY();
    > >> > + {
    > >> > + /* Acquire the slot so we "own" it */
    > >> > + ReplicationSlotAcquire(NameStr(*slotname), true, true);
    > >> > + /* A slot whose restart_lsn has never been reserved cannot be
    > advanced */
    > >> > + if (!XLogRecPtrIsValid(MyReplicationSlot->data.restart_lsn))
    > >> >
    > >> >
    > >> > We can have a blank line after ReplicationSlotAcquire for better
    > readability.
    > >> >
    > >> > 2)
    > >> >
    > >> > +SELECT 'init' FROM
    > >> > pg_create_logical_replication_slot('regression_slot_t3',
    > >> > 'test_decoding', true);
    > >> > +SELECT count(*) = 1 AS slot_exists FROM pg_replication_slots
    > >> > +    WHERE slot_name = 'regression_slot_t3';
    > >> >
    > >> > The intent is not clear why are we checking existence of
    > >> > regression_slot_t3? I think we can skip it (or else add a comment if
    > >> > really needed). The success of previous
    > >> > pg_create_logical_replication_slot is enough to confirm that session
    > >> > is healthy to run other slot related queries.
    > >> >
    > >> > 3)
    > >> > +SELECT pg_drop_replication_slot('regression_slot_phy');
    > >> > +
    > >> > +-- cleanup
    > >> > +SELECT pg_drop_replication_slot('regression_slot_t3');
    > >> >
    > >> > We can move drop of 'regression_slot_phy' too under '-- cleanup'
    > >> >
    > >> > ~~
    > >> >
    > >> > I have no further comments other than the trivial things mentioned
    > above.
    > >> >
    > >>
    > >> Missed to inform this earlier, I am not able to apply any version of
    > >> the patches shared so far with 'git am'. It gives error, 'patch -p1'
    > >> works.
    > >>
    > >> git am
    > v3-0001-Release-replication-slot-on-error-in-slot-SQL-functions.patch
    > >> Patch format detection failed.
    > >
    > > Thanks , Shveta! Please find the attached v4 patch that addressed your
    > comments.
    > >
    >
    > Thank You for the patch.
    > I noticed that we are creating regression_slot_t3 as a a temporary
    > slot, is that intentional? I think creating a permanent slot here will
    > be a better testcase.
    
    
    No specific reason to use temp slot, ok to create a permanent slot too.
    
    
    >
    > I have made a few cosmetic changes for better readability along with
    > creating the 'permanent' regression_slot_t3 slot. Please incorporate
    > what you think is okay. I have no more comments.
    
    
    Thank you for the changes and review.
    
    Thanks,
    Satya