Re: Standby server with cascade logical replication could not be properly stopped under load

G. Sl <skokoveshat@gmail.com>

From: "G. Sl" <skokoveshat@gmail.com>
To: Michael Paquier <michael@paquier.xyz>
Cc: Ajin Cherian <itsajin@gmail.com>, Alexey Makhmutov <a.makhmutov@postgrespro.ru>, Bertrand Drouvot <bertranddrouvot.pg@gmail.com>, shveta malik <shveta.malik@gmail.com>, pgsql-bugs@lists.postgresql.org
Date: 2025-12-29T05:22:18Z
Lists: pgsql-bugs
Hi Michael! Thanks for clarifying.
No, I don't have a replica with logical slots, just a primary with logical
replication (CDC with Debezium), and sometimes its shutdown (restart) hangs
until the replication processes are killed.
So it looks like I should investigate further and maybe create a new issue.

On Mon, 29 Dec 2025 at 05:10, Michael Paquier <michael@paquier.xyz> wrote:

> On Fri, Dec 26, 2025 at 04:14:17PM +0500, G. Sl wrote:
> > I've found the same behaviour in version 15.12. Any chance of a
> > backpatch for this version?
>
> As presented on this thread, the problem reported involves at least a
> cluster configuration A -> B -> C, where:
> - A is a primary.
> - B is a physical standby, streaming changes from A.  In the test
> setup, this node includes logical replication slots that were created
> while the node was in standby mode.  This action can only happen in v16
> and newer versions, where logical decoding on standbys has been added.
> - C is a primary node, streaming logical changes from B.
>
> Based on the state that B requires, this would not apply to v15
> because it is not possible to create logical replication slots on a
> standby in that version.
> Also you may want to update to 15.15 first.
>
> I am open to arguments or discussions if you have found a new or
> different problem, of course, but there is nothing we can do without
> knowing more about your setup.  An even better thing would be to have
> a reproducible test case based on v15 to understand your problem, but
> I can say for sure that we may be dealing with something different.
> Please feel free to refer to the top of the thread, which offers an
> excellent example of what a reproducible test case can be.  By that I
> mean something that we can reuse to reproduce the issue.
>
> Also note that the change committed is 5231ed8262c9, that relies on
> the existence of am_cascading_walsender in XLogSendLogical() to not
> use GetFlushRecPtr() in all cases.  Before the commit, we used
> GetStandbyFlushRecPtr() for that in v16.  Again, v15 just uses
> GetFlushRecPtr(), but it should not be possible to find yourself in a
> position where XLogSendLogical() is called while on a standby, no?
> --
> Michael
>
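
For readers following the last quoted paragraph, here is a minimal sketch of
the idea of choosing the flush position based on am_cascading_walsender. It is
not the PostgreSQL source and not the content of the commit: the sketch_*
functions, the stub_* variables and the sample WAL positions are hypothetical
stand-ins for GetFlushRecPtr() and GetStandbyFlushRecPtr(), and the assumption
that am_cascading_walsender is set while the server is in recovery is mine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical stubs.  The real GetFlushRecPtr() and
 * GetStandbyFlushRecPtr() read server state; fixed values are used
 * here only so that the sketch compiles on its own.
 */
static XLogRecPtr stub_primary_flush_ptr = 0x3000;
static XLogRecPtr stub_standby_flush_ptr = 0x2000;

static XLogRecPtr
sketch_get_flush_rec_ptr(void)
{
    return stub_primary_flush_ptr;
}

static XLogRecPtr
sketch_get_standby_flush_rec_ptr(void)
{
    return stub_standby_flush_ptr;
}

/*
 * Pick the position up to which a logical walsender may send.
 * Assumption: am_cascading_walsender is true when the walsender runs
 * on a standby (the v16+ case discussed in the thread).  On v15 a
 * logical walsender is only expected on a primary, so only the second
 * branch would apply.
 */
static XLogRecPtr
sketch_logical_send_limit(bool am_cascading_walsender)
{
    if (am_cascading_walsender)
        return sketch_get_standby_flush_rec_ptr();
    return sketch_get_flush_rec_ptr();
}
```

With these stubs, sketch_logical_send_limit(true) returns the standby value
and sketch_logical_send_limit(false) returns the primary value. A hang on a
v15 primary would go through the second branch only, which is consistent with
the suggestion above that it may be a different problem.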