Re: Fix pg_stat_wal_receiver to show CONNECTING status
Chao Li <li.evan.chao@gmail.com>
From: Chao Li <li.evan.chao@gmail.com>
To: Michael Paquier <michael@paquier.xyz>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>,
Michael Paquier <michael.paquier@gmail.com>,
Xuneng Zhou <xunengzhou@gmail.com>
Date: 2026-05-21T07:20:13Z
Lists: pgsql-hackers
> On May 21, 2026, at 07:06, Chao Li <li.evan.chao@gmail.com> wrote:
>
>> On May 21, 2026, at 04:43, Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Wed, May 20, 2026 at 03:53:38PM +0800, Chao Li wrote:
>>> With v2, slot_name, sender_host, sender_port, and conninfo are
>>> already left NULL while the receiver is in CONNECTING state. I feel
>>> we don't have to show the timestamp fields either. Since the columns
>>> are named last_msg_send_time and last_msg_receipt_time, users may
>>> naturally interpret them as the last time a message was sent to or
>>> received from the primary. If we show the standby server start time
>>> in those columns, I am afraid that could be confusing.
>>>
>>> But I think it might be useful to show the *_lsn and *_tli values in
>>> CONNECTING state if they are available.
>>
>> The original reason why ready_to_display has been introduced is this
>> one, where we wanted to have a strict control over the connection
>> information across multiple calls of pg_stat_get_wal_receiver():
>> https://www.postgresql.org/message-id/CAB7nPqQNbHQ7F7wDD_2qvGA_FUW-Leds9HQNM6kJnto7RFNhUg@mail.gmail.com
>>
>> With v2, ready_to_display is still able to do the job it is defined
>> for. This does not need to apply on the time fields, so IMO showing
>> them to the values they are initialized is not a big deal, and they
>> can actually be useful to know even in the early stage of connection
>> as they reveal the state of the code.
>>
>> Note also that the time values could still show up based on their
>> initial values at the early connection stage, even after completing
>> walrcv_connect() and after ready_to_display is switched to true, so
>> it's not like these values are that confusing: we just expose them a
>> bit more at an earlier stage of the connection attempt process. As a
>> whole v2 is fine, and addresses your issue.
>> --
>> Michael
>
> Thanks for the detailed explanation.
>
> Now I see that, based on the original discussion you pointed out, as
> long as v2 clears conninfo before setting ready_to_display to true, it
> is okay to do that earlier while the state is still CONNECTING. On that
> point, I’m good with v2.
>
> I’m still not fully convinced about displaying the *_time fields, but I
> don’t have a stronger argument either, so I’m fine with that. Maybe we
> can add a brief description in the doc like the attached diff?
>
> Overall, v2 looks good to me now.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
> <nocfbot_monitoring.sgml.diff>

I spent more time here, and found that it is still possible to leak
conninfo in the WAL receiver reuse path:

* WalRcvWaitForStartPosition() sets the state to WALRCV_WAITING.
* Then RequestXLogStreaming() copies raw conninfo into walrcv->conninfo
  and sets the state to WALRCV_RESTARTING.
* WalRcvWaitForStartPosition() then moves the state to
  WALRCV_CONNECTING, but this path does not clear walrcv->conninfo
  again.

The attached nocfbot_test.diff demonstrates the leak.

Initially I thought we could also set ready_to_display to false when
setting the state to WALRCV_WAITING in WalRcvWaitForStartPosition(), and
set it back to true when switching back to WALRCV_CONNECTING. However,
that would make the WALRCV_WAITING and WALRCV_RESTARTING states
invisible in pg_stat_wal_receiver.

I ended up with a solution that copies the primary connection info to
walrcv->conninfo only when RequestXLogStreaming() is switching to
WALRCV_STARTING. In the WALRCV_WAITING reuse path, the WAL receiver
keeps using the existing wrconn, so it does not need raw conninfo to be
copied into shared memory again. See the attached
nocfbot_walreceiverfuncs.c.diff.

With that change, the new test passes. I also ran "make check-world"
successfully.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
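
For readers without the attachments, the reuse-path leak and the proposed fix can be sketched as a standalone toy model. This is not the attached patch or the real walreceiver code: the Toy* types, the toy_* functions, and the copy_only_on_start flag are all hypothetical names made up for illustration, there is no locking, and the "clear" step only stands in for what v2 is described as doing after connecting.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAXCONNINFO 1024

/* Hypothetical stand-ins for the WALRCV_* states named in this thread. */
typedef enum
{
    TOY_STOPPED,
    TOY_STARTING,
    TOY_CONNECTING,
    TOY_WAITING,
    TOY_RESTARTING
} ToyState;

/* Toy model of the shared fields discussed here; no spinlock, no latch. */
typedef struct
{
    ToyState state;
    char     conninfo[TOY_MAXCONNINFO];
} ToyWalRcv;

/*
 * Toy request path. With copy_only_on_start = false, the raw conninfo is
 * copied on every request (the behavior described as leaking). With true,
 * it is copied only on the STOPPED -> STARTING transition (the proposal).
 */
static void
toy_request_streaming(ToyWalRcv *walrcv, const char *conninfo,
                      bool copy_only_on_start)
{
    bool starting = (walrcv->state == TOY_STOPPED);

    assert(walrcv->state == TOY_STOPPED || walrcv->state == TOY_WAITING);

    if (starting || !copy_only_on_start)
    {
        strncpy(walrcv->conninfo, conninfo ? conninfo : "",
                TOY_MAXCONNINFO - 1);
        walrcv->conninfo[TOY_MAXCONNINFO - 1] = '\0';
    }
    walrcv->state = starting ? TOY_STARTING : TOY_RESTARTING;
}

/* Replays the reuse path; returns true if raw conninfo ends up visible. */
static bool
toy_reuse_path_leaks(bool copy_only_on_start)
{
    ToyWalRcv   walrcv = {TOY_STOPPED, ""};
    const char *raw = "host=primary password=secret";

    /* First start: raw conninfo is copied, then cleared once CONNECTING. */
    toy_request_streaming(&walrcv, raw, copy_only_on_start);
    walrcv.state = TOY_CONNECTING;
    walrcv.conninfo[0] = '\0';

    /* Reuse path: WAITING -> RESTARTING -> CONNECTING, no second clear. */
    walrcv.state = TOY_WAITING;
    toy_request_streaming(&walrcv, raw, copy_only_on_start);
    walrcv.state = TOY_CONNECTING;

    return strstr(walrcv.conninfo, "password=") != NULL;
}
```

In this model, toy_reuse_path_leaks(false) reports the leak and toy_reuse_path_leaks(true) does not, which mirrors the argument above: the reuse path keeps the existing connection, so it has no need for the raw string in shared memory.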