Re: Requested WAL segment xxx has already been removed
From: Japin Li <japinli@hotmail.com>
To: Alexander Kukushkin <cyberdemn@gmail.com>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-07-15T10:07:56Z
Lists: pgsql-hackers
On Tue, 15 Jul 2025 at 11:24, Alexander Kukushkin <cyberdemn@gmail.com> wrote:
> Hi,
>
> On Mon, 14 Jul 2025 at 11:24, Japin Li <japinli@hotmail.com> wrote:
>
> > The configuration is as expected. My test script simulates two distinct hosts
> > by utilizing local archive storage.
> >
> > For physical replication across distinct hosts without shared WAL archive
> > storage, WALs are archived locally (in my test).
> >
> > When the primary's walsender needs a WAL file from the archive that's not in
> > its pg_wal directory, manual copying is required to the primary's pg_wal or the
> > standby's pg_wal (or its archive directory, and use restore_command to fetch it).
> >
> > What prevents us from using the primary's restore_command to retrieve the
> > necessary WALs?
>
> I am just talking about the practical side of local archive storage.
>

Yes, it's quite niche in its usage.

> Such archives will be gone along with the server in case of disaster and therefore they bring only a little value.
> With the same success, physical standby can use restore_command to copy files from the archive on the primary via
> ssh/rsync or similar. This approach is used for ages and works just fine.
>

However, some environments might prohibit password-free scp or the use of shared directories.

> What is really painful right now, logical walsenders can only look into pg_wal, and unfortunately replication slots don't
> give 100% guarantee for WAL retention because of max_slot_wal_keep_size.
> That is, using restore_command for logical walsenders would be really helpful and solve some problems and pain points
> with logical replication.
>

I agree; logical walsenders offer greater value than physical ones.

> However, if we start calling restore_command also for physical walsenders it might result in increased resource usage on
> primary without providing much additional value. For example, restore_command is failing, but standby indefinitely
> continues making replication connection attempts.
>

IIRC, the standby will indefinitely attempt to connect for replication, even without restore_command configured.

> I don't mind if it will also work for physical replication, but IMO there should be a possibility to opt out from it.
>

-- 
Regards,
Japin Li
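
For illustration only, the lookup order discussed in this thread (check pg_wal first, then fall back to restore_command, with a way to opt out) could be sketched roughly as below. This is not PostgreSQL code and not a proposed patch: the function name `locate_wal_segment` and the `allow_restore` switch are hypothetical, and the sketch only borrows the documented `%f` (file name) and `%p` (destination path) placeholders of restore_command. It restores straight into pg_wal and ignores `%%` escaping, which a real implementation would have to handle differently.

```python
import shlex
import subprocess
from pathlib import Path
from typing import Optional


def locate_wal_segment(
    segment: str,
    pg_wal: Path,
    restore_command: Optional[str],
    allow_restore: bool = True,  # hypothetical opt-out switch
) -> Optional[Path]:
    """Return the path of a WAL segment, or None if it cannot be found."""
    target = pg_wal / segment

    # 1. The segment is still present in pg_wal: nothing else to do.
    if target.exists():
        return target

    # 2. Fallback disabled or not configured: the caller would report
    #    "requested WAL segment ... has already been removed".
    if not allow_restore or not restore_command:
        return None

    # 3. Substitute %f and %p, then run the command through the shell.
    cmd = restore_command.replace("%f", shlex.quote(segment))
    cmd = cmd.replace("%p", shlex.quote(str(target)))
    result = subprocess.run(cmd, shell=True)

    # A non-zero exit status means the archive does not have the segment.
    if result.returncode == 0 and target.exists():
        return target
    return None
```

With `allow_restore=False` the function behaves like today's walsender and only looks in pg_wal, which corresponds to the opt-out asked for above.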