Re: Non-blocking archiver process
Patrick Stählin <me@packi.ch>
From: Patrick Stählin <me@packi.ch>
To: Noah Misch <noah@leadboat.com>, Ronan Dunklau <ronan.dunklau@aiven.io>
Cc: pgsql-hackers@lists.postgresql.org
Date: 2025-07-27T15:45:35Z
Lists: pgsql-hackers
Attachments
- 0001-Check-for-interrupts-during-archive_command.patch (text/x-patch) patch 0001
On 05.07.25 05:01, Noah Misch wrote:
> On Fri, Jul 04, 2025 at 08:46:08AM +0200, Ronan Dunklau wrote:
>> We've noticed a behavior that seems surprising to us.
>> Since DROP DATABASE now waits for a ProcSignalBarrier, it can hang up
>> indefinitely if the archive_command hangs.
>>
>> The reason for this is that the builtin archive module doesn't process any
>> interrupts while the archiving command is running, as it's run with a system()
>> call, blocking indefinitely.
>>
>> Before rushing on to implement a non-blocking archive library (perhaps using
>> popen or posix_spawn, while keeping other systems in mind), what unintended
>> consequences would it have to actually run the archive_command in a
>> non-blocking way, and checking interrupts while it runs?
>
> I can't think of any unintended consequences. I think we just missed this
> when adding the first use of ProcSignalBarrier (v15). Making this easier to
> miss, archiver spent most of its history not connecting to shared memory. Its
> shared memory connection appeared in v14.

I've taken some time, mostly for WIN32, to implement an interruptible
version of archive_command. My WIN32 days are long behind me, so it's
quite possible that this has some faults I'm not seeing. Then again, it
passes CI.

I failed to make it work on WIN32 with popen, since the handles it
returns can't be made non-blocking, so this change is a bit bigger.

@Ronan: Let me know if you'd like to be attributed more; I took some
inspiration from a private repo with your prototype.

I don't know if I should add this to the running commitfest for PG19 or
if this is something that would need to be backported. Just let me know.

Thanks,
Patrick
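
For readers following the thread, the general idea discussed above (start the command without blocking in system(), then poll for its exit while checking for pending interrupts) can be sketched on POSIX systems roughly as follows. This is only an illustration and not the code in the attached patch: the function name run_command_interruptible, the stop_requested flag, and the 10 ms poll interval are all assumptions made for this example, and the WIN32 side is not covered.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char **environ;

/*
 * Hypothetical helper: run "command" via /bin/sh without blocking in
 * system().  "stop_requested" stands in for whatever interrupt state the
 * caller tracks (it may be NULL).  Returns the child's wait status, or -1
 * if the spawn failed or the command was abandoned.
 */
static int
run_command_interruptible(const char *command,
                          volatile sig_atomic_t *stop_requested)
{
    pid_t           pid;
    int             status;
    char *const     argv[] = {"sh", "-c", (char *) command, NULL};
    struct timespec delay = {0, 10 * 1000 * 1000};  /* 10 ms, arbitrary */

    assert(command != NULL);

    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    for (;;)
    {
        /* WNOHANG: returns 0 immediately if the child is still running. */
        pid_t       r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return status;
        if (r < 0 && errno != EINTR)
            return -1;

        /*
         * This is the point where a real archiver would service pending
         * interrupts.  Here we only model a request to stop.
         */
        if (stop_requested != NULL && *stop_requested)
        {
            kill(pid, SIGTERM);
            while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
                ;               /* retry until the child is reaped */
            return -1;
        }

        nanosleep(&delay, NULL);
    }
}
```

A caller would pass the archive command string and a flag set by its signal handlers, and treat a return value of -1 as "not archived, retry later". Polling with a short sleep is the simplest portable choice for a sketch; a real implementation would more likely wait on a latch or similar wakeup primitive instead.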