Thread

Commits


Fix background worker not restarting after crash-and-restart cycle.
- 75f633f54aaa 18.0 landed
- b5d084c5353f 19 (unreleased) landed

Fix background workers not restarting with restart_after_crash = on

Andrey Rudometov <unlimitedhikari@gmail.com> — 2025-06-11T08:26:01Z

Good day, hackers.

Reading through the changes committed to master, I noticed that after the
CleanupBackend/CleanupBackgroundWorker refactoring, background workers fail to
start again after postgres restarts with restart_after_crash = on.

The reason is that CleanupBackend and HandleChildCrash no longer set the
background worker's rw_pid to zero if the backend, well, crashed and failed to
call shmem_exit and mark its PMChild slot as inactive via
MarkPostmasterChildInactive.

The suggested solution is to finish CleanupBackend's background-worker-related
logic even after treating the child process as crashed. In earlier versions the
zeroing of pids happened in HandleChildCrash anyway, so there should be no harm
in doing the same here.
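
To illustrate the control flow described above, here is a minimal toy model in
C. It is not the postmaster code: ToyWorker and the toy_* functions are
hypothetical names made up for this sketch, and only the rw_pid field name is
taken from the report. It assumes a worker is considered startable only while
its recorded pid is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered background worker entry. */
typedef struct ToyWorker
{
    int rw_pid;     /* assumption: 0 means "not running, may be started" */
} ToyWorker;

/* Problematic shape: the crash path returns before the worker bookkeeping. */
void
toy_cleanup_buggy(ToyWorker *w, bool crashed)
{
    assert(w != NULL);
    if (crashed)
        return;     /* rw_pid still holds the dead child's pid */
    w->rw_pid = 0;
}

/* Suggested shape: finish the worker bookkeeping on the crash path as well. */
void
toy_cleanup_fixed(ToyWorker *w, bool crashed)
{
    assert(w != NULL);
    (void) crashed; /* crash handling itself is outside this toy model */
    w->rw_pid = 0;
}

/* Toy launcher check: a worker with a nonzero pid is treated as running. */
bool
toy_can_start(const ToyWorker *w)
{
    assert(w != NULL);
    return w->rw_pid == 0;
}
```

In this model, a worker cleaned up with toy_cleanup_buggy(&w, true) keeps its
stale pid and toy_can_start(&w) stays false, while toy_cleanup_fixed(&w, true)
makes it startable again.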

For fast reproduction I used the pg_prewarm extension, as it creates an
observable bgworker and is present in the postgres tree, so the TAP test is
easy to run.
-- 
best regards,
Andrey Rudometov

Re: Fix background workers not restarting with restart_after_crash = on

Fujii Masao <masao.fujii@gmail.com> — 2025-07-24T09:23:16Z

On Wed, Jun 11, 2025 at 5:26 PM Andrey Rudometov
<unlimitedhikari@gmail.com> wrote:
>
> Good day, hackers.
>
> Reading through the changes committed to master, I noticed that after the
> CleanupBackend/CleanupBackgroundWorker refactoring, background workers fail to
> start again after postgres restarts with restart_after_crash = on.
>
> The reason is that CleanupBackend and HandleChildCrash no longer set the
> background worker's rw_pid to zero if the backend, well, crashed and failed to
> call shmem_exit and mark its PMChild slot as inactive via
> MarkPostmasterChildInactive.
>
> The suggested solution is to finish CleanupBackend's background-worker-related
> logic even after treating the child process as crashed. In earlier versions the
> zeroing of pids happened in HandleChildCrash anyway, so there should be no harm
> in doing the same here.
>
> For fast reproduction I used the pg_prewarm extension, as it creates an
> observable bgworker and is present in the postgres tree, so the TAP test is
> easy to run.

Thanks for the report and patch! This same issue was also reported in
thread [1], where there's ongoing discussion about how to address it.

Regards,

[1] https://postgr.es/m/tencent_E00A056B3953EE6440F0F40F80EC30427D09@qq.com

-- 
Fujii Masao