Thread
-
PoC: VALGRIND_MAKE_MEM_NOACCESS for dynamic shared memory
Tomas Vondra <tomas@vondra.me> — 2026-05-04T08:18:31Z
Hi, A couple of days ago we got a report regarding an incorrect shmem size calculation in btree [1], leading to a buffer overflow / memory corruption. Which became a much less common issue in our code, thanks to valgrind and similar tools. But it took me a while to realize valgrind won't catch this because we only instrument private memory (palloc et al), while shmem is left alone. I was wondering if it's feasible to improve this. Attached is a trivial patch that adjusts shm_toc.c to add a couple NOACCESS bytes after each entry in the segment. It seems to do the trick - with this we get a reasonable report (for the reproducer provided in the bug report, before it got fixed by 748d871b7c) from valgrind, with invalid accesses. See the attached .log for an example. It's much better than the confusing crashes due to corrupted state. There's an issue, though. It seems the valgrind memory markings are not shared between processes. The leader sets the shm_toc up, marks the ranges as NOACCESS, and then checks it while accessing the memory. But the parallel workers don't seem to see this, and so will produce no reports. I'm assuming this is the case, because all the reports come from the leader, never from the workers. Maybe there's a different explanation (e.g. maybe it's just the leader touching the memory?). Still, it seems useful even with this limitation. If the leader participates, it will produce a nice report (unless the worker crashes first, because of the corrupted state). But we also have debug_parallel_query = regress in which case everything should run in a single process, and work just fine. Maybe that's good enough. An alternative would be to do mprotect(), but unfortunately it requires page-aligned ranges, which makes it somewhat useless for small overflows of a couple bytes (like here). It should work cross-process, I think, FWIW the PoC patch adds a 32-byte chunk, not just a single byte. This is intentional, because if the state is an array, it's quite possible the invalid access steps over a fair number of bytes. I'm actually thinking it should be even larger. This modifies just the dynamic shmem, but maybe we should do this even for the "regular" shmem allocated at start. Issues in that would likely cause crashes pretty quick (unlike the btree issue, which requires a somewhat special reproducer), but a nice valgrind report helps. regards [1] https://www.postgresql.org/message-id/CAGCUe0Lwk3C0qdkBa%2BOLpYc7yXwW%3Dpbaz8Sju4xMXEQAmyp%2B5g%40mail.gmail.com -- Tomas Vondra