Re: failed NUMA pages inquiry status: Operation not permitted
Tomas Vondra <tomas@vondra.me> — 2025-12-16T15:17:51Z
On 12/16/25 15:48, Christoph Berg wrote:
> Re: To Tomas Vondra
>> I've managed to reproduce it once, running this loop on
>> 18-as-of-today. It errored out after a few 100 iterations:
>>
>> while psql -c 'SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa'; do :; done
>>
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres STATEMENT: SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa
>>
>> That was on the apt.pg.o amd64 build machine while a few things were
>> just building. Maybe ENOENT "The page is not present" means something
>> was just swapped out because the machine was under heavy load.
>
> I played a bit more with it.
>
> * It seems to trigger only once for a running cluster. The next one
>   needs a restart
> * If it doesn't trigger within the first 30s, it probably never will
> * It seems easier to trigger on a system that is under load (I started
>   a few pgmodeler compile runs in parallel (C++))
>
> But none of that answers the "why".
>

Hmmm, so this is interesting. I tried this on my workstation (with a
single NUMA node), and I see this:

1) right after opening a connection, I get this

  test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
   numa_node | count
  -----------+-------
           0 |   290
          -2 | 32478
  (2 rows)

2) but a select from pg_shmem_allocations_numa works fine

  test=# select numa_node, count(*) from pg_shmem_allocations_numa group by 1;
   numa_node | count
  -----------+-------
           0 |    72
  (1 row)

3) and if I repeat the pg_buffercache_numa query, it now works

  test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
   numa_node | count
  -----------+-------
           0 | 32768
  (1 row)

That's a bit strange. I have no idea why this is happening. If I
reconnect, I start getting the failures again.

regards

-- 
Tomas Vondra
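
For reference, the -2 values above can be read using the convention documented in move_pages(2), assuming that is where these node ids come from: a non-negative status is a NUMA node number, and a negative status is a negated errno, so -2 would be -ENOENT ("The page is not present"). A minimal sketch of decoding such values follows. It assumes Linux errno numbering, and describe_numa_status is a hypothetical helper written for illustration, not part of PostgreSQL or libnuma.

```python
import errno
import os


def describe_numa_status(status: int) -> str:
    """Describe one per-page status value in the move_pages(2) convention.

    Hypothetical helper: non-negative values are node ids,
    negative values are negated errno codes.
    """
    if status >= 0:
        return f"page resides on NUMA node {status}"
    code = -status
    # Look up the symbolic name (e.g. ENOENT) and the system's message text.
    name = errno.errorcode.get(code, "unknown errno")
    return f"no node reported: {name} ({os.strerror(code)})"


# The two values seen in the pg_buffercache_numa output above.
for status in (0, -2):
    print(status, "->", describe_numa_status(status))
```

Read this way, the error in the quoted log (a node id of -2 outside the range [0, 0]) and the 32478 buffers reported with numa_node -2 would both be the same "page not present" status, rather than an actual node number.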