Changing shared_buffers without restart
Dmitry Dolgov <9erthalion6@gmail.com>
From: Dmitry Dolgov <9erthalion6@gmail.com>
To: pgsql-hackers@postgresql.org
Date: 2024-10-18T19:21:19Z
Lists: pgsql-hackers
Commits
- c100340729b6 (19, unreleased; landed): Remove PG_MMAP_FLAGS from mem.h
- 4464fddf7b50 (18.0; cited): Improve runtime and output of tests for replication slots checkpointing.
- f85f6ab051b7 (18.0; cited): Revert support for improved tracking of nested queries
- 3feff3916ee1 (18.0; cited): Use exported symbols list on macOS for loadable modules as well
- 65c298f61fc7 (18.0; cited): Add support for basic NUMA awareness
- 5e1915439085 (18.0; cited): Avoid unnecessary copying of a string in pg_restore.c
- 55b454d0e140 (18.0; cited): aio: Infrastructure for io_method=worker
- 2a7b2d97171d (18.0; landed): Improve InitShmemAccess() prototype
TL;DR: A PoC for changing shared_buffers without a PostgreSQL restart, by
changing the shared memory mapping layout. Any feedback is appreciated.
Hi,
Being able to change PostgreSQL configuration on the fly is an important
property for performance tuning, since it reduces the feedback time and
invasiveness of the process. In certain cases it even becomes highly desirable,
e.g. when doing automatic tuning. But there are a couple of important
configuration options that cannot be modified without a restart; the most
notorious example is shared_buffers.
I've recently been working on an idea for how to change that and allow modifying
shared_buffers without a restart. To demonstrate the approach, I've prepared a
PoC that ignores a lot of things but works in the limited set of use cases I have
tested. I would like to discuss the idea and get some feedback.
Patches 1-3 prepare the infrastructure and the shared memory layout. They could be
useful even with a multithreaded PostgreSQL, where there would be no need for
shared memory. I assume that in the multithreaded world there will still be a need
for a contiguous chunk of memory to share between threads, and its layout would
be similar to the one with shared memory mappings.
Patch 4 does the actual resizing. It is specific to shared memory, of course, and
uses the Linux-specific mremap, which leaves portability questions open.
Patch 5 is somewhat independent, but quite convenient to have. It also uses the
Linux-specific memfd_create call.
The patch set still leaves many things unaddressed, e.g. shared memory segment
detach/reattach and portability; it also doesn't touch the EXEC_BACKEND code or
huge pages.
So far I have done only rudimentary testing: spinning up PostgreSQL, then
increasing shared_buffers and running pgbench with a scale factor large
enough to extend the data set into the newly allocated buffers:
-- shared_buffers 128 MB
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
          134 |          16250 |             1 |              0
-- change shared_buffers to 512 MB
=# select pg_reload_conf();
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
          221 |          65315 |             1 |              0
-- round of pgbench read-only load
=# SELECT * FROM pg_buffercache_summary();
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned
--------------+----------------+---------------+----------------
        41757 |          23779 |           216 |              0
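As a sanity check on the numbers above: assuming the default 8 kB block size (an assumption, the message does not state it), 128 MB corresponds to 16384 buffers and 512 MB to 65536, which matches buffers_used + buffers_unused in each summary. A minimal sketch of that arithmetic, where buffers_for_megabytes is a hypothetical helper and not a PostgreSQL function:

```c
#include <stddef.h>

/* Assumed block size in bytes (PostgreSQL's default BLCKSZ); not stated in the message. */
#define ASSUMED_BLCKSZ 8192

/* Hypothetical helper: number of buffers that fit into a shared_buffers
 * setting given in megabytes. */
size_t
buffers_for_megabytes(size_t megabytes)
{
    return megabytes * 1024 * 1024 / ASSUMED_BLCKSZ;
}
```

With these assumptions, buffers_for_megabytes(128) equals 134 + 16250 and buffers_for_megabytes(512) equals both 221 + 65315 and 41757 + 23779.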
Here is the breakdown:
v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch
Preparation: introduces the possibility to work with multiple shmem mappings. To
make it less invasive, I've duplicated the shmem API to extend it with a
shmem_slot argument, while redirecting the original API to it. There are
probably better ways of doing that; I'm open to suggestions.
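To illustrate the "duplicate the API and redirect the original" idea, here is a toy sketch. It is not taken from the patch: all names (ShmemSlotSketch, shmem_alloc_in_slot, shmem_alloc, MAIN_SLOT) are hypothetical, and static arrays stand in for real shared memory mappings.

```c
#include <stddef.h>

#define N_SLOTS   2      /* hypothetical number of mappings */
#define MAIN_SLOT 0      /* hypothetical slot used by the original API */
#define SLOT_SIZE 4096   /* toy size of each slot */

/* Toy stand-in for one shared memory mapping with a bump allocator. */
typedef struct ShmemSlotSketch
{
    unsigned char *base;
    size_t         size;
    size_t         used;
} ShmemSlotSketch;

static unsigned char slot_storage[N_SLOTS][SLOT_SIZE];
static ShmemSlotSketch slots[N_SLOTS] = {
    {slot_storage[0], SLOT_SIZE, 0},
    {slot_storage[1], SLOT_SIZE, 0},
};

/* Extended API: the caller names the slot to allocate from. */
void *
shmem_alloc_in_slot(int slot, size_t size)
{
    ShmemSlotSketch *s;
    size_t           rounded;
    void            *result;

    if (slot < 0 || slot >= N_SLOTS)
        return NULL;
    s = &slots[slot];
    rounded = (size + 7) & ~(size_t) 7;   /* round the size up to 8 bytes */
    if (rounded < size || rounded > s->size - s->used)
        return NULL;                      /* overflow or slot exhausted */
    result = s->base + s->used;
    s->used += rounded;
    return result;
}

/* Original API: unchanged signature, redirected to the main slot. */
void *
shmem_alloc(size_t size)
{
    return shmem_alloc_in_slot(MAIN_SLOT, size);
}
```

Existing callers of the original function keep working unchanged, while new code can target a specific slot.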
v1-0002-Allow-placing-shared-memory-mapping-with-an-offse.patch
Implements a new layout of shared memory mappings that leaves room for resizing.
I've done a couple of tests to verify that the space in between doesn't affect
how the kernel calculates actually used memory, to make sure that e.g. a cgroup
limit will not trigger the OOM killer. The only change seems to be in VmPeak,
which is the total of mapped pages.
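One common way to leave room for growth behind a mapping is to reserve address space with an inaccessible mapping and place the usable mapping at its start. The sketch below shows that technique under stated assumptions; it is not necessarily how the patch lays out its mappings, and reserve_with_room is a hypothetical name. It is Linux-oriented (MAP_ANONYMOUS, MAP_NORESERVE).

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `reserved` bytes of address space and make
 * only the first `used` bytes accessible as a shared anonymous mapping.
 * Returns the base address, or NULL on failure.
 */
void *
reserve_with_room(size_t used, size_t reserved)
{
    void *base;
    void *mapped;

    if (used == 0 || used > reserved)
        return NULL;

    /* Inaccessible reservation: pages are never touched, so no memory is used. */
    base = mmap(NULL, reserved, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Replace the start of the reservation with the real shared mapping. */
    mapped = mmap(base, used, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (mapped == MAP_FAILED)
    {
        munmap(base, reserved);
        return NULL;
    }
    return base;
}
```

The reserved tail increases the virtual size of the process but is not expected to add resident memory as long as it is never touched.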
v1-0003-Introduce-multiple-shmem-slots-for-shared-buffers.patch
Splits shared_buffers into multiple slots, moving structures that depend on
NBuffers out into separate mappings. There are two large gaps here:
* The shmem size calculation for those mappings is not correct yet; it includes too
many other things (no particular issues here, I just haven't had time).
* It makes hardcoded assumptions about the upper limit for resizing,
which is currently low purely for experiments. Ideally there should be a new
configuration option to specify the total available memory, which would be the
basis for subsequent calculations.
v1-0004-Allow-to-resize-shared-memory-without-restart.patch
Performs the shared_buffers change without a restart. The current approach is clumsy: it adds
an assign hook for shared_buffers and goes from there, using mremap to resize the
mappings. I haven't immediately found a better approach, though. Currently it
supports only increasing shared_buffers.
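For readers unfamiliar with mremap, a minimal Linux-only sketch of growing a mapping in place follows. It is an illustration under assumptions, not the patch's code: grow_mapping_in_place is a hypothetical name, and the sketch only resizes the mapping itself. For a file-backed mapping the file would also have to be extended (e.g. with ftruncate) before the new range can be accessed.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: grow an existing mapping from old_size to new_size
 * without moving it. Mirrors the "increase only" limitation: shrinking or
 * keeping the size is rejected. Returns the (unchanged) address on success,
 * NULL on failure.
 */
void *
grow_mapping_in_place(void *addr, size_t old_size, size_t new_size)
{
    void *result;

    if (new_size <= old_size)
        return NULL;

    /*
     * flags = 0 means MREMAP_MAYMOVE is not set, so the kernel must either
     * extend the mapping at the same address or fail. Other processes that
     * hold pointers into the mapping therefore stay valid.
     */
    result = mremap(addr, old_size, new_size, 0);
    if (result == MAP_FAILED)
        return NULL;
    return result;
}
```

In-place growth can only succeed if the address range right after the mapping is free, which is why the layout needs to leave room between mappings.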
v1-0005-Use-anonymous-files-to-back-shared-memory-segment.patch
Allows an anonymous file to back a shared mapping. This makes certain things
easier, e.g. the visual representation of the mappings, and provides an fd for possible
future customizations.
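A minimal Linux-only sketch of backing a shared mapping with an anonymous file created by memfd_create is shown below. The helper name map_anonymous_file and the choice of MFD_CLOEXEC are assumptions for illustration, not details from the patch.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an anonymous file of `size` bytes and map it
 * as shared memory. On success returns the mapping and stores the file
 * descriptor in *fd_out; on failure returns NULL.
 */
void *
map_anonymous_file(const char *name, size_t size, int *fd_out)
{
    void *mapped;
    int   fd;

    /* The name is only a label; it shows up in /proc/<pid>/maps. */
    fd = memfd_create(name, MFD_CLOEXEC);
    if (fd < 0)
        return NULL;

    /* A new memfd has size zero, so give it the desired size first. */
    if (ftruncate(fd, (off_t) size) != 0)
    {
        close(fd);
        return NULL;
    }

    mapped = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mapped == MAP_FAILED)
    {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return mapped;
}
```

Such a mapping is listed in /proc/<pid>/maps under a /memfd:<name> entry, which makes the individual mappings easy to tell apart, and the returned fd can later be passed to calls such as ftruncate.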
In this thread I'm hoping to answer the following questions:
* Are there any concerns about this approach?
* What would be a better mechanism to handle resizing than an assign hook?
* Assuming I'm able to address the already known missing bits, what are the
chances that the patch series could be accepted?