Re: Changing shared_buffers without restart

From: Jim Nasby <jnasby@upgrade.com>

To: Dmitry Dolgov <9erthalion6@gmail.com>

Cc: Tomas Vondra <tomas@vondra.me>, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, Thomas Munro <thomas.munro@gmail.com>, pgsql-hackers@postgresql.org, Jack Ng <Jack.Ng@huawei.com>, Ni Ku <jakkuniku@gmail.com>

Date: 2025-07-14T22:55:13Z

Lists: pgsql-hackers

Commits

Remove PG_MMAP_FLAGS from mem.h
- c100340729b6 19 (unreleased) landed
Improve runtime and output of tests for replication slots checkpointing.
- 4464fddf7b50 18.0 cited
Revert support for improved tracking of nested queries
- f85f6ab051b7 18.0 cited
Use exported symbols list on macOS for loadable modules as well
- 3feff3916ee1 18.0 cited
Add support for basic NUMA awareness
- 65c298f61fc7 18.0 cited
Avoid unnecessary copying of a string in pg_restore.c
- 5e1915439085 18.0 cited
aio: Infrastructure for io_method=worker
- 55b454d0e140 18.0 cited
Improve InitShmemAccess() prototype
- 2a7b2d97171d 18.0 landed

On Fri, Jul 4, 2025 at 9:42 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

> > On Fri, Jul 04, 2025 at 02:06:16AM +0200, Tomas Vondra wrote:
>

...

> > 10) what to do about stuck resize?
> >
> > AFAICS the resize can get stuck for various reasons, e.g. because it
> > can't evict pinned buffers, possibly indefinitely. Not great, it's not
> > clear to me if there's a way out (canceling the resize) after a timeout,
> > or something like that? Not great to start an "online resize" only to
> > get stuck with all activity blocked for indefinite amount of time, and
> > get to restart anyway.
> >
> > Seems related to Thomas' message [2], but AFAICS the patch does not do
> > anything about this yet, right? What's the plan here?
>
> It's another open discussion right now, with an idea to eventually allow
> canceling after a timeout. I think canceling when stuck on buffer
> eviction should be pretty straightforward (the eviction must take place
> before actual shared memory resize, so we know nothing has changed yet),
> but in some other failure scenarios it would be harder (e.g. if one
> backend is stuck resizing, while others have succeeded -- this would
> require another round of synchronization and some way to figure out what
> is the current status).
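
To make the "cancel while stuck on eviction" case concrete, here is a minimal sketch in Python. It is a toy model, not code from the patch: `Buffer`, `try_shrink`, and the timeout parameters are hypothetical names invented for illustration. It only shows why cancelling is safe in this phase: the pool is truncated strictly after every buffer in the doomed range has been evicted, so a timeout leaves the size untouched.

```python
import time
from dataclasses import dataclass


@dataclass
class Buffer:
    """Toy stand-in for a shared buffer (hypothetical, for illustration)."""
    pinned: bool = False
    valid: bool = True


def try_shrink(buffers, new_size, timeout_s, poll_s=0.01):
    """Try to shrink `buffers` to `new_size` entries.

    Returns True if the pool was shrunk, False if the attempt was
    cancelled because pinned buffers remained when the timeout expired.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        doomed = buffers[new_size:]
        # Phase 1: evict whatever can be evicted in the range going away.
        for buf in doomed:
            if not buf.pinned:
                buf.valid = False
        if not any(buf.pinned for buf in doomed):
            # Phase 2: the actual resize, reached only after full eviction.
            del buffers[new_size:]
            return True
        if time.monotonic() >= deadline:
            # Cancel: the pool size is unchanged. Buffers evicted so far
            # only lost cached contents, which can be read in again.
            return False
        time.sleep(poll_s)


pool = [Buffer() for _ in range(8)]
pool[6].pinned = True
shrunk = try_shrink(pool, 4, timeout_s=0.05)   # gives up, pool keeps 8 entries
```

The harder case described above (one backend stuck while others have already switched) is deliberately not modelled here, since it needs a second round of synchronization.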

From a user standpoint, I would expect any kind of resize like this to be
an online operation that happens in the background. If this is driven by a
GUC I don't see how it could be anything else, but if something else is
decided on I think it'd just be a pain to require a session to stay connected
until a resize was complete. (Of course we'd need to provide some means of
monitoring a resize that was in progress, perhaps via a pg_stat_progress
view or a system function.)

Also, while I haven't fully followed the discussion about how to synchronize
backends, I will say that I don't think it's at all unreasonable if a
resize doesn't take full effect until every backend has at minimum ended
any running transaction, or potentially even returned to the
equivalent of `PostgresMain()` for that type of backend. Obviously it'd be
nicer to be more responsive than that, but I don't think the first version
of the feature has to accomplish that.
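
As a rough illustration of that "takes effect once every backend reaches a safe point" model, here is a small Python sketch. Everything in it is an assumption made for illustration (`ResizeCoordinator`, `Backend`, the generation counter, `progress()`); it is not how the patch synchronizes backends. Each backend adopts a newly published target only when it is outside a transaction, and the new size becomes effective once all backends have done so. The `progress()` method stands in for whatever monitoring view or function would be exposed.

```python
class ResizeCoordinator:
    """Publishes a target size and tracks which backends have adopted it."""

    def __init__(self, size):
        self.effective_size = size   # size every backend currently agrees on
        self.target_size = size      # size requested by the latest resize
        self.generation = 0          # bumped on every resize request
        self.backends = []

    def request_resize(self, new_size):
        self.target_size = new_size
        self.generation += 1
        # Stand-in for signalling: idle backends adopt immediately,
        # backends inside a transaction ignore the nudge for now.
        for backend in self.backends:
            backend.safe_point()
        self.maybe_finish()

    def progress(self):
        """Return (backends that adopted the target, total backends)."""
        done = sum(b.seen_generation == self.generation for b in self.backends)
        return done, len(self.backends)

    def maybe_finish(self):
        done, total = self.progress()
        if done == total:
            self.effective_size = self.target_size


class Backend:
    def __init__(self, coord):
        self.coord = coord
        self.seen_generation = coord.generation
        self.in_transaction = False
        coord.backends.append(self)

    def begin(self):
        self.in_transaction = True

    def commit(self):
        self.in_transaction = False
        self.safe_point()

    def safe_point(self):
        # Adopt the new target only outside a transaction.
        if self.in_transaction:
            return
        if self.seen_generation != self.coord.generation:
            self.seen_generation = self.coord.generation
            self.coord.maybe_finish()


coord = ResizeCoordinator(128)
idle, busy = Backend(coord), Backend(coord)
busy.begin()
coord.request_resize(256)      # idle adopts now; busy is mid-transaction
pending = coord.progress()     # one of two backends has adopted so far
busy.commit()                  # last backend reaches a safe point
```

The requesting session does not have to wait in this model: it publishes the target and returns, and completion is observable through the progress call.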

For that matter, I also feel it'd be fine if the first version didn't even
support shrinking shared buffers.

Finally, while shared buffers is the most visible target here, there are
other shared memory settings that have a *much* smaller surface area, and
in my experience are going to be much more valuable from a tuning
perspective; notably wal_buffers and the MXID SLRUs (and possibly CLOG and
subtrans). I say that because unless you're running a workload that
entirely fits in shared buffers, or shared buffers that are *really* small
compared to system memory, increasing shared buffers quickly gets into
diminishing returns. But since the default size for the other fixed-size
areas is so much smaller than normal values for shared_buffers, increasing
those areas can have a much, much larger impact on performance. (Especially
for something like the MXID SLRUs.) I would certainly consider focusing on
one of those areas before trying to tackle shared buffers.