RE: Changing shared_buffers without restart

Jack Ng <jack.ng@huawei.com>

From: Jack Ng <Jack.Ng@huawei.com>

To: Dmitry Dolgov <9erthalion6@gmail.com>

Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>, Robert Haas <robertmhaas@gmail.com>, Ni Ku <jakkuniku@gmail.com>

Date: 2025-05-06T04:23:07Z

Lists: pgsql-hackers

Commits

Remove PG_MMAP_FLAGS from mem.h
- c100340729b6 19 (unreleased) landed
Improve runtime and output of tests for replication slots checkpointing.
- 4464fddf7b50 18.0 cited
Revert support for improved tracking of nested queries
- f85f6ab051b7 18.0 cited
Use exported symbols list on macOS for loadable modules as well
- 3feff3916ee1 18.0 cited
Add support for basic NUMA awareness
- 65c298f61fc7 18.0 cited
Avoid unnecessary copying of a string in pg_restore.c
- 5e1915439085 18.0 cited
aio: Infrastructure for io_method=worker
- 55b454d0e140 18.0 cited
Improve InitShmemAccess() prototype
- 2a7b2d97171d 18.0 landed

Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected in various tests (sorry, I misunderstood some details initially).

I also want to report a couple of minor issues found during testing (which you may be aware of already):

1. For memory segments other than the first one ('main'), the start address passed to mmap may not be aligned to 4KB or the huge page size (since reserved_offset may not be aligned), which causes mmap to fail.

2. Since the ratios for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO are relatively small, I think we need to guard against the case where 'max_available_memory' is too small for the required sizes of these segments (from CalculateShmemSize).
For example, with max_available_memory=default and shared_buffers=128kB, 'main' still needs ~109MB, but only 10% of max_available_memory (~102MB) is reserved for it. Since the start address of the next segment is calculated from reserved_offset, the mappings overlap and cause memory problems later (I hit this after fixing 1.)
I suppose we can raise the minimum value of max_available_memory so that it is large enough, and perhaps also adjust the ratios in SHMEM_RESIZE_RATIO to ensure the reserved space for those segments is sufficient.
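
To make the two issues above concrete, here is a minimal sketch of the kind of check I have in mind. It is not code from the patch: SegmentPlan, plan_segments and align_down are hypothetical names, the shares are plain integer percentages, and the sizes in the usage note are illustrative. Each share is rounded down to the page size so that every start offset stays aligned, and the function reports the first segment whose required size exceeds its reserved share.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round value down to a multiple of align; align must be a power of two. */
static size_t
align_down(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

/* Hypothetical per-segment plan; not a structure from the patch. */
typedef struct SegmentPlan
{
    const char *name;
    unsigned    percent;    /* in: share of the total reservation */
    size_t      required;   /* in: bytes the segment needs at startup */
    size_t      offset;     /* out: start offset inside the reservation */
    size_t      reserved;   /* out: page-aligned bytes reserved for it */
} SegmentPlan;

/*
 * Lay the segments out back to back inside a reservation of 'total' bytes.
 * Every reserved size is a multiple of page_size, so every offset is
 * page-aligned as long as the reservation itself starts on a page boundary.
 *
 * Returns -1 if every segment fits, otherwise the index of the first
 * segment whose requirement is larger than its reserved share.
 */
static int
plan_segments(SegmentPlan *plans, int nplans, size_t total, size_t page_size)
{
    size_t      offset = 0;

    for (int i = 0; i < nplans; i++)
    {
        size_t      share = total / 100 * plans[i].percent;

        plans[i].offset = offset;
        plans[i].reserved = align_down(share, page_size);

        if (plans[i].required > plans[i].reserved)
            return i;           /* the caller should refuse to start */

        offset += plans[i].reserved;
    }
    return -1;
}
```

With a 1024MB reservation, a 10% share for 'main' and a 109MB requirement, this returns index 0 instead of letting the next segment start inside 'main'. For huge pages, the same code would take the huge page size as page_size.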

Regards,

Jack Ng

-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com> 
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku <jakkuniku@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>; pgsql-hackers@postgresql.org; Robert Haas <robertmhaas@gmail.com>
Subject: Re: Changing shared_buffers without restart

> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared 
> segment, the memory is freed immediately after the call, even if other 
> processes still have that memory mapped, and they will hit SIGBUS if 
> they try to access that memory again as the manpage says.
>
> So am I correct to think that, to support the bufferpool shrinking 
> case, it would not be safe to call ftruncate in AnonymousShmemResize 
> as-is, since at that point other processes may still be using pages 
> that belong to the truncated memory?
> It appears that for shrinking we should only call ftruncate when we're 
> sure no process will access those pages again (eg, all processes have 
> handled the resize interrupt signal barrier). I suppose this can be 
> done by the resize coordinator after synchronizing with all the other processes.
> But in that case it seems we cannot use the postmaster as the 
> coordinator then? b/c I see some code comments saying the postmaster 
> does not have waiting infrastructure... (maybe even if the postmaster 
> has waiting infra we don't want to use it anyway since it can be 
> blocked for a long time and won't be able to serve other requests).

There is already a coordination infrastructure, implemented in patch 0006, which will take care of this and prevent access to the shared memory until everything is resized.