RE: Changing shared_buffers without restart

From: Jack Ng <Jack.Ng@huawei.com>
To: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>, Robert Haas <robertmhaas@gmail.com>, Ni Ku <jakkuniku@gmail.com>
Date: 2025-05-06T04:23:07Z
Lists: pgsql-hackers

Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected in various tests (sorry, I misunderstood some details initially).

I also want to report a couple of minor issues found during testing (which you may be aware of already):

1. For memory segments other than the first one ('main'), the start address passed to mmap may not be aligned to 4kB or to the huge page size (since reserved_offset may not be aligned), which causes mmap to fail.

2. Since the ratios for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO are relatively small, I think we need to guard against the case where 'max_available_memory' is too small for the required sizes of these segments (from CalculateShmemSize).
For example, with the default max_available_memory and shared_buffers=128kB, 'main' still needs ~109MB, but only 10% of max_available_memory (~102MB) is reserved for it. Since the start address of the next segment is calculated from reserved_offset, the mappings overlap and cause memory problems later (I hit this after fixing 1.)
I suppose we can raise the minimum value of max_available_memory so that it is large enough, and may also adjust the ratios in SHMEM_RESIZE_RATIO to ensure the reserved space for those segments is sufficient.
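
To make both points concrete, here is a minimal sketch of the kind of check I have in mind. It is not code from the patch: align_up, align_down, SegmentPlan and plan_segments are hypothetical names, the per-segment share is expressed as a plain percentage rather than the real SHMEM_RESIZE_RATIO entries, and the page size is passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round value up to a multiple of align; align must be a power of two. */
static size_t
align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Round value down to a multiple of align; align must be a power of two. */
static size_t
align_down(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

/* Hypothetical description of one segment inside the reserved address space. */
typedef struct SegmentPlan
{
    size_t      required;   /* bytes the segment needs (e.g. from CalculateShmemSize) */
    unsigned    percent;    /* share of the reservation; shares assumed to sum to <= 100 */
    size_t      offset;     /* out: page-aligned start offset */
    size_t      reserved;   /* out: page-aligned space set aside */
} SegmentPlan;

/*
 * Lay out nsegs segments within total_reserved bytes. Returns false as soon
 * as one segment's share is smaller than what it requires, instead of letting
 * it spill into the next segment's range.
 */
static bool
plan_segments(SegmentPlan *segs, int nsegs, size_t total_reserved, size_t page_size)
{
    size_t      offset = 0; /* stays aligned: it is a sum of aligned sizes */

    for (int i = 0; i < nsegs; i++)
    {
        size_t      share = total_reserved / 100 * segs[i].percent;

        segs[i].offset = offset;
        segs[i].reserved = align_down(share, page_size);

        if (segs[i].reserved < align_up(segs[i].required, page_size))
            return false;

        offset += segs[i].reserved;
    }
    return true;
}
```

With a reservation of 1GB (my inference from 10% being ~102MB), a 10% share and a 109MB requirement, plan_segments returns false, which is the case described in 2. Passing the huge page size as page_size instead of 4096 keeps every start offset suitable for mmap in the huge page case as well.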

Regards,

Jack Ng

-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com> 
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku <jakkuniku@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>; pgsql-hackers@postgresql.org; Robert Haas <robertmhaas@gmail.com>
Subject: Re: Changing shared_buffers without restart

> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared 
> segment, the memory is freed immediately after the call, even if other 
> processes still have that memory mapped, and they will hit SIGBUS if 
> they try to access that memory again as the manpage says.
>
> So am I correct to think that, to support the bufferpool shrinking 
> case, it would not be safe to call ftruncate in AnonymousShmemResize 
> as-is, since at that point other processes may still be using pages 
> that belong to the truncated memory?
> It appears that for shrinking we should only call ftruncate when we're 
> sure no process will access those pages again (eg, all processes have 
> handled the resize interrupt signal barrier). I suppose this can be 
> done by the resize coordinator after synchronizing with all the other processes.
> But in that case it seems we cannot use the postmaster as the 
> coordinator then? b/c I see some code comments saying the postmaster 
> does not have waiting infrastructure... (maybe even if the postmaster 
> has waiting infra we don't want to use it anyway since it can be 
> blocked for a long time and won't be able to serve other requests).

There is already a coordination infrastructure, implemented in patch 0006, which will take care of this and prevent access to the shared memory until everything is resized.
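
For reference, the ftruncate behaviour described in the quoted question can be reproduced with a small standalone program. This is only a sketch and not code from the patch: shrink_then_touch is a hypothetical helper, and it uses a temporary file from tmpfile() as a stand-in for the real shared memory segment.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jump;

/* Jump back to the probe instead of letting SIGBUS kill the process. */
static void
on_sigbus(int signo)
{
    (void) signo;
    siglongjmp(bus_jump, 1);
}

/*
 * Map two pages of a shared file, shrink the file to one page, then touch
 * the second page. Returns 1 if the access raised SIGBUS, 0 if it succeeded,
 * and -1 if the setup failed.
 */
static int
shrink_then_touch(void)
{
    long        page = sysconf(_SC_PAGESIZE);
    FILE       *f = tmpfile();
    struct sigaction sa, old;
    volatile char *p;
    int         fd, result;

    if (f == NULL || page <= 0)
        return -1;
    fd = fileno(f);
    if (ftruncate(fd, 2 * page) != 0)
    {
        fclose(f);
        return -1;
    }
    p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
    {
        fclose(f);
        return -1;
    }
    p[page] = 1;                /* second page is still backed by the file */

    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGBUS, &sa, &old);

    if (ftruncate(fd, page) != 0)   /* shrink while the mapping is live */
        result = -1;
    else if (sigsetjmp(bus_jump, 1) == 0)
    {
        p[page] = 2;            /* page now lies beyond the end of the file */
        result = 0;
    }
    else
        result = 1;             /* arrived here from the SIGBUS handler */

    sigaction(SIGBUS, &old, NULL);
    munmap((void *) p, 2 * page);
    fclose(f);
    return result;
}
```

On Linux I would expect shrink_then_touch() to return 1, matching the mmap manpage, which is why any shrink has to happen only after every process has stopped using the truncated range.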