Re: Changing shared_buffers without restart

Peter Eisentraut <peter@eisentraut.org>

From: Peter Eisentraut <peter@eisentraut.org>

To: Dmitry Dolgov <9erthalion6@gmail.com>, pgsql-hackers@postgresql.org

Date: 2024-11-19T12:57:00Z

Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →

Remove PG_MMAP_FLAGS from mem.h
- c100340729b6 19 (unreleased) landed
Improve runtime and output of tests for replication slots checkpointing.
- 4464fddf7b50 18.0 cited
Revert support for improved tracking of nested queries
- f85f6ab051b7 18.0 cited
Use exported symbols list on macOS for loadable modules as well
- 3feff3916ee1 18.0 cited
Add support for basic NUMA awareness
- 65c298f61fc7 18.0 cited
Avoid unnecessary copying of a string in pg_restore.c
- 5e1915439085 18.0 cited
aio: Infrastructure for io_method=worker
- 55b454d0e140 18.0 cited
Improve InitShmemAccess() prototype
- 2a7b2d97171d 18.0 landed

Attachments

0001-More-thorough-use-of-PGShmemHeader-instead-of-void.patch.nocfbot (text/plain)

On 18.10.24 21:21, Dmitry Dolgov wrote:
> v1-0001-Allow-to-use-multiple-shared-memory-mappings.patch
> 
> Preparation, introduces the possibility to work with many shmem mappings. To
> make it less invasive, I've duplicated the shmem API to extend it with the
> shmem_slot argument, while redirecting the original API to it. There are
> probably better ways of doing that, I'm open for suggestions.

After studying this a bit, I tend to think you should just change the 
existing APIs in place.  So for example,

void *ShmemAlloc(Size size);

becomes

void *ShmemAlloc(int shmem_slot, Size size);

There aren't that many callers, and all these duplicated interfaces 
almost add more new code than they save.

It might be worth making exceptions for interfaces that are likely to be 
used by extensions.  For example, I see pg_stat_statements using 
ShmemInitStruct() and ShmemInitHash().  But that seems to be it.  Are 
there any other examples out there?  Maybe there are many more that I 
don't see right now.  But at least for the initialization functions, it 
doesn't seem worth it to preserve the existing interfaces exactly.

In any case, I think the slot number should be the first argument.  This 
matches how MemoryContextAlloc() or also talloc() work.

(Now here is an idea:  Could these just be memory contexts?  Instead of 
making six shared memory slots, could you make six memory contexts with 
a special shared memory type.  And ShmemAlloc becomes the allocation 
function, etc.?)

I noticed the existing code made inconsistent use of PGShmemHeader * vs. 
void *, which also bled into your patch.  I made the attached little 
patch to clean that up a bit.

I suggest splitting the struct ShmemSegment into one struct for the 
three memory addresses and a separate array just for the slock_t's.  The 
former struct can then stay private in storage/ipc/shmem.c, only the 
locks need to be exported.

Maybe rename ANON_MAPPINGS to something like NUM_ANON_MAPPINGS.

Also, maybe some of this should be declared in storage/shmem.h rather 
than in storage/pg_shmem.h.  We have the existing ShmemLock in there, so 
it would be a bit confusing to have the per-segment locks elsewhere.

> v1-0003-Introduce-multiple-shmem-slots-for-shared-buffers.patch
> 
> Splits shared_buffers into multiple slots, moving out structures that depend on
> NBuffers into separate mappings. There are two large gaps here:
> 
> * Shmem size calculation for those mappings is not correct yet, it includes too
>    many other things (no particular issues here, just haven't had time).
> * It makes hardcoded assumptions about what is the upper limit for resizing,
>    which is currently low purely for experiments. Ideally there should be a new
>    configuration option to specify the total available memory, which would be a
>    base for subsequent calculations.

Yes, I imagine a shared_buffers_hard_limit setting.  We could maybe 
default that to the total available memory, but it would also be good to 
be able to specify it directly, for testing.

> v1-0005-Use-anonymous-files-to-back-shared-memory-segment.patch
> 
> Allows an anonyous file to back a shared mapping. This makes certain things
> easier, e.g. mappings visual representation, and gives an fd for possible
> future customizations.

I think this could be a useful patch just by itself, without the rest of 
the series, because of

 > * By default, Linux will not add file-backed shared mappings into a
 > core dump, making it more convenient to work with them in PostgreSQL:
 > no more huge dumps to process.

This could be significant operational benefit.

When you say "by default", is this adjustable?  Does someone actually 
want the whole shared memory in their core file?  (If it's adjustable, 
is it also adjustable for anonymous mappings?)

I'm wondering about this change:

-#define PG_MMAP_FLAGS 
(MAP_SHARED|MAP_ANONYMOUS|MAP_HASSEMAPHORE)
+#define PG_MMAP_FLAGS                  (MAP_SHARED|MAP_HASSEMAPHORE)

It looks like this would affect all mmap() calls, not only the one 
you're changing.  But that's the only one that uses this macro!  I don't 
understand why we need this; I don't see anything in the commit log 
about this ever being used for any portability.  I think we should just 
get rid of it and have mmap() use the right flags directly.

I see that FreeBSD has a memfd_create() function.  Might be worth a try. 
  Obviously, this whole thing needs a configure test for memfd_create() 
anyway.

I see that memfd_create() has a MFD_HUGETLB flag.  It's not very clear 
how that interacts with the MAP_HUGETLB flag for mmap().  Do you need to 
specify both of them if you want huge pages?