Thread

  1. BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    PG Bug reporting form <noreply@postgresql.org> — 2023-01-20T22:48:09Z

    The following bug has been logged on the website:
    
    Bug reference:      17757
    Logged by:          David Angel
    Email address:      david_sisson@dell.com
    PostgreSQL version: 14.5
    Operating system:   Linux
    Description:        
    
    On an OS where huge pages are enabled, if no huge page resources are assigned
    in Kubernetes and the postgres instance is set to huge_pages = off in the
    config, then one would assume that the DB would not use huge pages.
    However, because the initdb process uses postgresql.conf.sample or
    postgresql.conf.template instead of the actual specified configuration, the
    applied setting is actually huge_pages = try during initdb.
    In these cases, the initdb phase will attempt to allocate huge pages that
    are available in the OS, but it will be denied access by Kubernetes and
    fail.
    
    Here is a PR with a possible fix:
    https://github.com/postgres/postgres/pull/114/files
    
    Here are some links for further information
    Ours: https://github.com/CrunchyData/postgres-operator/issues/3477
    
    Others with the same problem and no way to disable huge pages:
    https://github.com/CrunchyData/postgres-operator/issues/3039
    https://github.com/CrunchyData/postgres-operator/issues/2258
    https://github.com/CrunchyData/postgres-operator/issues/3126
    https://github.com/CrunchyData/postgres-operator/issues/3421
    
    Bitnami
    https://github.com/bitnami/charts/issues/7901
    
    
  2. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tomas Vondra <tomas.vondra@enterprisedb.com> — 2023-01-21T23:10:29Z

    
    On 1/20/23 23:48, PG Bug reporting form wrote:
    > The following bug has been logged on the website:
    > 
    > Bug reference:      17757
    > Logged by:          David Angel
    > Email address:      david_sisson@dell.com
    > PostgreSQL version: 14.5
    > Operating system:   Linux
    > Description:        
    > 
    > On an OS where hugepages are enabled, if no hugepages resources are assigned
    > in Kubernetes and the postgres instance is set to hugepages = off in the
    > config then one would assume that the DB would not use hugepages.
    
    There's no config at that point - it's initdb that creates it, by
    copying the .sample file, IIRC. So not sure which file you're modifying.
    
    > However, because the initdb process uses postgresql.conf.sample or
    > postgresql.conf.template instead of the actual specified configuration the
    > applied setting is actually hugepages = try during initdb.
    
    Specified how?
    
    > In these cases, the initdb phase will attempt to allocate huge pages that
    > are available in the OS, but it will be denied access by Kubernetes and
    > fail.
    
    Well, so how exactly does this fail? Does that mean Kubernetes broke mmap()
    with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
    not available, or what? Because that's the only explanation I can see,
    looking at the code.
    
    Or it just does not realize there are no hugepages, returns something
    and then crashes with SIGBUS later when trying to access it?
    
    > 
    > Here is a PR with a possible fix:
    > https://github.com/postgres/postgres/pull/114/files
    > 
    
    I doubt we want to just go straight to changing the default value for
    everyone. IMHO if the "try" logic is somehow broken, we should fix the
    try logic, not mess with the defaults.
    
    In the worst case, the operator can probably tweak the .sample config
    before calling initdb.
    
    
    regards
    
    -- 
    Tomas Vondra
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
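
The "try" behaviour discussed in this message depends on mmap() with MAP_HUGETLB returning MAP_FAILED when huge pages cannot be provided, after which the caller retries with regular pages. A minimal C sketch of that pattern is shown here; map_try_huge is a hypothetical name, not a PostgreSQL function, and the real server code differs in detail.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): request an anonymous shared
 * mapping backed by huge pages, and fall back to regular pages when the
 * kernel refuses the request at mmap() time.
 *
 * size should be a multiple of the huge page size for the first attempt
 * to have a chance of succeeding.  *used_huge is set to 1 if the huge
 * page mapping was obtained, otherwise 0.  Returns MAP_FAILED if both
 * attempts fail.
 */
void *
map_try_huge(size_t size, int *used_huge)
{
    void *ptr = MAP_FAILED;

#ifdef MAP_HUGETLB
    ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    *used_huge = (ptr != MAP_FAILED);

    /* The fallback only triggers if the failure is reported by mmap(). */
    if (ptr == MAP_FAILED)
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return ptr;
}
```

A successful return only means the mapping was created. If the environment lets the mapping succeed but refuses the pages later, the fallback never runs and the failure appears on first access instead.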
    
    
    
    
  3. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-21T23:29:22Z

    Hi,
    
    On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:
    > On 1/20/23 23:48, PG Bug reporting form wrote:
    > > In these cases, the initdb phase will attempt to allocate huge pages that
    > > are available in the OS, but it will be denied access by Kubernetes and
    > > fail.
    > 
    > Well, so how exactly this fails? Does that mean Kubernetes broke mmap()
    > with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
    > not available, or what? Because that's the only explanation I can see,
    > looking at the code.
    
    Yea, that's what I was wondering about as well.
    
    
    > Or it just does not realize there are no hugepages, returns something
    > and then crashes with SIGBUS later when trying to access it?
    
    I assume that that's the case. There's references to bus errors in a bunch of
    the linked issues. E.g.
    https://github.com/CrunchyData/postgres-operator/issues/413
    
    selecting default max_connections ... sh: line 1:    60 Bus error               (core dumped) "/usr/pgsql-10/bin/postgres" --boot -x0 -F -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
    
    It's possible that the problem would go away if we used MAP_POPULATE for the
    allocation.
    
    I'd guess that this is annoying cgroups stuff :(
    
    
    > I doubt we want to just go straight to changing the default value for
    > everyone. IMHO if the "try" logic is somehow broken, we should fix the
    > try logic, not mess with the defaults.
    
    Agreed. But we could disable huge pages explicitly inside initdb - there's
    really no point in using it there...
    
    Greetings,
    
    Andres Freund
    
    
    
    
  4. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-21T23:30:27Z

    Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
    > On 1/20/23 23:48, PG Bug reporting form wrote:
    >> Here is a PR with a possible fix:
    >> https://github.com/postgres/postgres/pull/114/files
    
    > I doubt we want to just go straight to changing the default value for
    > everyone.
    
    Yeah, that proposal is a non-starter.  I could see providing an
    initdb option to adjust the value applied during initdb, though.
    
    Ideally, maybe what we want is a generalized switch that could
    replace any variable in the sample config, along the lines of
    the server's "-c foo=bar".  I recall having tried to do that and
    having run into quoting hazards, but I did not try very hard.
    
    			regards, tom lane
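
The generalized switch described in this message amounts to replacing one setting while the sample config is copied. A rough C sketch of the per-line substitution is given here; override_setting is a hypothetical helper, not initdb code, and it deliberately ignores the quoting hazards mentioned in the message by writing the value verbatim.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): if `line` assigns `name`,
 * whether active or commented out with a leading '#', write
 * "name = value\n" into out and return 1.  Otherwise copy the line
 * unchanged into out and return 0.
 *
 * The value is written verbatim; no quoting or escaping is attempted.
 */
int
override_setting(const char *line, const char *name, const char *value,
                 char *out, size_t outsz)
{
    const char *p = line;
    size_t      namelen = strlen(name);

    /* Skip leading whitespace, an optional '#', and whitespace after it. */
    while (isspace((unsigned char) *p))
        p++;
    if (*p == '#')
        p++;
    while (*p == ' ' || *p == '\t')
        p++;

    if (strncmp(p, name, namelen) == 0)
    {
        p += namelen;
        while (*p == ' ' || *p == '\t')
            p++;
        /* Require '=' so that longer names sharing a prefix do not match. */
        if (*p == '=')
        {
            snprintf(out, outsz, "%s = %s\n", name, value);
            return 1;
        }
    }

    snprintf(out, outsz, "%s", line);
    return 0;
}
```

Applied to a sample line such as "#huge_pages = try", calling it with name "huge_pages" and value "off" would yield "huge_pages = off", while unrelated lines pass through untouched.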
    
    
    
    
  5. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-21T23:33:03Z

    Andres Freund <andres@anarazel.de> writes:
    > On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:
    >> I doubt we want to just go straight to changing the default value for
    >> everyone. IMHO if the "try" logic is somehow broken, we should fix the
    >> try logic, not mess with the defaults.
    
    > Agreed. But we could disable huge pages explicitly inside initdb - there's
    > really no point in using it there...
    
    One of the things initdb is trying to do is establish a set of values
    that is known to allow the server to start.  Not using the same settings
    that the server is expected to use would break that idea completely.
    
    			regards, tom lane
    
    
    
    
  6. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-21T23:45:01Z

    Hi,
    
    On 2023-01-21 18:33:03 -0500, Tom Lane wrote:
    > Andres Freund <andres@anarazel.de> writes:
    > > On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:
    > >> I doubt we want to just go straight to changing the default value for
    > >> everyone. IMHO if the "try" logic is somehow broken, we should fix the
    > >> try logic, not mess with the defaults.
    
    > > Agreed. But we could disable huge pages explicitly inside initdb - there's
    > > really no point in using it there...
    > 
    > One of the things initdb is trying to do is establish a set of values
    > that is known to allow the server to start.  Not using the same settings
    > that the server is expected to use would break that idea completely.
    
    Yea, I'm not saying I like the approach. OTOH, we don't provide a proper way to
    influence the configuration, which is bad, as this issue shows.
    
    Perhaps we should add an option to force MAP_POPULATE being used? I'm fairly
    certain that'd avoid the SIGBUS in this case. And it'd make sense to ensure
    that we can actually use the memory in initdb.
    
    Unfortunately it's not unproblematic to use it in general, because with large
    shared_buffers values it can be quite slow, because the kernel initializes the
    memory in a single thread. I've seen ~3GB/s on multi-socket machines.
    
    Greetings,
    
    Andres Freund
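
For readers unfamiliar with MAP_POPULATE, the idea in this message is to ask the kernel to fault the pages in during mmap() rather than on first access. A small C sketch under that assumption is shown here; map_prefaulted is a hypothetical name and this is not how PostgreSQL allocates shared memory today. Whether it would turn the later SIGBUS into an mmap()-time failure under a cgroup limit is the open question in the message: Linux's mmap(2) documentation cautions that a failed populate does not necessarily make the call fail, so this would need testing in the affected environment.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS / MAP_POPULATE on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): create an anonymous shared
 * mapping and ask the kernel to populate it up front.
 *
 * extra_flags lets the caller add e.g. MAP_HUGETLB.  Returns MAP_FAILED
 * on error, like mmap() itself.  Populating is done by the calling
 * thread, which is why it can be slow for very large mappings.
 */
void *
map_prefaulted(size_t size, int extra_flags)
{
    int flags = MAP_SHARED | MAP_ANONYMOUS | extra_flags;

#ifdef MAP_POPULATE
    flags |= MAP_POPULATE;
#endif
    return mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
}
```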
    
    
    
    
  7. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-22T00:08:01Z

    Andres Freund <andres@anarazel.de> writes:
    > Perhaps we should add an option to force MAP_POPULATE being used? I'm fairly
    > certain that'd avoid the SIGBUS in this case. And it'd make sense to ensure
    > that we can actually use the memory in initdb.
    
    > Unfortunately it's not unproblematic to use it in general, because with large
    > shared_buffers values it can be quite slow, because the kernel initializes the
    > memory in a single thread. I've seen ~3GB/s on multi-socket machines.
    
    Hmm ... but if we can't use it by default, we're still back to the
    problem of needing a way to tell initdb to do things differently.
    I'd just as soon keep that to "set huge_pages = off" rather than
    inventing whole new things.
    
    			regards, tom lane
    
    
    
    
  8. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-22T00:27:04Z

    Hi,
    
    On 2023-01-21 15:29:22 -0800, Andres Freund wrote:
    > On 2023-01-22 00:10:29 +0100, Tomas Vondra wrote:
    > > On 1/20/23 23:48, PG Bug reporting form wrote:
    > > > In these cases, the initdb phase will attempt to allocate huge pages that
    > > > are available in the OS, but it will be denied access by Kubernetes and
    > > > fail.
    > >
    > > Well, so how exactly this fails? Does that mean Kubernetes broke mmap()
    > > with MAP_HUGETLB so that it doesn't return MAP_FAILED when hugepages are
    > > not available, or what? Because that's the only explanation I can see,
    > > looking at the code.
    >
    > Yea, that's what I was wondering about as well.
    >
    >
    > > Or it just does not realize there are no hugepages, returns something
    > > and then crashes with SIGBUS later when trying to access it?
    >
    > I assume that that's the case. There's references to bus errors in a bunch of
    > the linked issues. E.g.
    > https://github.com/CrunchyData/postgres-operator/issues/413
    >
    > selecting default max_connections ... sh: line 1:    60 Bus error               (core dumped) "/usr/pgsql-10/bin/postgres" --boot -x0 -F -c max_connections=100 -c shared_buffers=1000 -c dynamic_shared_memory_type=none < "/dev/null" > "/dev/null" 2>&1
    >
    > It's possible that the problem would go away if we used MAP_POPULATE for the
    > allocation.
    
    > I'd guess that this is annoying cgroups stuff :(
    
    Ah, the fun:
    https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/hugetlb.html
    
      The HugeTLB controller allows users to limit the HugeTLB usage (page fault) per
      control group and enforces the limit during page fault. Since HugeTLB
      doesn't support page reclaim, enforcing the limit at page fault time implies
      that, the application will get SIGBUS signal if it tries to fault in HugeTLB
      pages beyond its limit. Therefore the application needs to know exactly how many
      HugeTLB pages it uses before hand, and the sysadmin needs to make sure that
      there are enough available on the machine for all the users to avoid processes
      getting SIGBUS.
    
    but there's also
    
          Reservation accounting
    
      hugetlb.<hugepagesize>.rsvd.limit_in_bytes
      hugetlb.<hugepagesize>.rsvd.max_usage_in_bytes
      hugetlb.<hugepagesize>.rsvd.usage_in_bytes
      hugetlb.<hugepagesize>.rsvd.failcnt
    
      The HugeTLB controller allows to limit the HugeTLB reservations per control
      group and enforces the controller limit at reservation time and at the fault
      of HugeTLB memory for which no reservation exists. Since reservation limits
      are enforced at reservation time (on mmap or shget), reservation limits
      never causes the application to get SIGBUS signal if the memory was reserved
      before hand. For MAP_NORESERVE allocations, the reservation limit behaves
      the same as the fault limit, enforcing memory usage at fault time and
      causing the application to receive a SIGBUS if it’s crossing its limit.
    
      Reservation limits are superior to page fault limits described above, since
      reservation limits are enforced at reservation time (on mmap or shget), and
      never causes the application to get SIGBUS signal if the memory was reserved
      before hand. This allows for easier fallback to alternatives such as
      non-HugeTLB memory for example. In the case of page fault accounting, it’s
      very hard to avoid processes getting SIGBUS since the sysadmin needs
      precisely know the HugeTLB usage of all the tasks in the system and make
      sure there is enough pages to satisfy all requests. Avoiding tasks getting
      SIGBUS on overcommited systems is practically impossible with page fault
      accounting.
    
    So the problem is that the wrong type of cgroup limit is used. I don't know
    if that's a kubernetes or a postgres-operator issue.
    
    Greetings,
    
    Andres Freund
    
    
    
    
  9. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tomas Vondra <tomas.vondra@enterprisedb.com> — 2023-01-22T00:55:01Z

    
    On 1/22/23 00:30, Tom Lane wrote:
    > Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
    >> On 1/20/23 23:48, PG Bug reporting form wrote:
    >>> Here is a PR with a possible fix:
    >>> https://github.com/postgres/postgres/pull/114/files
    > 
    >> I doubt we want to just go straight to changing the default value for
    >> everyone.
    > 
    > Yeah, that proposal is a non-starter.  I could see providing an
    > initdb option to adjust the value applied during initdb, though.
    > 
    > Ideally, maybe what we want is a generalized switch that could
    > replace any variable in the sample config, along the lines of
    > the server's "-c foo=bar".  I recall having tried to do that and
    > having run into quoting hazards, but I did not try very hard.
    > 
    
    Yeah, I was looking for something like "-c" in initdb, only to realize
    there's nothing like that. The main "problem" with adding that is that
    we're unlikely to backpatch that (I guess), and thus it does not really
    solve the issue for the OP.
    
    I'm not sure we'd be keen to backpatch a change of the default, but
    maybe we would ...
    
    regards
    
    -- 
    Tomas Vondra
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
    
    
  10. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-22T01:01:08Z

    Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
    > On 1/22/23 00:30, Tom Lane wrote:
    >> Yeah, that proposal is a non-starter.  I could see providing an
    >> initdb option to adjust the value applied during initdb, though.
    >> Ideally, maybe what we want is a generalized switch that could
    >> replace any variable in the sample config, along the lines of
    >> the server's "-c foo=bar".  I recall having tried to do that and
    >> having run into quoting hazards, but I did not try very hard.
    
    > Yeah, I was looking for something like "-c" in initdb, only to realize
    > there's nothing like that. The main "problem" with adding that is that
    > we're unlikely to backpatch that (I guess), and thus it does not really
    > solve the issue for the OP.
    
    > I'm not sure we'd be keen to backpatch a change of the default, but
    > maybe we would ...
    
    Back-patching a change of default seems like REALLY a non-starter.
    Perhaps adding a switch (which would break nothing if not used)
    could be discussed, though.
    
    			regards, tom lane
    
    
    
    
  11. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-22T02:08:26Z

    Hi,
    
    On 2023-01-22 01:55:01 +0100, Tomas Vondra wrote:
    > I'm not sure we'd be keen to backpatch a change of the default, but
    > maybe we would ...
    
    After figuring out that it's clearly a configuration issue *somewhere* outside
    of postgres's remit, I'm not that sure it's worth doing something concretely
    to avoid the SIGBUS issue.
    
    
    But if we end up doing something, I think a parameter triggering use of
    MAP_POPULATE would be a good idea. It's actually useful outside of the SIGBUS
    issue, because benchmarks reach a steady state noticeably more quickly when
    using it.
    
    OTOH, in a production scenario with large shared_buffers I'd probably not want
    to use it, because getting up more quickly and distributing the memory
    initialization across cores is more important.
    
    
    I think it'd be ok to explicitly specify such an option in initdb - after all,
    initdb does do work to determine the correct shared buffers size etc, and
    MAP_POPULATE will lead to a more reliable determination.  Not just with huge
    pages, but also with "small" pages and system-level memory overcommit.
    
    Greetings,
    
    Andres Freund
    
    
    
    
  12. RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Sisson, David <david.sisson@dell.com> — 2023-01-23T19:26:09Z

    I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.
    huge_pages can be turned on through outside manipulation but it can't be turned off.
    Not without altering the sample config file.
    
    Thanks,
    David Angel   😊
    
    
    
    Internal Use - Confidential
    
    
  13. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Christophe Pettus <xof@thebuild.com> — 2023-01-23T19:38:09Z

    
    > On Jan 23, 2023, at 11:26, Sisson, David <David.Sisson@dell.com> wrote:
    > 
    > I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.
    
    We are?  I believe the default is "huge_pages = try", not off.
    
    
    
  14. RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Sisson, David <david.sisson@dell.com> — 2023-01-23T19:51:14Z

    The default is "huge_pages = try" which is commented out in the "postgresql.conf.sample" file.
    When a consumer like myself turns it off in the standard "postgresql.conf" file, it should not be turned on when initdb runs.
    There is no way to turn it off without altering the sample config file.
    
    It is difficult, if not nearly impossible, to alter the "postgresql.conf.sample" file using a 3rd party controller.
    The file is read-only at runtime within Kubernetes.
    Only some controllers let you modify the sample file without rebuilding their code.
    
    You guys are awesome with truly outstanding responses.
    I certainly didn't expect my initial solution to be used but to help in finding a good solution.  😊
    
    Thanks,
    David Angel
    
    
    
    
    
    
  15. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-23T19:55:04Z

    Hi,
    
    On 2023-01-23 19:26:09 +0000, Sisson, David wrote:
    > I believe something should be done with PostgreSQL because we are configuring huge_pages = off in the standard "postgresql.conf" file.
    > huge_pages can be turned on through outside manipulation but it can't be
    > turned off.
    
    It's a fault of the environment if mmap(MAP_HUGETLB) causes a SIGBUS. Normally
    huge_pages = try is harmless, because it'll just fall back. That source of
    SIGBUSes needs to be fixed regardless of anything else - plenty allocators try
    to use huge pages for example, so you'll run into problems regardless of
    postgres' default.
    
    That said, I'm for allowing to specify options to initdb.
    
    Greetings,
    
    Andres Freund
    
    
    
    
  16. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-23T19:55:50Z

    "Sisson, David" <David.Sisson@dell.com> writes:
    > The default is "huge_pages = try" which is commented out in the "postgresql.conf.sample" file.
    > When a consumer like myself turns it off in the standard "postgresql.conf" file, it should not be turned on when initdb runs.
    
    What "standard postgresql.conf file"?  There is no such thing until
    initdb creates it.
    
    > There is no way to turn it off without altering the sample config file.
    
    Yup, that's exactly why we are having this discussion.
    
    			regards, tom lane
    
    
    
    
  17. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    David G. Johnston <david.g.johnston@gmail.com> — 2023-01-23T20:00:45Z

    On Mon, Jan 23, 2023 at 12:51 PM Sisson, David <David.Sisson@dell.com>
    wrote:
    
    > The default is "huge_pages = try" which is commented out in the
    > "postgresql.conf.sample" file.
    > When a consumer like myself turns it off in the standard "postgresql.conf"
    > file, it should not be turned on when initdb runs.
    > There is no way to turn it off without altering the sample config file.
    >
    >
    Right, the present way to control what is seen by initdb is
    postgresql.conf.sample since that is the template that initdb uses to then
    produce an actual postgresql.conf for the newly created instance.
    postgresql.conf is only ever a per-instance configuration file.  It doesn't
    make sense to "change postgresql.conf in hopes of influencing some future
    initdb run."
    
    David J.
    
  18. RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Sisson, David <david.sisson@dell.com> — 2023-01-23T20:12:18Z

    That makes sense, the PostgreSQL controllers are calling initdb to create the "postgresql.conf" file before they apply customizations to it.
    To the consumer, it is just yaml to be added to the "postgresql.conf" file.
    
    That makes it much harder to fix and means it is really the controllers at fault.
    
    This probably needs to be explicitly documented when creating a HA cluster or within initdb docs.
    https://www.postgresql.org/docs/15/app-initdb.html
    
    Maybe something about how initdb uses sample and what configuration settings must be pre-configured.
    
    Thanks,
    David Angel
    
    
    
    
    
    
    
    
    
    
    
    
  19. RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Sisson, David <david.sisson@dell.com> — 2023-01-23T20:35:17Z

    A quick and dirty solution could be to alter initdb to catch the exception and retry using a copy of the sample with "huge_pages=false".
    Would that be acceptable?
    
    Passing in a config setting into initdb would still require a rebuild of all controllers.
    That could take months to years at best.
    
    Thanks,
    David Angel
    
    
    
    
    
    
    
    
  20. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-23T21:10:13Z

    Hi,
    
    On 2023-01-23 20:35:17 +0000, Sisson, David wrote:
    > A quick and dirty solution could be to alter initdb to catch the exception and retry using a copy of the sample with "huge_pages=false".
    > Would that be acceptable?
    
    This is a kubernetes or postgres-operator bug (setting up the wrong cgroup
    limit, which the docs explicitly warn against doing). I don't think we want to
    accumulate workarounds like that in postgres.
    
    
    > Passing in a config setting into initdb would still require a rebuild of all controllers.
    > That could take months to years at best.
    
    Huh. I don't know anything about the controller, but that seems problematic
    independent of this specific issue. And you'd still need to deploy a new
    version of postgres to get such changes...
    
    
    > Internal Use - Confidential
    
    Hardly.
    
    Greetings,
    
    Andres Freund
    
    
    
    
  21. RE: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Sisson, David <david.sisson@dell.com> — 2023-01-23T21:41:15Z

    The controllers generally always pull in the latest PostgreSQL.
    It is easy to get the latest version with PostgreSQL updated.
    
    Unfortunately, getting a bug fix is a lot harder.
    One controller has been carrying this defect for over a year with no end in sight.
    
    
    Found this:
    https://github.com/opencontainers/runtime-spec/issues/1050
    
    Looks like a PR exists for it but the solution is invalid.
    https://github.com/kailun-qin/runtime-spec/commit/a6505339204535150260d8e4f0bc112628f1fa87
    
    
    More info:
    https://www.postgresql.org/message-id/flat/20200218093240.jd3lgoxmisyl2tt5%40localhost#61c2c7fc3d3dd80512c9130b6967be16
    
    
    It would be nice if "try" worked as expected.
    I totally understand it is not a PostgreSQL issue but any assistance would be very appreciated.
    
    
    Thanks,
    David Angel
    
    
    
    
    
    
    
    
  22. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-23T22:51:46Z

    Andres Freund <andres@anarazel.de> writes:
    > It's a fault of the environment if mmap(MAP_HUGETLB) causes a SIGBUS. Normally
    > huge_pages = try is harmless, because it'll just fall back. That source of
    > SIGBUSes needs to be fixed regardless of anything else - plenty allocators try
    > to use huge pages for example, so you'll run into problems regardless of
    > postgres' default.
    
    That seems likely to me too.
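
For readers unfamiliar with the fallback being described, here is a minimal sketch of a "try" policy: request a huge-page mapping and fall back to ordinary pages if the request is refused. It is an illustration only, not PostgreSQL's code. The function name `map_try_huge` is hypothetical, and the numeric fallback for `MAP_HUGETLB` is an assumption that holds on common Linux architectures such as x86-64 and arm64.

```python
import mmap
import sys

# Assumption: 0x40000 is MAP_HUGETLB on Linux x86-64/arm64; some other
# architectures use a different value. It is used only when the mmap
# module does not export the constant itself.
MAP_HUGETLB = getattr(mmap, "MAP_HUGETLB", 0x40000)


def map_try_huge(size):
    """Return (mapping, used_huge_pages), mimicking a "try" policy."""
    base = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    if sys.platform.startswith("linux"):
        try:
            # First attempt: ask the kernel for huge pages.
            return mmap.mmap(-1, size, flags=base | MAP_HUGETLB), True
        except OSError:
            pass  # request refused (e.g. no huge pages): fall back
    # Fallback: an ordinary anonymous mapping.
    return mmap.mmap(-1, size, flags=base), False


if __name__ == "__main__":
    mapping, huge = map_try_huge(2 * 1024 * 1024)
    print("huge pages used:", huge)
    mapping.close()
```

The sketch only handles a refused mmap() call and never touches the mapped memory. The failure discussed in this thread is a SIGBUS, which a fallback of this kind cannot catch.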
    
    > That said, I'm for allowing to specify options to initdb.
    
    Yeah, I think that has enough other potential applications to be worth
    doing.  Here's a quick draft patch (sans user-facing docs as yet).
    It injects any given values into postgresql.auto.conf, not
    postgresql.conf proper.  I did that mainly because the latter looked
    beyond the abilities of the primitive string-munging code we have in
    there, but I think it can be argued to be a reasonable choice anyway.
    
    			regards, tom lane
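
As a rough illustration of the idea in the message above (not the draft patch itself, which is not reproduced here), the following sketch appends user-supplied settings to postgresql.auto.conf in a data directory. The helper names `format_guc_line` and `append_auto_conf` are hypothetical. The quoting rule (values in single quotes, with embedded single quotes and backslashes doubled) is assumed to match what the configuration file parser accepts.

```python
import os


def format_guc_line(name, value):
    """Format one setting as a quoted configuration-file line."""
    # Double backslashes first, then single quotes, so that the quoted
    # value reads back unchanged.
    escaped = str(value).replace("\\", "\\\\").replace("'", "''")
    return f"{name} = '{escaped}'\n"


def append_auto_conf(datadir, settings):
    """Append settings to postgresql.auto.conf and return its path."""
    path = os.path.join(datadir, "postgresql.auto.conf")
    with open(path, "a", encoding="utf-8") as conf:
        for name, value in settings.items():
            conf.write(format_guc_line(name, value))
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as datadir:
        conf_path = append_auto_conf(datadir, {"huge_pages": "off"})
        with open(conf_path, encoding="utf-8") as conf:
            print(conf.read(), end="")
```

Appending to a separate file sidesteps editing postgresql.conf in place, which is the string-munging difficulty the message mentions.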
    
    
  23. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-24T00:37:37Z

    Hi,
    
    On 2023-01-23 17:51:46 -0500, Tom Lane wrote:
    > Andres Freund <andres@anarazel.de> writes:
    > > That said, I'm for allowing to specify options to initdb.
    > 
    > Yeah, I think that has enough other potential applications to be worth
    > doing.  Here's a quick draft patch (sans user-facing docs as yet).
    > It injects any given values into postgresql.auto.conf, not
    > postgresql.conf proper.  I did that mainly because the latter looked
    > beyond the abilities of the primitive string-munging code we have in
    > there, but I think it can be argued to be a reasonable choice anyway.
    
    Oh, I had thought we'd just pass them on with -c to the processes that initdb
    starts. But perhaps just persisting them isn't a bad idea...
    
    - Andres
    
    
    
    
  24. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Tom Lane <tgl@sss.pgh.pa.us> — 2023-01-24T00:45:19Z

    Andres Freund <andres@anarazel.de> writes:
    > On 2023-01-23 17:51:46 -0500, Tom Lane wrote:
    >> Yeah, I think that has enough other potential applications to be worth
    >> doing.  Here's a quick draft patch (sans user-facing docs as yet).
    >> It injects any given values into postgresql.auto.conf, not
    >> postgresql.conf proper.  I did that mainly because the latter looked
    >> beyond the abilities of the primitive string-munging code we have in
    >> there, but I think it can be argued to be a reasonable choice anyway.
    
    > Oh, I had thought we'd just pass them on with -c to the processes that initdb
    > starts. But perhaps just persisting them isn't a bad idea...
    
    It certainly seems to me that that would be the mainstream use-case,
    so why not fill in the file as the user probably wants?  They can
    always change it.  Also, as I mentioned, the expectation is that
    initdb will set up a known-working combination of settings; and
    we don't really know that if we leave off whatever was injected by
    "-c".  In the case at hand, if we don't propagate "huge_pages = off"
    to the installed configuration, the server still won't work.
    
    			regards, tom lane
    
    
    
    
  25. Re: BUG #17757: Not honoring huge_pages setting during initdb causes DB crash in Kubernetes

    Andres Freund <andres@anarazel.de> — 2023-01-24T01:00:20Z

    Hi,
    
    On 2023-01-23 19:45:19 -0500, Tom Lane wrote:
    > Andres Freund <andres@anarazel.de> writes:
    > > On 2023-01-23 17:51:46 -0500, Tom Lane wrote:
    > >> Yeah, I think that has enough other potential applications to be worth
    > >> doing.  Here's a quick draft patch (sans user-facing docs as yet).
    > >> It injects any given values into postgresql.auto.conf, not
    > >> postgresql.conf proper.  I did that mainly because the latter looked
    > >> beyond the abilities of the primitive string-munging code we have in
    > >> there, but I think it can be argued to be a reasonable choice anyway.
    > 
    > > Oh, I had thought we'd just pass them on with -c to the processes that initdb
    > > starts. But perhaps just persisting them isn't a bad idea...
    > 
    > It certainly seems to me that that would be the mainstream use-case,
    > so why not fill in the file as the user probably wants?  They can
    > always change it.  Also, as I mentioned, the expectation is that
    > initdb will set up a known-working combination of settings; and
    > we don't really know that if we leave off whatever was injected by
    > "-c".  In the case at hand, if we don't propagate "huge_pages = off"
    > to the installed configuration, the server still won't work.
    
    Yea, makes sense.
    
    Greetings,
    
    Andres Freund