Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Add pg_stat_database counters for sessions and session time

  1. Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-06-30T08:01:00Z

    Hi hackers,
    
    While commit 960869da08 added some information about connections that 
    have been successfully authenticated, there is no metrics for 
    connections that have not (or did not reached the authentication stage).
    
    Adding metrics about failed connections attempts could also help, for 
    example with proper sampling, to:
    
      * detect spikes in failed login attempts
      * check if there is a correlation between spikes in successful and
        failed connection attempts
    
    While the number of successful connections could also already been 
    tracked with the ClientAuthentication_hook (and also the ones that 
    failed the authentication) we are missing metrics about:
    
      * why the connection failed (could be bad password, bad database, bad
        user, missing CONNECT privilege...)
      * number of times the authentication stage has not been reached
      * why the authentication stage has not been reached (bad startup
        packets, timeout while processing startup packet,...)
    
    Those missing metrics (in addition to the ones that can be already 
    gathered) could provide value for:
    
      * security investigations
      * anomalies detections
      * tracking application misconfigurations
    
    In an attempt to be able to provide those metrics, please find attached 
    a patch proposal to add new hooks in the connection path, that would be 
    fired if:
    
      * there is a bad startup packet
      * there is a timeout while processing the startup packet
      * user does not have CONNECT privilege
      * database does not exist
    
    For safety those hooks request the use of a const Port parameter, so 
    that they could be used only for reporting purpose (for example, we are 
    working on an extension to record detailed login metrics counters).
    
    Another option could be to add those metrics in the engine itself 
    (instead of providing new hooks to get them), but the new hooks option 
    gives more flexibility on how to render and exploit them (there is a lot 
    of information in the Port Struct that one could be interested with).
    
    I’m adding this patch proposal to the commitfest.
    Looking forward to your feedback,
    
    Regards,
    Bertrand
    
  2. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-06-30T09:23:33Z

    On Thu, Jun 30, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > Hi hackers,
    >
    > While commit 960869da08 added some information about connections that have been successfully authenticated, there is no metrics for connections that have not (or did not reached the authentication stage).
    >
    > Adding metrics about failed connections attempts could also help, for example with proper sampling, to:
    >
    > detect spikes in failed login attempts
    > check if there is a correlation between spikes in successful and failed connection attempts
    >
    > While the number of successful connections could also already been tracked with the ClientAuthentication_hook (and also the ones that failed the authentication) we are missing metrics about:
    >
    > why the connection failed (could be bad password, bad database, bad user, missing CONNECT privilege...)
    > number of times the authentication stage has not been reached
    > why the authentication stage has not been reached (bad startup packets, timeout while processing startup packet,...)
    >
    > Those missing metrics (in addition to the ones that can be already gathered) could provide value for:
    >
    > security investigations
    > anomalies detections
    > tracking application misconfigurations
    >
    > In an attempt to be able to provide those metrics, please find attached a patch proposal to add new hooks in the connection path, that would be fired if:
    >
    > there is a bad startup packet
    > there is a timeout while processing the startup packet
    > user does not have CONNECT privilege
    > database does not exist
    >
    > For safety those hooks request the use of a const Port parameter, so that they could be used only for reporting purpose (for example, we are working on an extension to record detailed login metrics counters).
    >
    > Another option could be to add those metrics in the engine itself (instead of providing new hooks to get them), but the new hooks option gives more flexibility on how to render and exploit them (there is a lot of information in the Port Struct that one could be interested with).
    >
    > I’m adding this patch proposal to the commitfest.
    > Looking forward to your feedback,
    
    +1 for the idea. I've seen numerous cases where the login metrics
    (especially failed connections) are handy in analyzing stuff. And I'm
    okay with the hook approach than the postgres emitting the necessary
    metrics. However, I'm personally not okay with having multiple hooks
    as proposed in the v1 patch. Can we think of having a single hook or
    enhancing the existing ClientAuthentication_hook where we pass a
    PURPOSE parameter (CONN_SUCCESS, CONN_FAILURE, CONN_FOO, CONN_BAR
    ....) tp the hook? With this approach, we don't need to spread out the
    postgres code with many hooks and the hook implementers will look at
    the PURPOSE parameter and deal with it accordingly.
    
    On the security aspect, we must ensure we don't leak any sensitive
    information such as password or SSH key to the new hook - if PGPORT
    has this information, maybe we need to mask that structure a bit
    before handing it off to the hook.
    
    Regards,
    Bharath Rupireddy.
    
    
    
    
  3. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-01T07:48:40Z

    Hi,
    
    On 6/30/22 11:23 AM, Bharath Rupireddy wrote:
    > On Thu, Jun 30, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >> Hi hackers,
    >>
    >> While commit 960869da08 added some information about connections that have been successfully authenticated, there is no metrics for connections that have not (or did not reached the authentication stage).
    >>
    >> Adding metrics about failed connections attempts could also help, for example with proper sampling, to:
    >>
    >> detect spikes in failed login attempts
    >> check if there is a correlation between spikes in successful and failed connection attempts
    >>
    >> While the number of successful connections could also already been tracked with the ClientAuthentication_hook (and also the ones that failed the authentication) we are missing metrics about:
    >>
    >> why the connection failed (could be bad password, bad database, bad user, missing CONNECT privilege...)
    >> number of times the authentication stage has not been reached
    >> why the authentication stage has not been reached (bad startup packets, timeout while processing startup packet,...)
    >>
    >> Those missing metrics (in addition to the ones that can be already gathered) could provide value for:
    >>
    >> security investigations
    >> anomalies detections
    >> tracking application misconfigurations
    >>
    >> In an attempt to be able to provide those metrics, please find attached a patch proposal to add new hooks in the connection path, that would be fired if:
    >>
    >> there is a bad startup packet
    >> there is a timeout while processing the startup packet
    >> user does not have CONNECT privilege
    >> database does not exist
    >>
    >> For safety those hooks request the use of a const Port parameter, so that they could be used only for reporting purpose (for example, we are working on an extension to record detailed login metrics counters).
    >>
    >> Another option could be to add those metrics in the engine itself (instead of providing new hooks to get them), but the new hooks option gives more flexibility on how to render and exploit them (there is a lot of information in the Port Struct that one could be interested with).
    >>
    >> I’m adding this patch proposal to the commitfest.
    >> Looking forward to your feedback,
    > +1 for the idea. I've seen numerous cases where the login metrics
    > (especially failed connections) are handy in analyzing stuff. And I'm
    > okay with the hook approach than the postgres emitting the necessary
    > metrics.
    
    Thanks for looking at it!
    
    > However, I'm personally not okay with having multiple hooks
    > as proposed in the v1 patch.
    
    I agree that it would be great to reduce the number of proposed hooks.
    
    But,
    
    >   Can we think of having a single hook
    
    The proposed hooks are triggered during errors (means that the 
    connection attempt break) and:
    
    - In the connection paths that will not reach the 
    ClientAuthentication_hook at all: those are the ones related to the bad 
    startup packet and timeout while processing the startup packet.
    
    or
    
    - After the ClientAuthentication_hook is fired: those are the bad db 
    oid, bad db name and bad perm ones.
    
    So, It does look like having only one hook would require refactoring in 
    the connection path and I'm not sure if this is worth it.
    
    > or
    > enhancing the existing ClientAuthentication_hook where we pass a
    > PURPOSE parameter (CONN_SUCCESS, CONN_FAILURE, CONN_FOO, CONN_BAR
    > ....) tp the hook?
    
    I think one could already "predict" the bad db and bad perm errors 
    within the current ClientAuthentication_hook.
    
    But in case of multiple "possible" errors (within the same connection 
    attempt) how could we know for sure the one that will be actually 
    reported? That's why i think the best way is to put new hooks as close 
    as possible to the place where the related errors are reported.
    
    What do you think?
    
    > With this approach, we don't need to spread out the
    > postgres code with many hooks and the hook implementers will look at
    > the PURPOSE parameter and deal with it accordingly.
    >
    > On the security aspect, we must ensure we don't leak any sensitive
    > information such as password or SSH key to the new hook - if PGPORT
    > has this information, maybe we need to mask that structure a bit
    > before handing it off to the hook.
    
    Yeah good point, I'll have a closer look if there is any information 
    that could be present in the Port struct before the 
    ClientAuthentication_hook is fired (means during the ones that are 
    related to the startup packet) and that we would want to mask.
    
    Regards,
    
    Bertrand
    
    
    
    
    
  4. Re: Patch proposal: New hooks in the connection path

    Nathan Bossart <nathandbossart@gmail.com> — 2022-07-01T23:00:27Z

    On Fri, Jul 01, 2022 at 09:48:40AM +0200, Drouvot, Bertrand wrote:
    >> However, I'm personally not okay with having multiple hooks
    >> as proposed in the v1 patch.
    > 
    > I agree that it would be great to reduce the number of proposed hooks.
    > 
    > But,
    > 
    >>   Can we think of having a single hook
    > 
    > The proposed hooks are triggered during errors (means that the connection
    > attempt break) and:
    > 
    > - In the connection paths that will not reach the ClientAuthentication_hook
    > at all: those are the ones related to the bad startup packet and timeout
    > while processing the startup packet.
    > 
    > or
    > 
    > - After the ClientAuthentication_hook is fired: those are the bad db oid,
    > bad db name and bad perm ones.
    > 
    > So, It does look like having only one hook would require refactoring in the
    > connection path and I'm not sure if this is worth it.
    > 
    >> or
    >> enhancing the existing ClientAuthentication_hook where we pass a
    >> PURPOSE parameter (CONN_SUCCESS, CONN_FAILURE, CONN_FOO, CONN_BAR
    >> ....) tp the hook?
    > 
    > I think one could already "predict" the bad db and bad perm errors within
    > the current ClientAuthentication_hook.
    > 
    > But in case of multiple "possible" errors (within the same connection
    > attempt) how could we know for sure the one that will be actually reported?
    > That's why i think the best way is to put new hooks as close as possible to
    > the place where the related errors are reported.
    > 
    > What do you think?
    
    Could we model this after fmgr_hook?  The first argument in that hook
    indicates where it is being called from.  This doesn't alleviate the need
    for several calls to the hook in the authentication logic, but extension
    authors would only need to define one hook.
    
    That being said, I don't see why this information couldn't be provided in a
    system view.  IMO it is generically useful.
    
    -- 
    Nathan Bossart
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
  5. Re: Patch proposal: New hooks in the connection path

    Roberto Mello <roberto.mello@gmail.com> — 2022-07-02T00:49:09Z

    On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart <nathandbossart@gmail.com>
    wrote:
    
    >
    >
    > That being said, I don't see why this information couldn't be provided in a
    > system view.  IMO it is generically useful.
    
    
    +1 for a system view with appropriate permissions, in addition to the
    hooks.
    
    That would make the information easily accessible to a number or monitoring
    systems besides the admin.
    
    Roberto
    
    —
    Crunchy Data — passion for open source PostgreSQL
    
    >
    
  6. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-04T12:53:24Z

    Hi,
    
    On 7/2/22 1:00 AM, Nathan Bossart wrote:
    > Could we model this after fmgr_hook?  The first argument in that hook
    > indicates where it is being called from.  This doesn't alleviate the need
    > for several calls to the hook in the authentication logic, but extension
    > authors would only need to define one hook.
    
    I like the idea and indeed fmgr.h looks a good place to model it.
    
    Attached a new patch version doing so.
    
    Thanks
    
    -- 
    
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
  7. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-04T12:58:50Z

    Hi,
    
    On 7/2/22 2:49 AM, Roberto Mello wrote:
    >
    > On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart 
    > <nathandbossart@gmail.com> wrote:
    >
    >
    >
    >     That being said, I don't see why this information couldn't be
    >     provided in a
    >     system view.  IMO it is generically useful.
    >
    >
    > +1 for a system view with appropriate permissions, in addition to the 
    > hooks.
    >
    > That would make the information easily accessible to a number or 
    > monitoring systems besides the admin.
    >
    Agree about that.
    
    I'll start another thread and propose a dedicated patch for the 
    "internal counters" and how to expose them.
    
    Thanks for the feedback,
    
    -- 
    Bertrand Drouvot
    Amazon Web Services:https://aws.amazon.com
    
  8. Re: Patch proposal: New hooks in the connection path

    Zhihong Yu <zyu@yugabyte.com> — 2022-07-04T13:12:42Z

    On Mon, Jul 4, 2022 at 5:54 AM Drouvot, Bertrand <bdrouvot@amazon.com>
    wrote:
    
    > Hi,
    >
    > On 7/2/22 1:00 AM, Nathan Bossart wrote:
    > > Could we model this after fmgr_hook?  The first argument in that hook
    > > indicates where it is being called from.  This doesn't alleviate the need
    > > for several calls to the hook in the authentication logic, but extension
    > > authors would only need to define one hook.
    >
    > I like the idea and indeed fmgr.h looks a good place to model it.
    >
    > Attached a new patch version doing so.
    >
    > Thanks
    >
    > --
    >
    > Bertrand Drouvot
    > Amazon Web Services: https://aws.amazon.com
    
    Hi,
    +   FCET_SPT,   /* startup packet timeout */
    +   FCET_BSP,   /* bad startup packet */
    
    Looking at existing enum type, such as FmgrHookEventType, the part after
    underscore is a word.
    I think it would be good to follow existing practice and make the enums
    more readable.
    
    Cheers
    
  9. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-07-05T07:07:23Z

    On Mon, Jul 4, 2022 at 6:29 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > On 7/2/22 2:49 AM, Roberto Mello wrote:
    >
    > On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
    >>
    >> That being said, I don't see why this information couldn't be provided in a
    >> system view.  IMO it is generically useful.
    >
    > +1 for a system view with appropriate permissions, in addition to the hooks.
    >
    > That would make the information easily accessible to a number or monitoring systems besides the admin.
    >
    > Agree about that.
    
    Are we going to have it as a part of shared memory stats? Or a
    separate shared memory for connection stats exposing these via a
    function and a view can be built on this function like
    pg_get_replication_slots and pg_replication_slots?
    
    > I'll start another thread and propose a dedicated patch for the "internal counters" and how to expose them.
    
    IMHO, let's have the discussion here in this thread and the patch can be 0002.
    
    Regards,
    Bharath Rupireddy.
    
    
    
    
  10. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-05T07:17:03Z

    Hi,
    
    On 7/4/22 3:12 PM, Zhihong Yu wrote:
    > Hi,
    > +   FCET_SPT,   /* startup packet timeout */
    > +   FCET_BSP,   /* bad startup packet */
    >
    > Looking at existing enum type, such as FmgrHookEventType, the part 
    > after underscore is a word.
    > I think it would be good to follow existing practice and make the 
    > enums more readable.
    
    Fair point.
    
    Attached a new version which makes the enums more readable.
    
    Thanks for the feedback
    
    Regards
    
    -- 
    Bertrand Drouvot
    Amazon Web Services:https://aws.amazon.com
    
  11. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-07-05T07:37:24Z

    On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > Hi,
    >
    > On 7/2/22 1:00 AM, Nathan Bossart wrote:
    > > Could we model this after fmgr_hook?  The first argument in that hook
    > > indicates where it is being called from.  This doesn't alleviate the need
    > > for several calls to the hook in the authentication logic, but extension
    > > authors would only need to define one hook.
    >
    > I like the idea and indeed fmgr.h looks a good place to model it.
    >
    > Attached a new patch version doing so.
    
    Thanks for the patch. Can we think of enhancing
    ClientAuthentication_hook_type itself i.e. make it a generic hook for
    all sorts of authentication metrics, info etc. with the type parameter
    embedded to it instead of new hook FailedConnection_hook?We can either
    add a new parameter for the "event" (the existing
    ClientAuthentication_hook_type implementers will have problems), or
    embed/multiplex the "event" info to existing Port structure or status
    variable (macro or enum) (existing implementers will not have
    compatibility problems).  IMO, this looks cleaner going forward.
    
    On the v2 patch:
    
    1. Why do we need to place the hook and structures in fmgr.h? Why not in auth.h?
    
    2. Timeout Handler is a signal handler, called as part of SIGALRM
    signal handler, most of the times, signal handlers ought to be doing
    small things, now that we are handing off the control to hook, which
    can do any long running work (writing to a remote storage, file,
    aggregate etc.), I don't think it's the right thing to do here.
     static void
     StartupPacketTimeoutHandler(void)
     {
    + if (FailedConnection_hook)
    + (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
    + ereport(COMMERROR,
    + (errcode(ERRCODE_PROTOCOL_VIOLATION),
    + errmsg("timeout while processing startup packet")));
    
    3. On "not letting these hooks (ClientAuthentication_hook_type or
    FailedConnection_hook_type) expose sensitive info via Port structure"
    - it seems like the Port structure has sensitive info like HbaLine,
    host address, username, etc. but that's what it is so I think we are
    okay with the structure as-is.
    
    Regards,
    Bharath Rupireddy.
    
    
    
    
  12. Re: Patch proposal: New hooks in the connection path

    Brindle, Joshua <joshuqbr@amazon.com> — 2022-07-05T13:27:06Z

    On 6/30/22 5:23 AM, Bharath Rupireddy wrote:
    > <snip>
    > On the security aspect, we must ensure we don't leak any sensitive
    > information such as password or SSH key to the new hook - if PGPORT
    > has this information, maybe we need to mask that structure a bit
    > before handing it off to the hook.
    
    Can you elaborate more on why you see this as necessary? Extensions run 
    in-process and have no real memory access limits, "masking", which 
    really means copying data to another struct, is just extra work and 
    overhead with no actual security gain, IMO.
    
    
    
    
    
    
    
  13. Re: Patch proposal: New hooks in the connection path

    Joe Conway <mail@joeconway.com> — 2022-07-05T22:11:12Z

    On 7/5/22 03:37, Bharath Rupireddy wrote:
    > On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >> On 7/2/22 1:00 AM, Nathan Bossart wrote:
    >> > Could we model this after fmgr_hook?  The first argument in that hook
    >> > indicates where it is being called from.  This doesn't alleviate the need
    >> > for several calls to the hook in the authentication logic, but extension
    >> > authors would only need to define one hook.
    >>
    >> I like the idea and indeed fmgr.h looks a good place to model it.
    >>
    >> Attached a new patch version doing so.
    
    I was thinking along the same lines, so +1 for the general approach
    
    > Thanks for the patch. Can we think of enhancing
    > ClientAuthentication_hook_type itself i.e. make it a generic hook for
    > all sorts of authentication metrics, info etc. with the type parameter
    > embedded to it instead of new hook FailedConnection_hook?We can either
    > add a new parameter for the "event" (the existing
    > ClientAuthentication_hook_type implementers will have problems), or
    > embed/multiplex the "event" info to existing Port structure or status
    > variable (macro or enum) (existing implementers will not have
    > compatibility problems).  IMO, this looks cleaner going forward.
    
    Not sure I like this though -- I'll have to think about that
    
    > On the v2 patch:
    > 
    > 1. Why do we need to place the hook and structures in fmgr.h? Why not in auth.h?
    
    agreed -- it does not belong in fmgr.h
    
    > 2. Timeout Handler is a signal handler, called as part of SIGALRM
    > signal handler, most of the times, signal handlers ought to be doing
    > small things, now that we are handing off the control to hook, which
    > can do any long running work (writing to a remote storage, file,
    > aggregate etc.), I don't think it's the right thing to do here.
    >   static void
    >   StartupPacketTimeoutHandler(void)
    >   {
    > + if (FailedConnection_hook)
    > + (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
    > + ereport(COMMERROR,
    > + (errcode(ERRCODE_PROTOCOL_VIOLATION),
    > + errmsg("timeout while processing startup packet")));
    
    Why add the ereport()?
    
    But more to Bharath's point, perhaps this is a case that is better 
    served by incrementing a stat counter and not exposed as a hook?
    
    Also, a teeny nit:
    8<--------------
    +	if (status != STATUS_OK) {
    +		if (FailedConnection_hook)
    8<--------------
    
    does not follow usual practice and probably should be:
    
    8<--------------
    +	if (status != STATUS_OK)
    +	{
    +		if (FailedConnection_hook)
    8<--------------
    
    
    -- 
    Joe Conway
    RDS Open Source Databases
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
  14. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-06T08:13:51Z

    Hi,
    
    On 7/6/22 12:11 AM, Joe Conway wrote:
    >
    > On 7/5/22 03:37, Bharath Rupireddy wrote:
    >> On Mon, Jul 4, 2022 at 6:23 PM Drouvot, Bertrand 
    >> <bdrouvot@amazon.com> wrote:
    >>> On 7/2/22 1:00 AM, Nathan Bossart wrote:
    >>> > Could we model this after fmgr_hook?  The first argument in that hook
    >>> > indicates where it is being called from.  This doesn't alleviate 
    >>> the need
    >>> > for several calls to the hook in the authentication logic, but 
    >>> extension
    >>> > authors would only need to define one hook.
    >>>
    >>> I like the idea and indeed fmgr.h looks a good place to model it.
    >>>
    >>> Attached a new patch version doing so.
    >
    > I was thinking along the same lines, so +1 for the general approach
    
    Thanks for the review!
    
    >
    >> Thanks for the patch. Can we think of enhancing
    >> ClientAuthentication_hook_type itself i.e. make it a generic hook for
    >> all sorts of authentication metrics, info etc. with the type parameter
    >> embedded to it instead of new hook FailedConnection_hook?We can either
    >> add a new parameter for the "event" (the existing
    >> ClientAuthentication_hook_type implementers will have problems), or
    >> embed/multiplex the "event" info to existing Port structure or status
    >> variable (macro or enum) (existing implementers will not have
    >> compatibility problems).  IMO, this looks cleaner going forward.
    >
    > Not sure I like this though -- I'll have to think about that
    
    Not sure about this one neither.
    
    The "enhanced" ClientAuthentication_hook will have to be fired at the 
    same places as the new FailedConnection_hook is, but i think those 
    places are not necessary linked to real authentication per say (making 
    the name confusing).
    
    >
    >> On the v2 patch:
    >>
    >> 1. Why do we need to place the hook and structures in fmgr.h? Why not 
    >> in auth.h?
    >
    > agreed -- it does not belong in fmgr.h
    
    Moved to auth.h.
    
    >
    >> 2. Timeout Handler is a signal handler, called as part of SIGALRM
    >> signal handler, most of the times, signal handlers ought to be doing
    >> small things, now that we are handing off the control to hook, which
    >> can do any long running work (writing to a remote storage, file,
    >> aggregate etc.), I don't think it's the right thing to do here.
    >>   static void
    >>   StartupPacketTimeoutHandler(void)
    >>   {
    >> + if (FailedConnection_hook)
    >> + (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
    >> + ereport(COMMERROR,
    >> + (errcode(ERRCODE_PROTOCOL_VIOLATION),
    >> + errmsg("timeout while processing startup packet")));
    >
    > Why add the ereport()?
    
    removed it.
    
    >
    > But more to Bharath's point, perhaps this is a case that is better
    > served by incrementing a stat counter and not exposed as a hook?
    
    I think that the advantage of the hook is that it gives the extension 
    author the ability/flexibility to aggregate the counter based on 
    information available in the Port Struct (say the client addr for 
    example) at this stage.
    
    What about to aggregate the stat counter based on the client addr? (Not 
    sure if there is more useful information (than the client addr) at this 
    stage though)
    
    That said, i agree that having a hook in a time out handler might not be 
    the right thing to do (even if at the end that would be to the extension 
    author responsibility to do "small things" in it), so it has been 
    removed in the new attached version.
    
    >
    > Also, a teeny nit:
    > 8<--------------
    > +       if (status != STATUS_OK) {
    > +               if (FailedConnection_hook)
    > 8<--------------
    >
    > does not follow usual practice and probably should be:
    >
    > 8<--------------
    > +       if (status != STATUS_OK)
    > +       {
    > +               if (FailedConnection_hook)
    > 8<--------------
    >
    >
    Thanks!, fixed.
    
    -- 
    
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
  15. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-06T08:18:08Z

    Hi,
    
    On 7/5/22 9:07 AM, Bharath Rupireddy wrote:
    > On Mon, Jul 4, 2022 at 6:29 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >> On 7/2/22 2:49 AM, Roberto Mello wrote:
    >>
    >> On Fri, Jul 1, 2022 at 5:00 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
    >>> That being said, I don't see why this information couldn't be provided in a
    >>> system view.  IMO it is generically useful.
    >> +1 for a system view with appropriate permissions, in addition to the hooks.
    >>
    >> That would make the information easily accessible to a number or monitoring systems besides the admin.
    >>
    >> Agree about that.
    > Are we going to have it as a part of shared memory stats? Or a
    > separate shared memory for connection stats exposing these via a
    > function and a view can be built on this function like
    > pg_get_replication_slots and pg_replication_slots?
    
    Thanks for looking at it.
    
    I don't have any proposal yet, I'll have to look at it.
    
    -- 
    
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  16. Re: Patch proposal: New hooks in the connection path

    Joe Conway <mail@joeconway.com> — 2022-07-07T19:56:10Z

    On 7/6/22 04:13, Drouvot, Bertrand wrote:
    > On 7/6/22 12:11 AM, Joe Conway wrote:
    >> On 7/5/22 03:37, Bharath Rupireddy wrote:
    >>> 2. Timeout Handler is a signal handler, called as part of SIGALRM
    >>> signal handler, most of the times, signal handlers ought to be doing
    >>> small things, now that we are handing off the control to hook, which
    >>> can do any long running work (writing to a remote storage, file,
    >>> aggregate etc.), I don't think it's the right thing to do here.
    >>>   static void
    >>>   StartupPacketTimeoutHandler(void)
    >>>   {
    >>> + if (FailedConnection_hook)
    >>> + (*FailedConnection_hook) (FCET_STARTUP_PACKET_TIMEOUT, MyProcPort);
    
    >> But more to Bharath's point, perhaps this is a case that is better
    >> served by incrementing a stat counter and not exposed as a hook?
    > 
    > I think that the advantage of the hook is that it gives the extension
    > author the ability/flexibility to aggregate the counter based on
    > information available in the Port Struct (say the client addr for
    > example) at this stage.
    > 
    > What about to aggregate the stat counter based on the client addr? (Not
    > sure if there is more useful information (than the client addr) at this
    > stage though)
    > 
    > That said, i agree that having a hook in a time out handler might not be
    > the right thing to do (even if at the end that would be to the extension
    > author responsibility to do "small things" in it), so it has been
    > removed in the new attached version.
    
    It isn't clear to me if having a hook in the timeout handler is a 
    nonstarter -- perhaps a comment with suitable warning for prospective 
    extension authors is enough? Anyone else want to weigh in on this issue 
    specifically?
    
    -- 
    Joe Conway
    RDS Open Source Databases
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
  17. Re: Patch proposal: New hooks in the connection path

    Tom Lane <tgl@sss.pgh.pa.us> — 2022-07-07T20:10:34Z

    Joe Conway <mail@joeconway.com> writes:
    > It isn't clear to me if having a hook in the timeout handler is a 
    > nonstarter -- perhaps a comment with suitable warning for prospective 
    > extension authors is enough? Anyone else want to weigh in on this issue 
    > specifically?
    
    It doesn't seem like a great place for a hook, because the list of stuff
    you could safely do there would be mighty short, possibly the empty set.
    Write to shared memory?  Not too safe.  Write to a file?  Even less.
    Write to local memory?  Pointless, because we're about to _exit(1).
    Pretty much anything I can think of that you'd want to do is something
    we've already decided the core code can't safely do, and putting it
    in a hook won't make it safer.
    
    If someone wants to argue for this hook, I'd like to see a credible
    example of a *safe* use-case, keeping in mind the points raised in
    the comments in BackendInitialize and process_startup_packet_die.
    
    			regards, tom lane
    
    
    
    
  18. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-07-08T12:24:13Z

    On Fri, Jul 8, 2022 at 1:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >
    > Joe Conway <mail@joeconway.com> writes:
    > > It isn't clear to me if having a hook in the timeout handler is a
    > > nonstarter -- perhaps a comment with suitable warning for prospective
    > > extension authors is enough? Anyone else want to weigh in on this issue
    > > specifically?
    >
    > It doesn't seem like a great place for a hook, because the list of stuff
    > you could safely do there would be mighty short, possibly the empty set.
    > Write to shared memory?  Not too safe.  Write to a file?  Even less.
    > Write to local memory?  Pointless, because we're about to _exit(1).
    > Pretty much anything I can think of that you'd want to do is something
    > we've already decided the core code can't safely do, and putting it
    > in a hook won't make it safer.
    
    I agree with this. But, all of the areas that v2-0003 touched for
    connectivity failures, they typically are emitting
    ereport(FATAL,/ereport(COMMERROR, (in ProcessStartupPacket) and we
    have emit_log_hook already being exposed and the implementers can,
    literally, do anything the hook.
    
    Looking at v2-0003 patch and emit_log_hook, how about we filter out
    for those connectivity errors either based on error codes and if they
    aren't unique, perhaps passing special flags to ereport API indicating
    that it's a connectivity error and in the emit_log_hook we can look
    for those connectivity error codes or flags to collect the stats about
    the failure connections (with MyProcPort being present in
    emit_log_hook)? This way, we don't need a new hook. Thoughts?
    
    Regards,
    Bharath Rupireddy.
    
    
    
    
  19. Re: Patch proposal: New hooks in the connection path

    Tom Lane <tgl@sss.pgh.pa.us> — 2022-07-08T13:43:22Z

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
    > On Fri, Jul 8, 2022 at 1:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >> It doesn't seem like a great place for a hook, because the list of stuff
    >> you could safely do there would be mighty short, possibly the empty set.
    
    > I agree with this. But, all of the areas that v2-0003 touched for
    > connectivity failures, they typically are emitting
    > ereport(FATAL,/ereport(COMMERROR, (in ProcessStartupPacket) and we
    > have emit_log_hook already being exposed and the implementers can,
    > literally, do anything the hook.
    
    This is utterly off-point, because those calls are not inside
    signal handlers.
    
    			regards, tom lane
    
    
    
    
  20. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-11T06:18:46Z

    Hi,
    
    On 7/7/22 10:10 PM, Tom Lane wrote:
    > Joe Conway <mail@joeconway.com> writes:
    >> It isn't clear to me if having a hook in the timeout handler is a
    >> nonstarter -- perhaps a comment with suitable warning for prospective
    >> extension authors is enough? Anyone else want to weigh in on this issue
    >> specifically?
    > It doesn't seem like a great place for a hook, because the list of stuff
    > you could safely do there would be mighty short, possibly the empty set.
    > Write to shared memory?  Not too safe.  Write to a file?  Even less.
    > Write to local memory?  Pointless, because we're about to _exit(1).
    > Pretty much anything I can think of that you'd want to do is something
    > we've already decided the core code can't safely do, and putting it
    > in a hook won't make it safer.
    >
    > If someone wants to argue for this hook, I'd like to see a credible
    > example of a *safe* use-case, keeping in mind the points raised in
    > the comments in BackendInitialize and process_startup_packet_die.
    
    The use case would be to increment a counter in shared memory (or most 
    probably within an hash table) to reflect the number of times a startup 
    packet timeout occurred.
    
    Reading the comments in/related to BackendInitialize() I understand 
    that's definitely not safe to write in shared memory for the 
    EXEC_BACKEND case, but wouldn't it be safe for the non EXEC_BACKEND case?
    
    BTW, it makes me realize that the hook being fired in the bad startup 
    packet case:
    
             /*
              * Stop here if it was bad or a cancel packet. ProcessStartupPacket
              * already did any appropriate error reporting.
              */
             if (status != STATUS_OK)
    +       {
    +               if (FailedConnection_hook)
    +                       (*FailedConnection_hook) 
    (FCET_BAD_STARTUP_PACKET, port);
                     proc_exit(0);
    +       }
    
    is not safe for the EXEC_BACKEND case.
    
    Regards,
    
    -- 
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  21. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-07-12T12:58:27Z

    Hi,
    
    On 7/11/22 8:18 AM, Drouvot, Bertrand wrote:
    > Hi,
    >
    > On 7/7/22 10:10 PM, Tom Lane wrote:
    >> Joe Conway <mail@joeconway.com> writes:
    >>> It isn't clear to me if having a hook in the timeout handler is a
    >>> nonstarter -- perhaps a comment with suitable warning for prospective
    >>> extension authors is enough? Anyone else want to weigh in on this issue
    >>> specifically?
    >> It doesn't seem like a great place for a hook, because the list of stuff
    >> you could safely do there would be mighty short, possibly the empty set.
    >> Write to shared memory?  Not too safe.  Write to a file?  Even less.
    >> Write to local memory?  Pointless, because we're about to _exit(1).
    >> Pretty much anything I can think of that you'd want to do is something
    >> we've already decided the core code can't safely do, and putting it
    >> in a hook won't make it safer.
    >>
    >> If someone wants to argue for this hook, I'd like to see a credible
    >> example of a *safe* use-case, keeping in mind the points raised in
    >> the comments in BackendInitialize and process_startup_packet_die.
    >
    > The use case would be to increment a counter in shared memory (or most 
    > probably within an hash table) to reflect the number of times a 
    > startup packet timeout occurred.
    >
    > Reading the comments in/related to BackendInitialize() I understand 
    > that's definitely not safe to write in shared memory for the 
    > EXEC_BACKEND case, but wouldn't it be safe for the non EXEC_BACKEND case?
    >
    > BTW, it makes me realize that the hook being fired in the bad startup 
    > packet case:
    >
    >         /*
    >          * Stop here if it was bad or a cancel packet. 
    > ProcessStartupPacket
    >          * already did any appropriate error reporting.
    >          */
    >         if (status != STATUS_OK)
    > +       {
    > +               if (FailedConnection_hook)
    > +                       (*FailedConnection_hook) 
    > (FCET_BAD_STARTUP_PACKET, port);
    >                 proc_exit(0);
    > +       }
    >
    > is not safe for the EXEC_BACKEND case.
    >
    What about the idea to trigger the hook for the STARTUP PACKET TIMEOUT 
    and BAD STARTUP PACKET only for the non EXEC_BACKEND/Windows cases?
    
    I'm tempted to think it's better to have some cases where one could 
    benefit from the hook as opposed to none.
    
    Thoughts?
    
    Regards,
    
    -- 
    
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  22. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-07-14T09:43:37Z

    On Fri, Jul 8, 2022 at 5:54 PM Bharath Rupireddy
    <bharath.rupireddyforpostgres@gmail.com> wrote:
    >
    > Looking at v2-0003 patch and emit_log_hook, how about we filter out
    > for those connectivity errors either based on error codes and if they
    > aren't unique, perhaps passing special flags to ereport API indicating
    > that it's a connectivity error and in the emit_log_hook we can look
    > for those connectivity error codes or flags to collect the stats about
    > the failure connections (with MyProcPort being present in
    > emit_log_hook)? This way, we don't need a new hook. Thoughts?
    
    Bertrand and Other Hackers, above comment may have been lost in the
    wild - any thoughts on it?
    
    Regards,
    Bharath Rupireddy.
    
    
    
    
  23. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-08-02T13:23:35Z

    Hi Bharath,
    
    On 7/14/22 11:43 AM, Bharath Rupireddy wrote:
    > On Fri, Jul 8, 2022 at 5:54 PM Bharath Rupireddy
    > <bharath.rupireddyforpostgres@gmail.com> wrote:
    >> Looking at v2-0003 patch and emit_log_hook, how about we filter out
    >> for those connectivity errors either based on error codes and if they
    >> aren't unique, perhaps passing special flags to ereport API indicating
    >> that it's a connectivity error and in the emit_log_hook we can look
    >> for those connectivity error codes or flags to collect the stats about
    >> the failure connections (with MyProcPort being present in
    >> emit_log_hook)? This way, we don't need a new hook. Thoughts?
    > Bertrand and Other Hackers, above comment may have been lost in the
    > wild - any thoughts on it?
    
    Thanks for your feedback!
    
    I can see 2 issues with that approach:
    
    - We’ll not be able to track the “startup timeout case” (well, we may 
    not be able to track it anyway depending of what next to [1] will be) as 
    it does not emit any log messages.
    - We’ll depend of the log_min_messages value (means 
    edata->output_to_server needs to be true for the emit_log_hook to be 
    triggered).
    
    [1]: 
    https://www.postgresql.org/message-id/a1558d12-c1c4-0fe5-f8a5-2b6c2294e55f%40amazon.com
    
    Regards,
    
    -- 
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  24. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-08-08T10:51:20Z

    Hi,
    
    On 7/12/22 2:58 PM, Drouvot, Bertrand wrote:
    > Hi,
    >
    > On 7/11/22 8:18 AM, Drouvot, Bertrand wrote:
    >> Hi,
    >>
    >> The use case would be to increment a counter in shared memory (or 
    >> most probably within an hash table) to reflect the number of times a 
    >> startup packet timeout occurred.
    >>
    >> Reading the comments in/related to BackendInitialize() I understand 
    >> that's definitely not safe to write in shared memory for the 
    >> EXEC_BACKEND case, but wouldn't it be safe for the non EXEC_BACKEND 
    >> case?
    >>
    >> BTW, it makes me realize that the hook being fired in the bad startup 
    >> packet case:
    >>
    >>         /*
    >>          * Stop here if it was bad or a cancel packet. 
    >> ProcessStartupPacket
    >>          * already did any appropriate error reporting.
    >>          */
    >>         if (status != STATUS_OK)
    >> +       {
    >> +               if (FailedConnection_hook)
    >> +                       (*FailedConnection_hook) 
    >> (FCET_BAD_STARTUP_PACKET, port);
    >>                 proc_exit(0);
    >> +       }
    >>
    >> is not safe for the EXEC_BACKEND case.
    >>
    > What about the idea to trigger the hook for the STARTUP PACKET TIMEOUT 
    > and BAD STARTUP PACKET only for the non EXEC_BACKEND/Windows cases?
    >
    > I'm tempted to think it's better to have some cases where one could 
    > benefit from the hook as opposed to none.
    >
    > Thoughts?
    
    Please find attached v2-0004-connection_hooks.patch as an attempt of 
    doing so.
    
    Thanks
    
    -- 
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
  25. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-08-13T08:47:33Z

    On Tue, Aug 2, 2022 at 6:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > Hi Bharath,
    >
    > On 7/14/22 11:43 AM, Bharath Rupireddy wrote:
    > > On Fri, Jul 8, 2022 at 5:54 PM Bharath Rupireddy
    > > <bharath.rupireddyforpostgres@gmail.com> wrote:
    > >> Looking at v2-0003 patch and emit_log_hook, how about we filter out
    > >> for those connectivity errors either based on error codes and if they
    > >> aren't unique, perhaps passing special flags to ereport API indicating
    > >> that it's a connectivity error and in the emit_log_hook we can look
    > >> for those connectivity error codes or flags to collect the stats about
    > >> the failure connections (with MyProcPort being present in
    > >> emit_log_hook)? This way, we don't need a new hook. Thoughts?
    > > Bertrand and Other Hackers, above comment may have been lost in the
    > > wild - any thoughts on it?
    >
    > Thanks for your feedback!
    >
    > I can see 2 issues with that approach:
    >
    > - We’ll not be able to track the “startup timeout case” (well, we may
    > not be able to track it anyway depending of what next to [1] will be) as
    > it does not emit any log messages.
    >
    > [1]:
    > https://www.postgresql.org/message-id/a1558d12-c1c4-0fe5-f8a5-2b6c2294e55f%40amazon.com
    
    Yes, we wanted to be very quick in StartupPacketTimeoutHandler because
    it is a timeout signal handler after all.
    
    > - We’ll depend of the log_min_messages value (means
    > edata->output_to_server needs to be true for the emit_log_hook to be
    > triggered).
    
    Hm, we can just say that 'log_min_message setting will enable/disable
    the feature'.
    
    I agree with your first point of not having an error in
    StartupPacketTimeoutHandler hence I don't think using emit log hook
    for the connection failure stats helps.
    
    -- 
    Bharath Rupireddy
    RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
    
    
    
    
  26. Re: Patch proposal: New hooks in the connection path

    Gurjeet Singh <gurjeet@singh.im> — 2022-08-14T05:45:53Z

    On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > Please find attached v2-0004-connection_hooks.patch
    
         /*
          * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
          * already did any appropriate error reporting.
          */
         if (status != STATUS_OK)
    +    {
    +#ifndef EXEC_BACKEND
    +        if (FailedConnection_hook)
    +            (*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
    +#endif
             proc_exit(0);
    +    }
    
    Per the comment above the if condition, the `status != OK` may
    represent a cancel packet, as well. Clearly, a cancel packet is not
    the same as a _bad_ packet. So I think here you need to differentiate
    between a cancel packet and a genuinely bad packet; I don't see
    anything good coming good out of us, or the hook-developer, lumping
    those 2 cases together.
    
    I think we can reduce the number of places the hook is called, if we
    call the hook from proc_exit(), and all the other places we simply set
    a global variable to signify the reason for the failure. The case of
    _exit(1) from the signal-handler cannot use such a mechanism, but I
    think all the other cases of interest can simply register one of the
    FCET_* value, and the hook call from proc_exit() can pass that value
    to the hook.
    
    If we can convinces ourselves that we can use proc_exit(1) in
    StartupPacketTimeoutHandler(), instead of calling _exit(1), I think we
    cal eliminate replace all call sites for this hook with
    set-global-variable variant.
    
    > ...
    > *     This should be the only function to call exit().
    > *     -cim 2/6/90
    >...
    > proc_exit(int code)
    
    The comment on proc_exit() claims that should be the only place
    calling exit(), except the add-on/extension hooks. So there must be a
    strong reason why the signal-handler uses _exit() to bypass all
    callbacks.
    
    Best regards,
    Gurjeet
    http://Gurje.et
    
    
    
    
  27. Re: Patch proposal: New hooks in the connection path

    Gurjeet Singh <gurjeet@singh.im> — 2022-08-14T05:52:33Z

    (reposting the same review, with many grammatical fixes)
    
    On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > Please find attached v2-0004-connection_hooks.patch
    
         /*
          * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
          * already did any appropriate error reporting.
          */
         if (status != STATUS_OK)
    +    {
    +#ifndef EXEC_BACKEND
    +        if (FailedConnection_hook)
    +            (*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
    +#endif
             proc_exit(0);
    +    }
    
    Per the comment above the if condition, the `status != OK` may
    represent a cancel packet, as well. Clearly, a cancel packet is not
    the same as a _bad_ packet. So I think here you need to differentiate
    between a cancel packet and a genuinely bad packet; I don't see
    anything good coming out of us, or the hook-developer, lumping
    those 2 cases together.
    
    I think we can reduce the number of places the hook is called, if we
    call the hook from proc_exit(), and at all the other places we simply set
    a global variable to signify the reason for the failure. The case of
    _exit(1) from the signal-handler cannot use such a mechanism, but I
    think all the other cases of interest can simply register one of the
    FCET_* values, and let the call from proc_exit() pass that value
    to the hook.
    
    If we can convince ourselves that we can use proc_exit(1) in
    StartupPacketTimeoutHandler(), instead of calling _exit(1), I think we
    cal replace all call sites for this hook with the
    set-global-variable variant.
    
    > ...
    > *     This should be the only function to call exit().
    > *     -cim 2/6/90
    >...
    > proc_exit(int code)
    
    The comment on proc_exit() claims that it should be the only place
    calling exit(), except that the add-on/extension hooks may ignore this.
    So there must be a strong reason why the signal-handler uses _exit()
    to bypass all callbacks.
    
    Best regards,
    Gurjeet
    http://Gurje.et
    
    
    
    
  28. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-08-16T08:01:13Z

    Hi,
    
    On 8/14/22 7:52 AM, Gurjeet Singh wrote:
    > On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >> Please find attached v2-0004-connection_hooks.patch
    >       /*
    >        * Stop here if it was bad or a cancel packet.  ProcessStartupPacket
    >        * already did any appropriate error reporting.
    >        */
    >       if (status != STATUS_OK)
    > +    {
    > +#ifndef EXEC_BACKEND
    > +        if (FailedConnection_hook)
    > +            (*FailedConnection_hook) (FCET_BAD_STARTUP_PACKET, port);
    > +#endif
    >           proc_exit(0);
    > +    }
    >
    > Per the comment above the if condition, the `status != OK` may
    > represent a cancel packet, as well. Clearly, a cancel packet is not
    > the same as a _bad_ packet. So I think here you need to differentiate
    > between a cancel packet and a genuinely bad packet; I don't see
    > anything good coming out of us, or the hook-developer, lumping
    > those 2 cases together.
    
    Thanks for the feedback!
    
    Yeah, good point. I agree that it would be better to make the distinction.
    
    > I think we can reduce the number of places the hook is called, if we
    > call the hook from proc_exit(), and at all the other places we simply set
    > a global variable to signify the reason for the failure. The case of
    > _exit(1) from the signal-handler cannot use such a mechanism, but I
    > think all the other cases of interest can simply register one of the
    > FCET_* values, and let the call from proc_exit() pass that value
    > to the hook.
    
    That looks like a good idea to me. I'm tempted to rewrite the patch that 
    way (and addressing the first comment in the same time).
    
    Curious to hear about others hackers thoughts too.
    
    > If we can convince ourselves that we can use proc_exit(1) in
    > StartupPacketTimeoutHandler(), instead of calling _exit(1), I think we
    > cal replace all call sites for this hook with the
    > set-global-variable variant.
    
    I can see why this is not safe in the EXEC_BACKEND case, but it's not 
    clear to me why it would not be safe in the non EXEC_BACKEND case.
    
    >> ...
    >> *     This should be the only function to call exit().
    >> *     -cim 2/6/90
    >> ...
    >> proc_exit(int code)
    > The comment on proc_exit() claims that it should be the only place
    > calling exit(), except that the add-on/extension hooks may ignore this.
    > So there must be a strong reason why the signal-handler uses _exit()
    > to bypass all callbacks.
    
    yeah.
    
    Regards,
    
    -- 
    
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  29. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-08-16T08:10:29Z

    On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > On 8/14/22 7:52 AM, Gurjeet Singh wrote:
    > > On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > > I think we can reduce the number of places the hook is called, if we
    > > call the hook from proc_exit(), and at all the other places we simply set
    > > a global variable to signify the reason for the failure. The case of
    > > _exit(1) from the signal-handler cannot use such a mechanism, but I
    > > think all the other cases of interest can simply register one of the
    > > FCET_* values, and let the call from proc_exit() pass that value
    > > to the hook.
    >
    > That looks like a good idea to me. I'm tempted to rewrite the patch that
    > way (and addressing the first comment in the same time).
    >
    > Curious to hear about others hackers thoughts too.
    
    IMO, calling the hook from proc_exit() is not a good design as
    proc_exit() is a generic code called from many places in the source
    code, even the simple code of kind  if(call_failed_conn_hook) {
    falied_conn_hook(params);} can come in the way of many exit code paths
    which is undesirable, and the likelihood of introducing new bugs may
    increase.
    
    -- 
    Bharath Rupireddy
    RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
    
    
    
    
  30. Re: Patch proposal: New hooks in the connection path

    bdrouvotAWS <bdrouvot@amazon.com> — 2022-08-16T08:25:19Z

    Hi,
    
    On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
    > On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
    >>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >>> I think we can reduce the number of places the hook is called, if we
    >>> call the hook from proc_exit(), and at all the other places we simply set
    >>> a global variable to signify the reason for the failure. The case of
    >>> _exit(1) from the signal-handler cannot use such a mechanism, but I
    >>> think all the other cases of interest can simply register one of the
    >>> FCET_* values, and let the call from proc_exit() pass that value
    >>> to the hook.
    >> That looks like a good idea to me. I'm tempted to rewrite the patch that
    >> way (and addressing the first comment in the same time).
    >>
    >> Curious to hear about others hackers thoughts too.
    > IMO, calling the hook from proc_exit() is not a good design as
    > proc_exit() is a generic code called from many places in the source
    > code, even the simple code of kind  if(call_failed_conn_hook) {
    > falied_conn_hook(params);} can come in the way of many exit code paths
    > which is undesirable, and the likelihood of introducing new bugs may
    > increase.
    
    Thanks for the feedback.
    
    What do you think about calling the hook only if the new global variable 
    is not equal to its default value (which would mean don't trigger the 
    hook)?
    
    Regards,
    
    -- 
    Bertrand Drouvot
    Amazon Web Services: https://aws.amazon.com
    
    
    
    
    
  31. Re: Patch proposal: New hooks in the connection path

    Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> — 2022-08-16T10:16:00Z

    On Tue, Aug 16, 2022 at 1:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    >
    > Hi,
    >
    > On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
    > > On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > >> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
    > >>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > >>> I think we can reduce the number of places the hook is called, if we
    > >>> call the hook from proc_exit(), and at all the other places we simply set
    > >>> a global variable to signify the reason for the failure. The case of
    > >>> _exit(1) from the signal-handler cannot use such a mechanism, but I
    > >>> think all the other cases of interest can simply register one of the
    > >>> FCET_* values, and let the call from proc_exit() pass that value
    > >>> to the hook.
    > >> That looks like a good idea to me. I'm tempted to rewrite the patch that
    > >> way (and addressing the first comment in the same time).
    > >>
    > >> Curious to hear about others hackers thoughts too.
    > > IMO, calling the hook from proc_exit() is not a good design as
    > > proc_exit() is a generic code called from many places in the source
    > > code, even the simple code of kind  if(call_failed_conn_hook) {
    > > falied_conn_hook(params);} can come in the way of many exit code paths
    > > which is undesirable, and the likelihood of introducing new bugs may
    > > increase.
    >
    > Thanks for the feedback.
    >
    > What do you think about calling the hook only if the new global variable
    > is not equal to its default value (which would mean don't trigger the
    > hook)?
    
    IMO, that's not a good design as explained above. Why should the
    failed connection hook related code get hit for each and every
    proc_exit() call? Here, the code duplication i.e. the number of places
    the failed connection hook gets called mustn't be the reason to move
    that code to proc_exit().
    
    -- 
    Bharath Rupireddy
    RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
    
    
    
    
  32. Re: Patch proposal: New hooks in the connection path

    Gurjeet Singh <gurjeet@singh.im> — 2022-08-16T14:26:33Z

    On Tue, Aug 16, 2022 at 3:16 AM Bharath Rupireddy
    <bharath.rupireddyforpostgres@gmail.com> wrote:
    >
    > On Tue, Aug 16, 2022 at 1:55 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > >
    > > Hi,
    > >
    > > On 8/16/22 10:10 AM, Bharath Rupireddy wrote:
    > > > On Tue, Aug 16, 2022 at 1:31 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > > >> On 8/14/22 7:52 AM, Gurjeet Singh wrote:
    > > >>> On Mon, Aug 8, 2022 at 3:51 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
    > > >>> I think we can reduce the number of places the hook is called, if we
    > > >>> call the hook from proc_exit(), and at all the other places we simply set
    > > >>> a global variable to signify the reason for the failure. The case of
    > > >>> _exit(1) from the signal-handler cannot use such a mechanism, but I
    > > >>> think all the other cases of interest can simply register one of the
    > > >>> FCET_* values, and let the call from proc_exit() pass that value
    > > >>> to the hook.
    > > >> That looks like a good idea to me. I'm tempted to rewrite the patch that
    > > >> way (and addressing the first comment in the same time).
    > > >>
    > > >> Curious to hear about others hackers thoughts too.
    
    I agree that we need feedback from long-timers here,  on the decision
    of whether to use proc_exit() for this purpose.
    
    > > > IMO, calling the hook from proc_exit() is not a good design as
    > > > proc_exit() is a generic code called from many places in the source
    > > > code, even the simple code of kind  if(call_failed_conn_hook) {
    > > > falied_conn_hook(params);} can come in the way of many exit code paths
    > > > which is undesirable, and the likelihood of introducing new bugs may
    > > > increase.
    > >
    > > Thanks for the feedback.
    > >
    > > What do you think about calling the hook only if the new global variable
    > > is not equal to its default value (which would mean don't trigger the
    > > hook)?
    >
    > IMO, that's not a good design as explained above. Why should the
    > failed connection hook related code get hit for each and every
    > proc_exit() call? Here, the code duplication i.e. the number of places
    > the failed connection hook gets called mustn't be the reason to move
    > that code to proc_exit().
    
    I agree, it doesn't feel _clean_, having to maintain a global
    variable, pass it to hook at exit, etc. But the alternative feels less
    cleaner.
    
    This hook needs to be called when the process has decided to exit, so
    it makes sense to place this call in stack above proc_exit(), whose
    sole job is to let the process die gracefully, and take care of things
    on the way out.
    
    There are quite a few places in core that leverage proc_exit()'s
    facilities (by registering on_proc_exit callbacks), so an
    extension/hook doing so wouldn't be out of the ordinary; (apparently
    contrib/sepgsql has already set the precedent on an extension using
    the on_proc_exit callback). Admittedly, in this case the core will be
    managing and passing it the additional global variable needed to
    record the connection failure reason, FCET_*.
    
    If we agree that proc_exit() is a good place to place this call, then
    this hook can be converted into a on_proc_exit callback. If the global
    variable is exported, then the extension(s) can access it in the
    callback to ascertain why the process is exiting, and proc_exit()
    won't have to know anything special about the extension, or hook, or
    the global variable.
    
    The on_proc_exit callback method wouldn't work for the _exit() called
    in StartupPacketTimeoutHandler(), so that will need to be handled
    separately.
    
    Best regards,
    Gurjeet
    http://Gurje.et
    
    
    
    
  33. Re: Patch proposal: New hooks in the connection path

    Gregory Stark (as CFM) <stark.cfm@gmail.com> — 2023-04-03T22:08:06Z

    This looks like it was a good discussion -- last summer. But it
    doesn't seem to be a patch under active development now.
    
    It sounds like there were some design constraints that still need some
    new ideas to solve and a new patch will be needed to address them.
    
    Should this be marked Returned With Feedback?
    
    -- 
    Gregory Stark
    As Commitfest Manager
    
    
    
    
  34. Re: Patch proposal: New hooks in the connection path

    Bertrand Drouvot <bertranddrouvot.pg@gmail.com> — 2023-04-04T10:11:55Z

    Hi,
    
    On 4/4/23 12:08 AM, Gregory Stark (as CFM) wrote:
    > This looks like it was a good discussion -- last summer. But it
    > doesn't seem to be a patch under active development now.
    > 
    > It sounds like there were some design constraints that still need some
    > new ideas to solve and a new patch will be needed to address them.
    > 
    > Should this be marked Returned With Feedback?
    > 
    
    I just marked it as Returned With Feedback.
    
    I may re-open it later on to resume the discussion or share
    new ideas though.
    
    Regards,
    
    -- 
    Bertrand Drouvot
    PostgreSQL Contributors Team
    RDS Open Source Databases
    Amazon Web Services: https://aws.amazon.com