Thread

  1. General Bug Report: Server hangs, with backends still running on accept() failure

    Unprivileged user <nobody> — 1999-05-27T18:30:20Z

    ============================================================================
                            POSTGRESQL BUG REPORT TEMPLATE
    ============================================================================
    
    
    Your name               : Shez
    Your email address      : shez@nsl.net
    
    Category                : runtime: back-end
    Severity                : critical
    
    Summary: Server hangs, with backends still running, on accept() failure
    
    System Configuration
    --------------------
      Operating System   : RH Linux 5.2 (x86 on K6II)
    
      PostgreSQL version : 6.4.2
    
      Compiler used      : gcc-2.7.2.3-14
    
    Hardware:
    ---------
    RH Linux 5.2 (x86 on K6II)
    Linux media.nsl.net 2.0.36 #1 Tue Oct 13 22:17:11 EDT 1998 i586 unknown
    
    
    Versions of other tools:
    ------------------------
    
    
    --------------------------------------------------------------------------
    
    Problem Description:
    --------------------
    A lightly loaded server, with about 16 backends running at
    any time, hung completely.  New connection attempts hung, and
    the existing backend processes hung as well.  The last entry
    in the log was:
    ERROR:  postmaster: StreamConnection: accept: Invalid argument
    This seems to have come up on the mailing list,
    but I could find no resolution there.
    As far as I can see, it happens when the accept() call fails on
    an inet connection.
    I have a small test program (below) which causes a similar but
    not identical crash.
    
    --------------------------------------------------------------------------
    
    Test Case:
    ----------
    The test program connects to a given host and port and then
    attempts to kill itself before the operation can be
    completed, so that the accept() call on the server
    fails with:
    StreamConnection: accept: Connection reset by peer
    
    I have put it up at http://sheznet.nsl.net/crash.c rather
    than trying to paste it into this form.
    Usage: ./crash dbhost 5432
    The crash program only works for me when run from a different
    machine than the server.
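
The mechanism described above, a client that opens a connection and then abandons it abruptly, can be sketched as follows. This is not the crash.c from the URL above, whose contents are not reproduced here. It is a minimal illustration under one assumption: that the connection is abandoned by setting SO_LINGER with a zero timeout, so that close() sends a TCP reset instead of an orderly shutdown. The function name connect_and_abort is hypothetical. Whether the server then sees accept() fail depends on timing and on the server's operating system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from crash.c: connect to host:port and
 * then abort the connection. SO_LINGER with a zero timeout makes close()
 * discard the connection with a TCP RST rather than a normal FIN.
 * Returns 0 if a connection was made and aborted, -1 on any failure. */
int connect_and_abort(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    struct linger lg;
    int fd = -1;

    assert(host != NULL && port != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;

    /* Linger enabled with a zero timeout: close() resets the connection. */
    lg.l_onoff = 1;
    lg.l_linger = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) != 0) {
        close(fd);
        return -1;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

A call such as connect_and_abort("dbhost", "5432") corresponds to the ./crash dbhost 5432 usage above, with "dbhost" standing in for the server's host name.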
    
    
    Note that this is a serious security issue: anybody who
    can send packets to a listening server can hang it.
    
    
    --------------------------------------------------------------------------
    
    Solution:
    ---------
    I believe that when accept() fails, the backend should at
    least shut the server down properly so that supervise programs
    can restart it, or else just silently ignore the failure.
    Ignoring it has worked for me in similar situations.
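
The two options suggested above can be sketched as an accept loop that ignores failures caused by the peer going away and reports everything else to the caller, which can then exit cleanly. This is a sketch under stated assumptions and is not PostgreSQL source code: the names classify_accept_error and accept_connection are hypothetical, and the choice of which errno values count as ignorable is one plausible reading of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical classification, not PostgreSQL code. */
enum accept_action { ACCEPT_RETRY, ACCEPT_FATAL };

/* Decide what to do with an errno value left behind by accept(). */
enum accept_action classify_accept_error(int err)
{
    switch (err) {
    case EINTR:         /* interrupted by a signal */
    case ECONNABORTED:  /* peer gave up before accept() ran */
    case ECONNRESET:    /* "Connection reset by peer" */
    case EPROTO:        /* protocol error on the new connection */
        return ACCEPT_RETRY;
    default:            /* e.g. EINVAL, EBADF: the listener is unusable */
        return ACCEPT_FATAL;
    }
}

/* Accept one connection on listen_fd. Failures caused by the client
 * are skipped. Returns the new descriptor, or -1 when the error is not
 * recoverable, so the caller can exit and be restarted by a supervisor. */
int accept_connection(int listen_fd)
{
    assert(listen_fd >= 0);
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        int err;

        if (fd >= 0)
            return fd;
        err = errno;  /* save before any call that might change it */
        if (classify_accept_error(err) == ACCEPT_FATAL) {
            fprintf(stderr, "accept: %s\n", strerror(err));
            return -1;
        }
        /* Ignorable failure: keep serving other clients. */
    }
}
```

With this split, a "Connection reset by peer" failure only costs one loop iteration, while an "Invalid argument" failure reaches the caller as -1 instead of leaving the server hung.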
    
    Sorry I don't have time to investigate further, but please
    feel free to contact me if you have further questions.
    
    Sincerely,
    Shez
    
    --------------------------------------------------------------------------