Thread

  1. Re: Make PQgetResult() not return NULL on out-of-memory error

    Yugo Nagata <nagata@sraoss.co.jp> — 2025-11-12T07:52:16Z

    On Wed, 12 Nov 2025 16:20:13 +0900
    Yugo Nagata <nagata@sraoss.co.jp> wrote:
    
    > On Tue, 11 Nov 2025 20:16:11 +0900
    > Fujii Masao <masao.fujii@gmail.com> wrote:
    > 
    > > On Tue, Nov 11, 2025 at 2:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > > >
    > > > Fujii Masao <masao.fujii@gmail.com> writes:
    > > > > To address this, callers need a way to distinguish between PGRES_FATAL_ERROR
    > > > > and OOM. Functions that loop until PQgetResult() returns NULL should continue
    > > > > if the result is PGRES_FATAL_ERROR, but should break if it's an OOM.
    > > >
    > > > Not sure about that.  We might or might not be able to make progress
    > > > if the caller keeps calling PQgetResult.  But if it stops after an
    > > > OOM report then we might as well just close the connection, because
    > > > there is no hope of getting back in sync.
    > > >
    > > > I'm inclined to think that it's correct that we should return
    > > > OOM_result not NULL if we couldn't make a PGresult, but further
    > > > analysis is needed to make sure that libpq can make progress
    > > > afterwards.  I don't think we should expect applications to
    > > > involve themselves in that recovery logic, because they won't,
    > > > and most certainly won't test it carefully.
    > > >
    > > > The whole project might be doomed to failure really.  I think
    > > > that one very likely failure if we are approaching OOM is that
    > > > we can't enlarge libpq's input buffer enough to accept all of
    > > > a (long) incoming server message.  What hope have we got of
    > > > getting out of that situation?
    > > 
    > > When pqCheckInBufferSpace() fails to enlarge the input buffer and PQgetResult()
    > > returns PGRES_FATAL_ERROR, I noticed that PQgetResult() sets asyncStatus to
    > > PGASYNC_IDLE. This means subsequent calls to PQgetResult() will return NULL
    > > even in OOM case, so functions looping until NULL won't end up in an
    > > infinite loop.
    > > Of course, issuing another query on the same connection afterward could be
    > > problematic, though.
    > > 
    > > I'm still not sure this behavior is correct, but I'm just wondering if
    > > asyncStatus should be reset to PGASYNC_IDLE also when OOM_result is returned.
    > 
    > I agree that PQgetResult() should return NULL after returning OOM_result, and
    > resetting asyncStatus to PGASYNC_IDLE would work in the normal mode.
    > 
    > I also agree that continuing to use the same connection seems problematic,
    > especially in pipeline mode; the pipeline status remains PQ_PIPELINE_OK, PQgetResult()
    > never returns PGRES_PIPELINE_SYNC, and the sent commands are left in the command queue.
    > 
    > Therefore, I wonder about closing the connection and resetting the status
    > when OOM_result is retunred, by callling pqDropConnection() as handleFatalError() does.
    > In this case, the connection status should also be set to CONNECTINO_BAD.
    > 
    > This would prevent applications from continueing to use potentially problamatic
    > connection without providing the way to distinguish OOM from other errors.
    > 
    > What do you think?
    > 
    > I've attached an updated patch in this direction.
    
    I'm sorry, the patch attached to my previous post was incorrect.
    I've attached the correct one.
    
    Regards,
    Yugo Nagata
    
    -- 
    Yugo Nagata <nagata@sraoss.co.jp>