Thread

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Reset reindex-in-progress state before reverifying an exclusion constraint.

  1. Postmaster holding unlinked files for pg_largeobject table

    Alex Shulgin <ash@commandprompt.com> — 2011-06-03T21:45:28Z

    Hello Hackers,
    
    There is some strange behavior we're experiencing with one of the customer's DBs (8.4)
    
    We've noticed that free disk space went down heavily on a system, and after a short analysis determined that the reason was that postmaster was holding lots of unlinked files open.  A sample of lsof output was something like this:
    
    postmaste 15484  postgres   57u      REG              253,0 1073741824   41125093 /srv/pgsql/data/base/352483309/2613.2 (deleted)
    postmaste 15484  postgres   58u      REG              253,0 1073741824   41125094 /srv/pgsql/data/base/352483309/2613.3 (deleted)
    postmaste 15484  postgres   59u      REG              253,0 1073741824   41125095 /srv/pgsql/data/base/352483309/2613.4 (deleted)
    
    There were about 450 such (or similar) files, all of them having /2613 in the filename.  Since 2613 is a regclass of pg_largeobject and we are indeed working with quite a few large objects in that DB so this is where our problem lies we suspect.
    
    Restarting PostgreSQL obviously helps the issue and the disk space occupied by those unlinked files (about 63GB actually) is reclaimed.
    
    So what happens on that host is that we drop/restore a fresh version of the DB from the production host, followed by a migration script which among other things loads around 16GB of data files as large objects.  This happens nightly.
    
    But if we go and run the whole drop/restore and migration manually, the problem doesn't manifest itself right after migration is successfully finished.
    
    Our next thought was that it must be dropdb part of the nightly script that removes the pg_largeobject's files (still we don't know what makes it keep them opened,) but dropping the DB doesn't manifest the problem either.
    
    I'm currently running a VACUUM pg_largeobject on the problematic DB, to see if it triggers the problem, but this didn't happen so far.
    
    At this point it would be nice to hear what are your thoughts.  What could cause such behavior?
    
    --
    Regards,
    Alex
    
    
  2. Re: [HACKERS] Postmaster holding unlinked files for pg_largeobject table

    Jeff Davis <pgsql@j-davis.com> — 2011-06-03T22:24:28Z

    [ For future reference, -general is the appropriate list. Moving
    discussion there. ]
    
    On Sat, 2011-06-04 at 00:45 +0300, Alexander Shulgin wrote:
    > We've noticed that free disk space went down heavily on a system, and
    > after a short analysis determined that the reason was that postmaster
    > was holding lots of unlinked files open.  A sample of lsof output was
    > something like this:
    
    ...
    
    > Restarting PostgreSQL obviously helps the issue and the disk space
    > occupied by those unlinked files (about 63GB actually) is reclaimed.
    
    Normally postgres closes unlinked files during a checkpoint. How long
    between checkpoints on this system? Is it possible that you noticed
    before postgresql caused an automatic checkpoint?
    
    Also, you can do a manual checkpoint with the CHECKPOINT command.
    
    Regards,
    	Jeff Davis
    
    
    
  3. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-04T01:32:56Z

    Excerpts from Alexander Shulgin's message of vie jun 03 17:45:28 -0400 2011:
    
    > There were about 450 such (or similar) files, all of them having /2613 in the filename.  Since 2613 is a regclass of pg_largeobject and we are indeed working with quite a few large objects in that DB so this is where our problem lies we suspect.
    > 
    > Restarting PostgreSQL obviously helps the issue and the disk space occupied by those unlinked files (about 63GB actually) is reclaimed.
    > 
    > So what happens on that host is that we drop/restore a fresh version of the DB from the production host, followed by a migration script which among other things loads around 16GB of data files as large objects.  This happens nightly.
    
    What surprises me is that the open references remain after a database
    drop.  Surely this means that no backends keep open file descriptors to
    any table in that database, because there are no connections.
    
    I also requested Alexander to run a checkpoint and see if that made the
    FDs go away (on the theory that bgwriter could be the culprit) -- no dice.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  4. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-04T16:49:05Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > What surprises me is that the open references remain after a database
    > drop.  Surely this means that no backends keep open file descriptors to
    > any table in that database, because there are no connections.
    
    bgwriter ...
    
    			regards, tom lane
    
    
  5. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-06T15:45:08Z

    Excerpts from Tom Lane's message of sáb jun 04 12:49:05 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    > > What surprises me is that the open references remain after a database
    > > drop.  Surely this means that no backends keep open file descriptors to
    > > any table in that database, because there are no connections.
    > 
    > bgwriter ...
    
    Actually you were both wrong, hah.  It's not bgwriter, and this doesn't
    belong on pgsql-general.  It's a backend.  However, as we mentioned
    initially, the database to which this file belongs is dropped.
    
    What we found out after more careful investigation is that the file is
    kept open by a backend connected to a different database.  I have a
    suspicion that what happened here is that this backend was forced to
    flush out a page from shared buffers to read some other page; and it was
    forced to do a fsync of this file.  And then it forgets to close the
    file descriptor.
    
    Actually, there are 11 processes holding open file descriptors to the
    table, each to a slightly different subset of the many segments of the
    table.  (There's also one holding a FD to the deleted
    pg_largeobject_loid_pn_index -- remember, this is a dropped database).
    All those backends belong to Zabbix, the monitoring platform, which are
    connected to a different database.
    
    I think what we have here is a new bug.
    
    (This is running 8.4.8, by the way.)
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  6. Re: Postmaster holding unlinked files for pg_largeobject table

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-06-06T15:58:51Z

    Alvaro Herrera <alvherre@commandprompt.com> wrote:
     
    > What we found out after more careful investigation is that the
    > file is kept open by a backend connected to a different database. 
    > I have a suspicion that what happened here is that this backend
    > was forced to flush out a page from shared buffers to read some
    > other page; and it was forced to do a fsync of this file.  And
    > then it forgets to close the file descriptor.
     
    This sounds vaguely similar to what I found with WAL files being
    held open for days after they were deleted by read-only backends:
     
    http://archives.postgresql.org/message-id/15412.1259630304@sss.pgh.pa.us
     
    I mention it only because there might be one place to fix both....
     
    -Kevin
    
    
  7. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-06T16:04:51Z

    Excerpts from Kevin Grittner's message of lun jun 06 11:58:51 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> wrote:
    >  
    > > What we found out after more careful investigation is that the
    > > file is kept open by a backend connected to a different database. 
    > > I have a suspicion that what happened here is that this backend
    > > was forced to flush out a page from shared buffers to read some
    > > other page; and it was forced to do a fsync of this file.  And
    > > then it forgets to close the file descriptor.
    >  
    > This sounds vaguely similar to what I found with WAL files being
    > held open for days after they were deleted by read-only backends:
    >  
    > http://archives.postgresql.org/message-id/15412.1259630304@sss.pgh.pa.us
    >  
    > I mention it only because there might be one place to fix both....
    
    Hmm interesting.  I don't think the placement suggested by Tom would be
    useful, because the Zabbix backends are particularly busy all the time,
    so they wouldn't run ProcessCatchupEvent at all.
    
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  8. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-06T16:10:24Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Excerpts from Kevin Grittner's message of lun jun 06 11:58:51 -0400 2011:
    >> Alvaro Herrera <alvherre@commandprompt.com> wrote:
    >>> What we found out after more careful investigation is that the
    >>> file is kept open by a backend connected to a different database. 
    >>> I have a suspicion that what happened here is that this backend
    >>> was forced to flush out a page from shared buffers to read some
    >>> other page; and it was forced to do a fsync of this file.  And
    >>> then it forgets to close the file descriptor.
    
    It doesn't "forget" to close the descriptor; it intentionally keeps it
    for possible further use.
    
    >> This sounds vaguely similar to what I found with WAL files being
    >> held open for days after they were deleted by read-only backends:
    >> http://archives.postgresql.org/message-id/15412.1259630304@sss.pgh.pa.us
    >> I mention it only because there might be one place to fix both....
    
    > Hmm interesting.  I don't think the placement suggested by Tom would be
    > useful, because the Zabbix backends are particularly busy all the time,
    > so they wouldn't run ProcessCatchupEvent at all.
    
    Yeah, I wasn't that thrilled with the suggestion either.  But we can't
    just have backends constantly closing every open FD they hold, or
    performance will suffer.  I don't see any very good place to do this...
    
    			regards, tom lane
    
    
  9. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-06T16:17:10Z

    Excerpts from Tom Lane's message of lun jun 06 12:10:24 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    
    > > Hmm interesting.  I don't think the placement suggested by Tom would be
    > > useful, because the Zabbix backends are particularly busy all the time,
    > > so they wouldn't run ProcessCatchupEvent at all.
    > 
    > Yeah, I wasn't that thrilled with the suggestion either.  But we can't
    > just have backends constantly closing every open FD they hold, or
    > performance will suffer.  I don't see any very good place to do this...
    
    How about doing something on an sinval message for pg_database?
    That doesn't solve the WAL problem Kevin found, of course ...
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  10. Re: Postmaster holding unlinked files for pg_largeobject table

    Kevin Grittner <kevin.grittner@wicourts.gov> — 2011-06-06T16:23:52Z

    Alvaro Herrera <alvherre@commandprompt.com> wrote:
     
    > That doesn't solve the WAL problem Kevin found, of course ...
     
    I wouldn't worry about that too much -- the WAL issue is
    self-limiting and not likely to ever cause a failure.  The biggest
    risk is that every now and then some new individual will notice it
    and waste a bit of time investigating.  The LO issue, on the other
    hand, could easily eat enough disk to cause a failure.
     
    -Kevin
    
    
  11. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-06T16:30:25Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Excerpts from Tom Lane's message of lun jun 06 12:10:24 -0400 2011:
    >> Yeah, I wasn't that thrilled with the suggestion either.  But we can't
    >> just have backends constantly closing every open FD they hold, or
    >> performance will suffer.  I don't see any very good place to do this...
    
    > How about doing something on an sinval message for pg_database?
    > That doesn't solve the WAL problem Kevin found, of course ...
    
    Hmm ... that would help for the specific scenario of dropped databases,
    but we've also heard complaints about narrower cases such as a single
    dropped table.
    
    A bigger issue is that I don't believe it's very practical to scan the
    FD array looking for files associated with a particular database (or
    table).  They aren't labeled that way, and parsing the file path to
    find out the info seems pretty grotty.
    
    On reflection I think this behavior is probably limited to the case
    where we've done what we used to call a "blind write" of a block that
    is unrelated to our database or tables.  For normal SQL-driven accesses,
    there's a relcache entry, and flushing of that entry will lead to
    closure of associated files.  I wonder whether we should go back to
    forcibly closing the FD after a blind write.  This would suck if a
    backend had to do many dirty-buffer flushes for the same relation,
    but hopefully the bgwriter is doing most of those.  We'd want to make
    sure such forced closure *doesn't* occur in the bgwriter.  (If memory
    serves, it has a checkpoint-driven closure mechanism instead.)
    
    			regards, tom lane
    
    
  12. Re: Postmaster holding unlinked files for pg_largeobject table

    Robert Haas <robertmhaas@gmail.com> — 2011-06-06T16:43:45Z

    On Mon, Jun 6, 2011 at 12:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    >> Excerpts from Tom Lane's message of lun jun 06 12:10:24 -0400 2011:
    >>> Yeah, I wasn't that thrilled with the suggestion either.  But we can't
    >>> just have backends constantly closing every open FD they hold, or
    >>> performance will suffer.  I don't see any very good place to do this...
    >
    >> How about doing something on an sinval message for pg_database?
    >> That doesn't solve the WAL problem Kevin found, of course ...
    >
    > Hmm ... that would help for the specific scenario of dropped databases,
    > but we've also heard complaints about narrower cases such as a single
    > dropped table.
    >
    > A bigger issue is that I don't believe it's very practical to scan the
    > FD array looking for files associated with a particular database (or
    > table).  They aren't labeled that way, and parsing the file path to
    > find out the info seems pretty grotty.
    >
    > On reflection I think this behavior is probably limited to the case
    > where we've done what we used to call a "blind write" of a block that
    > is unrelated to our database or tables.  For normal SQL-driven accesses,
    > there's a relcache entry, and flushing of that entry will lead to
    > closure of associated files.  I wonder whether we should go back to
    > forcibly closing the FD after a blind write.  This would suck if a
    > backend had to do many dirty-buffer flushes for the same relation,
    > but hopefully the bgwriter is doing most of those.  We'd want to make
    > sure such forced closure *doesn't* occur in the bgwriter.  (If memory
    > serves, it has a checkpoint-driven closure mechanism instead.)
    
    Instead of closing them immediately, how about flagging the FD and
    closing all the flagged FDs at the end of each query, or something
    like that?
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  13. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-06T16:49:46Z

    Robert Haas <robertmhaas@gmail.com> writes:
    > On Mon, Jun 6, 2011 at 12:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >> On reflection I think this behavior is probably limited to the case
    >> where we've done what we used to call a "blind write" of a block that
    >> is unrelated to our database or tables. For normal SQL-driven accesses,
    >> there's a relcache entry, and flushing of that entry will lead to
    >> closure of associated files. I wonder whether we should go back to
    >> forcibly closing the FD after a blind write. This would suck if a
    >> backend had to do many dirty-buffer flushes for the same relation,
    >> but hopefully the bgwriter is doing most of those. We'd want to make
    >> sure such forced closure *doesn't* occur in the bgwriter. (If memory
    >> serves, it has a checkpoint-driven closure mechanism instead.)
    
    > Instead of closing them immediately, how about flagging the FD and
    > closing all the flagged FDs at the end of each query, or something
    > like that?
    
    Hmm, there's already a mechanism for closing "temp" FDs at the end of a
    query ... maybe blind writes could use temp-like FDs?
    
    			regards, tom lane
    
    
  14. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-06T17:04:41Z

    Excerpts from Tom Lane's message of lun jun 06 12:49:46 -0400 2011:
    > Robert Haas <robertmhaas@gmail.com> writes:
    > > On Mon, Jun 6, 2011 at 12:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > >> On reflection I think this behavior is probably limited to the case
    > >> where we've done what we used to call a "blind write" of a block that
    > >> is unrelated to our database or tables. For normal SQL-driven accesses,
    > >> there's a relcache entry, and flushing of that entry will lead to
    > >> closure of associated files. I wonder whether we should go back to
    > >> forcibly closing the FD after a blind write. This would suck if a
    > >> backend had to do many dirty-buffer flushes for the same relation,
    > >> but hopefully the bgwriter is doing most of those. We'd want to make
    > >> sure such forced closure *doesn't* occur in the bgwriter. (If memory
    > >> serves, it has a checkpoint-driven closure mechanism instead.)
    > 
    > > Instead of closing them immediately, how about flagging the FD and
    > > closing all the flagged FDs at the end of each query, or something
    > > like that?
    > 
    > Hmm, there's already a mechanism for closing "temp" FDs at the end of a
    > query ... maybe blind writes could use temp-like FDs?
    
    OK, I'll have a look at how blind writes work this afternoon and propose
    something.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  15. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-06T21:59:31Z

    Excerpts from Tom Lane's message of lun jun 06 12:49:46 -0400 2011:
    > Robert Haas <robertmhaas@gmail.com> writes:
    
    > > Instead of closing them immediately, how about flagging the FD and
    > > closing all the flagged FDs at the end of each query, or something
    > > like that?
    > 
    > Hmm, there's already a mechanism for closing "temp" FDs at the end of a
    > query ... maybe blind writes could use temp-like FDs?
    
    I don't think it can be made to work exactly like that.  If I understand
    correctly, the code involved here is the FlushBuffer() call that happens
    during BufferAlloc(), and what we have at that point is a SMgrRelation;
    we're several levels removed from actually being able to set the
    FD_XACT_TEMPORARY flag which is what I think you're thinking of.
    
    What I think would make some sense is to keep a list of SMgrRelations
    that we opened during FlushBuffer that are foreign to the current
    database, in the current ResourceOwner.  That way, when the resowner is
    destroyed, we would be able to smgrclose() them.  This would only be
    done when called by a backend, not bgwriter.
    
    (I'm not really sure about requiring the buffer to belong to a relation
    in another database, given the report that this is also a problem with
    dropped tables.  However, it certainly makes no sense to try to remember
    *all* buffers.)
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  16. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-07T16:26:34Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Excerpts from Tom Lane's message of lun jun 06 12:49:46 -0400 2011:
    >> Hmm, there's already a mechanism for closing "temp" FDs at the end of a
    >> query ... maybe blind writes could use temp-like FDs?
    
    > I don't think it can be made to work exactly like that.  If I understand
    > correctly, the code involved here is the FlushBuffer() call that happens
    > during BufferAlloc(), and what we have at that point is a SMgrRelation;
    > we're several levels removed from actually being able to set the
    > FD_XACT_TEMPORARY flag which is what I think you're thinking of.
    
    It's not *that* many levels: in fact, I think md.c is the only level
    that would just have to pass it through without doing anything useful.
    I think that working from there is a saner and more efficient approach
    than what you're sketching.
    
    If you want a concrete design sketch, consider this:
    
    1. Add a flag to the SMgrRelation struct that has the semantics of "all
    files opened through this SMgrRelation should be marked as transient,
    causing them to be automatically closed at end of xact".
    
    2. *Any* normal smgropen() call would reset this flag (since it suggests
    that we are accessing the relation because of SQL activity).  In the
    single case where FlushBuffer() is called with reln == NULL, it would
    set the flag after doing its local smgropen().
    
    3. Then, modify md.c to pass the flag down to fd.c whenever opening an
    FD file.  fd.c sets a bit in the resulting VFD.
    
    4. Extend CleanupTempFiles to close the kernel FD (but not release the
    VFD) when a VFD has the bit set.
    
    I'm fairly sure that CleanupTempFiles is never called in the bgwriter,
    so we don't even need any special hack to prevent the flag from becoming
    set in the bgwriter.
    
    			regards, tom lane
    
    
  17. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-08T17:52:22Z

    Excerpts from Tom Lane's message of mar jun 07 12:26:34 -0400 2011:
    
    > It's not *that* many levels: in fact, I think md.c is the only level
    > that would just have to pass it through without doing anything useful.
    > I think that working from there is a saner and more efficient approach
    > than what you're sketching.
    > 
    > If you want a concrete design sketch, consider this:
    
    Okay, here's a patch implementing this idea.  It seems to work quite
    well, and it solves the problem in a limited testing scenario -- I
    haven't yet tested on the customer machines.
    
    This customer is running on 8.4 so I started from there; should I
    backpatch this to 8.2, or not at all?  (I have all the branches ready
    anyway.)
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
  18. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-08T18:28:02Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Okay, here's a patch implementing this idea.  It seems to work quite
    > well, and it solves the problem in a limited testing scenario -- I
    > haven't yet tested on the customer machines.
    
    This seems mostly sane, except I think you have not considered the
    issue of when to clear the smgr_transient flag on an existing
    SMgrRelation: if it starts getting used for "normal" accesses after
    having by chance been used for a blind write, we don't want the
    transient marking to persist.  That's why I suggested having smgropen
    always clear it.
    
    Likewise, I think the FD_XACT_TRANSIENT flag on a VFD needs to go away
    at some point, probably once it's actually been closed at EOXACT, though
    there's doubtless more than one way to handle that.
    
    > This customer is running on 8.4 so I started from there; should I
    > backpatch this to 8.2, or not at all?
    
    I'm not excited about back-patching it...
    
    			regards, tom lane
    
    
  19. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-09T17:59:00Z

    Excerpts from Tom Lane's message of mié jun 08 14:28:02 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    > > Okay, here's a patch implementing this idea.  It seems to work quite
    > > well, and it solves the problem in a limited testing scenario -- I
    > > haven't yet tested on the customer machines.
    > 
    > This seems mostly sane, except I think you have not considered the
    > issue of when to clear the smgr_transient flag on an existing
    > SMgrRelation: if it starts getting used for "normal" accesses after
    > having by chance been used for a blind write, we don't want the
    > transient marking to persist.  That's why I suggested having smgropen
    > always clear it.
    > 
    > Likewise, I think the FD_XACT_TRANSIENT flag on a VFD needs to go away
    > at some point, probably once it's actually been closed at EOXACT, though
    > there's doubtless more than one way to handle that.
    
    Aha, I see -- makes sense.  Here's an updated patch.
    
    > > This customer is running on 8.4 so I started from there; should I
    > > backpatch this to 8.2, or not at all?
    > 
    > I'm not excited about back-patching it...
    
    Bummer.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
  20. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-09T18:45:31Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Excerpts from Tom Lane's message of mi jun 08 14:28:02 -0400 2011:
    >> Alvaro Herrera <alvherre@commandprompt.com> writes:
    >>> This customer is running on 8.4 so I started from there; should I
    >>> backpatch this to 8.2, or not at all?
    
    >> I'm not excited about back-patching it...
    
    > Bummer.
    
    Well, of course mine is only one opinion; anybody else feel this *is*
    worth risking a back-patch for?
    
    My thought is that it needs some beta testing.  Perhaps it'd be sane to
    push it into beta2 now, and then back-patch sometime after 9.1 final,
    if no problems pop up.
    
    			regards, tom lane
    
    
  21. Re: Postmaster holding unlinked files for pg_largeobject table

    Robert Haas <robertmhaas@gmail.com> — 2011-06-09T18:49:23Z

    On Thu, Jun 9, 2011 at 2:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    >> Excerpts from Tom Lane's message of mié jun 08 14:28:02 -0400 2011:
    >>> Alvaro Herrera <alvherre@commandprompt.com> writes:
    >>>> This customer is running on 8.4 so I started from there; should I
    >>>> backpatch this to 8.2, or not at all?
    >
    >>> I'm not excited about back-patching it...
    >
    >> Bummer.
    >
    > Well, of course mine is only one opinion; anybody else feel this *is*
    > worth risking a back-patch for?
    >
    > My thought is that it needs some beta testing.  Perhaps it'd be sane to
    > push it into beta2 now, and then back-patch sometime after 9.1 final,
    > if no problems pop up.
    
    I think it'd be sensible to back-patch it.  I'm not sure whether now
    or later.  It's a bug fix that is biting real users in the field, so
    it seems like we ought to do something about it.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  22. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-09T19:02:33Z

    Robert Haas <robertmhaas@gmail.com> writes:
    > On Thu, Jun 9, 2011 at 2:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >> My thought is that it needs some beta testing. Perhaps it'd be sane to
    >> push it into beta2 now, and then back-patch sometime after 9.1 final,
    >> if no problems pop up.
    
    > I think it'd be sensible to back-patch it.  I'm not sure whether now
    > or later.  It's a bug fix that is biting real users in the field, so
    > it seems like we ought to do something about it.
    
    I don't deny it's a bug fix; I'm just dubious about the risk-reward
    ratio.  As to risk: the patch isn't trivial (notice Alvaro didn't get it
    right the first time).  As to reward: it's been like that since forever,
    so if the problem were really serious, we'd have identified it before.
    
    Letting it age a bit during beta would do a lot to damp down the risk
    side of the equation.
    
    			regards, tom lane
    
    
  23. Re: Postmaster holding unlinked files for pg_largeobject table

    Robert Haas <robertmhaas@gmail.com> — 2011-06-09T19:03:26Z

    On Thu, Jun 9, 2011 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    > Robert Haas <robertmhaas@gmail.com> writes:
    >> On Thu, Jun 9, 2011 at 2:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
    >>> My thought is that it needs some beta testing.  Perhaps it'd be sane to
    >>> push it into beta2 now, and then back-patch sometime after 9.1 final,
    >>> if no problems pop up.
    >
    >> I think it'd be sensible to back-patch it.  I'm not sure whether now
    >> or later.  It's a bug fix that is biting real users in the field, so
    >> it seems like we ought to do something about it.
    >
    > I don't deny it's a bug fix; I'm just dubious about the risk-reward
    > ratio.  As to risk: the patch isn't trivial (notice Alvaro didn't get it
    > right the first time).  As to reward: it's been like that since forever,
    > so if the problem were really serious, we'd have identified it before.
    >
    > Letting it age a bit during beta would do a lot to damp down the risk
    > side of the equation.
    
    OK by me.
    
    -- 
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
    
    
  24. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-09T20:02:14Z

    Excerpts from Tom Lane's message of jue jun 09 14:45:31 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    > > Excerpts from Tom Lane's message of mié jun 08 14:28:02 -0400 2011:
    > >> Alvaro Herrera <alvherre@commandprompt.com> writes:
    > >>> This customer is running on 8.4 so I started from there; should I
    > >>> backpatch this to 8.2, or not at all?
    > 
    > >> I'm not excited about back-patching it...
    > 
    > > Bummer.
    > 
    > Well, of course mine is only one opinion; anybody else feel this *is*
    > worth risking a back-patch for?
    > 
    > My thought is that it needs some beta testing.  Perhaps it'd be sane to
    > push it into beta2 now, and then back-patch sometime after 9.1 final,
    > if no problems pop up.
    
    FWIW I was about to push it but noticed that regression tests fail with
    this:
    
    TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((indexRelation)->rd_id)))", File: "/pgsql/source/HEAD/src/backend/access/index/indexam.c", Line: 283)
    
    I just make distclean'd -- still there.  I'm gonna revert my patch and
    retry.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  25. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-09T20:34:13Z

    Excerpts from Alvaro Herrera's message of jue jun 09 16:02:14 -0400 2011:
    > Excerpts from Tom Lane's message of jue jun 09 14:45:31 -0400 2011:
    
    > > My thought is that it needs some beta testing.  Perhaps it'd be sane to
    > > push it into beta2 now, and then back-patch sometime after 9.1 final,
    > > if no problems pop up.
    
    > FWIW I was about to push it but noticed that regression tests fail with
    > this:
    > 
    > TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((indexRelation)->rd_id)))", File: "/pgsql/source/HEAD/src/backend/access/index/indexam.c", Line: 283)
    > 
    > I just make distclean'd -- still there.  I'm gonna revert my patch and
    > retry.
    
    That was pretty weird.  I had rm'd the build tree and rebuilt, failure
    still there.  I then did a git reset FETCH_HEAD, recompiled, and the
    problem was gone.  git reset to my revision, failed.  Then git clean
    -dfx, nuked the build tree, redid the whole thing from scratch, no
    problem.  I guess there was a mismatch on the files that we build in the
    source tree.
    
    I have pushed it now.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  26. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-09T20:41:03Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > That was pretty weird.  I had rm'd the build tree and rebuilt, failure
    > still there.  I then did a git reset FETCH_HEAD, recompiled, and the
    > problem was gone.  git reset to my revision, failed.  Then git clean
    > -dfx, nuked the build tree, redid the whole thing from scratch, no
    > problem.  I guess there was a mismatch on the files that we build in the
    > source tree.
    
    git bug?  I'd expect exactly that failure if you had the test changes
    but not the source fixes from commit dccfb72892acabd25568539ec882cc44c57c25bd.
    
    			regards, tom lane
    
    
  27. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-09T20:55:41Z

    Excerpts from Alvaro Herrera's message of jue jun 09 16:34:13 -0400 2011:
    
    > I have pushed it now.
    
    ... and it caused a failure on the buildfarm, so I panicked and reverted
    it.  I think the patch below fixes it.  Let me know if you think I
    should push the whole thing again.
    
    *** a/src/backend/storage/file/fd.c
    --- b/src/backend/storage/file/fd.c
    ***************
    *** 1045,1070 **** FileSetTransient(File file)
      }
      
      /*
    -  * Close a file at the kernel level, but keep the VFD open
    -  */
    - static void
    - FileKernelClose(File file)
    - {
    - 	Vfd		  *vfdP;
    - 
    - 	Assert(FileIsValid(file));
    - 
    - 	vfdP = &VfdCache[file];
    - 
    - 	if (!FileIsNotOpen(file))
    - 	{
    - 		if (close(vfdP->fd))
    - 			elog(ERROR, "could not close file \"%s\": %m", vfdP->fileName);
    - 		vfdP->fd = VFD_CLOSED;
    - 	}
    - }
    - 
    - /*
       * close a file when done with it
       */
      void
    --- 1045,1050 ----
    ***************
    *** 1892,1903 **** CleanupTempFiles(bool isProcExit)
      				else if (fdstate & FD_XACT_TRANSIENT)
      				{
      					/*
    ! 					 * Close the kernel file descriptor, but also remove the
    ! 					 * flag from the VFD.  This is to ensure that if the VFD is
    ! 					 * reused in the future for non-transient access, we don't
    ! 					 * close it inappropriately then.
      					 */
    ! 					FileKernelClose(i);
      					VfdCache[i].fdstate &= ~FD_XACT_TRANSIENT;
      				}
      			}
    --- 1872,1884 ----
      				else if (fdstate & FD_XACT_TRANSIENT)
      				{
      					/*
    ! 					 * Close the FD, and remove the entry from the LRU ring,
    ! 					 * but also remove the flag from the VFD.  This is to
    ! 					 * ensure that if the VFD is reused in the future for
    ! 					 * non-transient access, we don't close it inappropriately
    ! 					 * then.
      					 */
    ! 					LruDelete(i);
      					VfdCache[i].fdstate &= ~FD_XACT_TRANSIENT;
      				}
      			}
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
    
    
  28. Re: Postmaster holding unlinked files for pg_largeobject table

    Tom Lane <tgl@sss.pgh.pa.us> — 2011-06-09T21:10:10Z

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    > Excerpts from Alvaro Herrera's message of jue jun 09 16:34:13 -0400 2011:
    >> I have pushed it now.
    
    > ... and it caused a failure on the buildfarm, so I panicked and reverted
    > it.  I think the patch below fixes it.
    
    Actually, I think what you want is what closeAllVfds does, which
    suggests that you need a FileIsNotOpen test too.
    
    > Let me know if you think I should push the whole thing again.
    
    Yesterday I would have said okay, but now we're too close to the beta2
    wrap deadline.  Please hold off until beta2 is out.
    
    			regards, tom lane
    
    
  29. Re: Postmaster holding unlinked files for pg_largeobject table

    Alvaro Herrera <alvherre@commandprompt.com> — 2011-06-09T21:43:27Z

    Excerpts from Tom Lane's message of jue jun 09 17:10:10 -0400 2011:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    > > Excerpts from Alvaro Herrera's message of jue jun 09 16:34:13 -0400 2011:
    > >> I have pushed it now.
    > 
    > > ... and it caused a failure on the buildfarm, so I panicked and reverted
    > > it.  I think the patch below fixes it.
    > 
    > Actually, I think what you want is what closeAllVfds does, which
    > suggests that you need a FileIsNotOpen test too.
    
    Yeah, I noticed that my proposed change doesn't solve the problem.
    
    > > Let me know if you think I should push the whole thing again.
    > 
    > Yesterday I would have said okay, but now we're too close to the beta2
    > wrap deadline.  Please hold off until beta2 is out.
    
    Will do.
    
    -- 
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support