[HACKERS] spinlock freeze ?(Re: INSERT/UPDATE waiting (another example))

Hiroshi Inoue <inoue@tpf.co.jp>

From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, "Wayne Piekarski" <wayne@senet.com.au>
Cc: <pgsql-hackers@postgreSQL.org>
Date: 1999-05-13T10:28:49Z
Lists: pgsql-hackers
Hello all,

>
> Hi everyone!
>
>
> Tom Lane Writes:
>
> > Wayne Piekarski <wayne@senet.com.au> writes:
> > > Currently, I start up postmaster with -B 192, which I guess puts it below
> > > the value of 256 which causes problems. Apart from when I got past 256
> > > buffers, does the patch fix anything else that might be causing problems?
> >

[snip]

>
> Then another one after restarting everything:
>
> ERROR:  cannot open segment 1 of relation sessions_done_id_index
>

I got the same error in my test cases.
I don't understand the cause of this error.

However, I seem to have found a different problem.

    The io_in_progress_lock spinlock of a buffer page is not
    released by the cleanup operations that elog() triggers, such as
    ProcReleaseSpins(), ResetBufferPool(), etc.

    For example, the error we encountered probably occurred
    in ReadBufferWithBufferLock().
    When elog(ERROR/FATAL) is raised in smgrread() or smgrextend(),
    which are called from ReadBufferWithBufferLock(), the
    io_in_progress_lock spinlock of the page is not released.
    If other transactions then get that page as a free buffer page, they
    wait for the io_in_progress_lock spinlock to be released
    and eventually abort with a message such as

> FATAL: s_lock(1800d37c) at bufmgr.c:657, stuck spinlock. Aborting.
>
> FATAL: s_lock(1800d37c) at bufmgr.c:657, stuck spinlock. Aborting.
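
To make the scenario above concrete, here is a minimal single-process
sketch in C. It is only an illustration under stated assumptions, not
the actual bufmgr.c code: DemoBuffer, demo_lock_acquire(),
demo_smgr_read() and the other demo_* names are hypothetical stand-ins,
longjmp() plays the role of elog(ERROR), and a bounded spin count plays
the role of the "stuck spinlock" check.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real definitions. */
typedef struct {
    volatile int io_in_progress_lock;   /* 0 = free, 1 = held */
} DemoBuffer;

typedef enum { DEMO_OK, DEMO_ERROR, DEMO_STUCK } DemoResult;

#define DEMO_SPIN_LIMIT 1000

/* Jump target that plays the role of the error recovery point. */
static jmp_buf demo_error_context;

/* Try to take the lock; give up after a bounded number of spins. */
static bool demo_lock_acquire(DemoBuffer *buf)
{
    for (int spins = 0; spins < DEMO_SPIN_LIMIT; spins++) {
        if (buf->io_in_progress_lock == 0) {
            buf->io_in_progress_lock = 1;
            return true;
        }
    }
    return false;   /* the lock never became free: "stuck spinlock" */
}

static void demo_lock_release(DemoBuffer *buf)
{
    assert(buf->io_in_progress_lock == 1);
    buf->io_in_progress_lock = 0;
}

/* Storage read that may fail and jump away without returning. */
static void demo_smgr_read(bool fail)
{
    if (fail)
        longjmp(demo_error_context, 1);
}

/*
 * Read a page while holding the I/O lock. When cleanup_on_error is
 * false, the error path leaves the lock held, which is the suspected bug.
 */
static DemoResult demo_read_buffer(DemoBuffer *buf, bool fail,
                                   bool cleanup_on_error)
{
    if (!demo_lock_acquire(buf))
        return DEMO_STUCK;

    if (setjmp(demo_error_context) != 0) {
        /* We arrive here after the simulated elog(ERROR). */
        if (cleanup_on_error)
            demo_lock_release(buf);
        return DEMO_ERROR;
    }

    demo_smgr_read(fail);
    demo_lock_release(buf);
    return DEMO_OK;
}
```

With cleanup_on_error set to false, a failed demo_read_buffer() call
leaves io_in_progress_lock at 1, and the next call on the same buffer
returns DEMO_STUCK. With cleanup_on_error set to true, the next call
succeeds. Real backends are separate processes spinning on shared
memory, so this sequential version only shows the ordering of events.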

Comments?

I don't know the details of the spinlock code.
Sorry if I am off the mark.

And I have another question.

It seems elog(FATAL) doesn't release allocated buffer pages.
Is that OK?
As far as I can see, elog(FATAL) causes proc_exit(0), and proc_exit()
doesn't call ResetBufferPool().
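
As a rough illustration of this question, the sketch below models an
exit path that only runs the cleanup hooks registered with it. This is
an assumption-laden toy, not PostgreSQL's proc_exit(): demo_on_exit(),
demo_proc_exit(), demo_reset_buffer_pool() and demo_pin_count are all
hypothetical names, and the toy exit routine returns instead of
terminating so that the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_EXIT_HOOKS 8
#define DEMO_NBUFFERS 4

typedef void (*DemoExitHook)(void);

/* Hypothetical per-backend state; not PostgreSQL's real data structures. */
static DemoExitHook demo_exit_hooks[DEMO_MAX_EXIT_HOOKS];
static size_t demo_exit_hook_count = 0;
static int demo_pin_count[DEMO_NBUFFERS];   /* buffer pins held by this backend */

/* Register a cleanup hook; returns 0 on success, -1 if the table is full. */
static int demo_on_exit(DemoExitHook hook)
{
    assert(hook != NULL);
    if (demo_exit_hook_count >= DEMO_MAX_EXIT_HOOKS)
        return -1;
    demo_exit_hooks[demo_exit_hook_count++] = hook;
    return 0;
}

/* Drop every pin still held (the role ResetBufferPool() has in the text). */
static void demo_reset_buffer_pool(void)
{
    for (size_t i = 0; i < DEMO_NBUFFERS; i++)
        demo_pin_count[i] = 0;
}

/* Toy exit path: run the registered hooks in reverse registration order. */
static void demo_proc_exit(void)
{
    while (demo_exit_hook_count > 0)
        demo_exit_hooks[--demo_exit_hook_count]();
}

/* Total number of pins this backend still holds. */
static int demo_total_pins(void)
{
    int total = 0;
    for (size_t i = 0; i < DEMO_NBUFFERS; i++)
        total += demo_pin_count[i];
    return total;
}
```

If demo_proc_exit() is called while no buffer cleanup hook is
registered, the pins in demo_pin_count stay as they were. After
demo_on_exit(demo_reset_buffer_pool), the same call drops them all.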

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp