Re: random isolation test failures

Noah Misch <noah@leadboat.com>

From: Noah Misch <noah@leadboat.com>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Andrew Dunstan <andrew@dunslane.net>, Kevin Grittner <Kevin.Grittner@wicourts.gov>, Alvaro Herrera <alvherre@commandprompt.com>, PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2011-09-27T00:57:40Z
Lists: pgsql-hackers

On Mon, Sep 26, 2011 at 01:10:27PM -0400, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > We are seeing numerous occasional buildfarm failures of the fk-deadlock2 
> > isolation test,
> 
> Yeah, I complained about this already, but Kevin disclaims all
> responsibility for the fk isolation tests.  It looks like Alvaro
> and Noah Misch are the people to be harassing.

Yep; I took advantage of Kevin's test harness for some unrelated tests.

These sporadic failures happen whenever the test case takes longer than
deadlock_timeout (currently 100ms for these tests) to set up the deadlock.  I
outlined some mitigating strategies here:
http://archives.postgresql.org/message-id/20110727171438.GE18910@tornado.leadboat.com
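
To illustrate the failure mode, here is a toy Python model. It is only a
sketch, not PostgreSQL code. It assumes two sessions (the names s1 and s2 are
hypothetical), that each session runs a single deadlock check
deadlock_timeout after it starts waiting, and that the expected test output
has s1 reporting the deadlock. The setup times are made-up sample values, not
buildfarm measurements.

```python
def deadlock_victim(setup_ms, deadlock_timeout_ms):
    """Return which session's deadlock check finds the cycle (toy model).

    s1 starts waiting at t=0. s2 starts waiting at t=setup_ms, which is the
    moment the deadlock cycle becomes complete. Each session checks for a
    deadlock once, deadlock_timeout_ms after it began waiting.
    """
    s1_check_time = deadlock_timeout_ms
    if s1_check_time >= setup_ms:
        # The cycle already exists when s1 checks, so s1 reports the deadlock.
        return "s1"
    # s1 checked too early and found nothing; s2's later check finds the cycle.
    return "s2"


def smallest_passing_timeout(setup_samples_ms, start_ms=100):
    """Double the timeout until every sampled run reports the deadlock in s1."""
    timeout_ms = start_ms
    while any(deadlock_victim(s, timeout_ms) != "s1" for s in setup_samples_ms):
        timeout_ms *= 2
    return timeout_ms


# Hypothetical setup times in milliseconds, for illustration only.
samples = [20, 60, 130, 250]
print(deadlock_victim(60, 100))
print(deadlock_victim(130, 100))
print(smallest_passing_timeout(samples))
```

In this model a run whose setup time exceeds the timeout reports the deadlock
in the other session, so its output no longer matches the expected file.
Doubling the timeout only helps once it exceeds the slowest setup time seen
on any machine.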

I'd vote for #1: let's double the deadlock_timeout until the failures stop.
Other opinions?

Thanks,
nm