Re: BUG #17619: AllocSizeIsValid violation in parallel hash join

Dmytro Astapov <dastapov@gmail.com>
From: Dmitry Astapov <dastapov@gmail.com>
To: Thomas Munro <thomas.munro@gmail.com>
Cc: pgsql-bugs@lists.postgresql.org
Date: 2022-09-22T12:51:27Z
Lists: pgsql-bugs
On Thu, Sep 22, 2022 at 9:44 AM Thomas Munro <thomas.munro@gmail.com> wrote:

> On Thu, Sep 22, 2022 at 7:46 PM PG Bug reporting form
> <noreply@postgresql.org> wrote:
> > (gdb) p size
> > $2 = 1702125924
>
> Thanks for the detailed report.  Hmm.  That size, on a little-endian
> system, is equivalent to the byte sequence "date\0\0\0\0", which looks
> pretty suspiciously like the inside of a tuple, and not its size.  We
> must have got out of sync somehow.
>

Great catch! I added some debugging output to the BufFileRead and
sts_read_tuple and it looks to me that the sequence of events immediately
before the problem happens is like this:


   1. sts_read_tuple is called multiple times to read from a particular
   file, and a bunch of tuples are read this way
   2. sts_read_tuple is called once more. In my experiments, I always had
   read_ntuples_available - 2 == read_ntuples at this point, but maybe this is
   just a coincidence.
      1.  inside this sts_read_tuple_call, BufFileRead is called to read
      metadata, and succeeds (and this chunk of the file looks -- to
my naive eye
      -- exactly like any other tuple start)
      2. Second BufFileRead call happens to get the size, and reads the
      correct size (I see what's in the file, I could find corresponding values
      in the database tables, and it all lines up).
      3. Third BufFileRead happens to read the actual tuple. So far so
      good, and sts_read_tuple finishes.
   3. At this point, the position in the file (buf_file -> currOffset +
   post) points right after the last tuple read, but what's there does not
   look like the start of the next tuple to me. It seems like there was
   another, longer tuple written there before (with quite long rhs.payload
   value) and somehow the tuple that was just read was written over it, so we
   have "<tuple read on step 2 above><current read position here><what looks
   like leftovers from another tuple><a bunch of zeroes><new tuple start>"
   4. So another sts_read_tuple call happens, and it reads 4 bytes of this
   garbage as metadata, and the next 4 bytes as length, and this is when
   everything derails...

I also double-checked at the logs of the (test) database where this issue
was first spotted (and where a lot of queries similar to the one I am using
to trigger this issue are happening) and found out that along with "invalid
memory alloc request size" we also got a couple of occasions of "Unexpected
chunk in shared tuplestore temporary file. Expected overflow chunk", which
looks like another error that could be potentially explained by
BufFileRead's inside sts_read_tuple reading "garbage".

I'm happy to build a patched version of 13.8 (or another revision if you
want me to) to collect more debug information for you if this could help.

-- 
D. Astapov