RE: [HACKERS] vacuum process size
From: "Hiroshi Inoue" <Inoue@tpf.co.jp>
To: "Tom Lane" <tgl@sss.pgh.pa.us>, <t-ishii@sra.co.jp>
Cc: "Mike Mascari" <mascarim@yahoo.com>, <pgsql-hackers@postgreSQL.org>
Date: 1999-08-25T01:11:42Z
Lists: pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Wednesday, August 25, 1999 1:20 AM
> To: t-ishii@sra.co.jp
> Cc: Mike Mascari; Hiroshi Inoue; pgsql-hackers@postgreSQL.org
> Subject: Re: [HACKERS] vacuum process size
>
>
> I have been looking some more at the vacuum-process-size issue, and
> I am having a hard time understanding why the VPageList data structure
> is the critical one. As far as I can see, there should be at most one
> pointer in it for each disk page of the relation. OK, you were
> vacuuming a table with something like a quarter million pages, so
> the end size of the VPageList would have been something like a megabyte,
> and given the inefficient usage of repalloc() in the original code,
> a lot more space than that would have been wasted as the list grew.
> So doubling the array size at each step is a good change.
>
> But there are a lot more tuples than pages in most relations.
>
> I see two lists with per-tuple data in vacuum.c, "vtlinks" in
> vc_scanheap and "vtmove" in vc_rpfheap, that are both being grown with
> essentially the same technique of repalloc() after every N entries.
> I'm not entirely clear on how many tuples get put into each of these
> lists, but it sure seems like in ordinary circumstances they'd be much
> bigger space hogs than any of the three VPageList lists.
>

AFAIK, both vtlinks and vtmove are NULL if vacuum is executed
without concurrent transactions.
They won't be so big unless loooong concurrent transactions exist.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
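
For readers following the thread, here is a minimal sketch of the two growth strategies discussed in the quoted message: enlarging a pointer array by a fixed step every N entries versus doubling it. It is not the code from vacuum.c. The PtrList type and the append_fixed, append_doubling, and ptrlist_free functions are hypothetical names made up for illustration, and plain realloc() stands in for repalloc(), whose behavior inside PostgreSQL memory contexts is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable pointer list; NOT the real VPageList from vacuum.c. */
typedef struct
{
    void  **items;
    size_t  count;      /* entries in use */
    size_t  capacity;   /* entries allocated */
    size_t  reallocs;   /* number of successful realloc() calls so far */
} PtrList;

/* Resize the array to ncap slots; returns 0 on success, -1 on failure. */
static int ptrlist_resize(PtrList *l, size_t ncap)
{
    void **n = realloc(l->items, ncap * sizeof(void *));

    if (n == NULL)
        return -1;
    l->items = n;
    l->capacity = ncap;
    l->reallocs++;
    return 0;
}

/* Fixed-step growth: add `step` slots each time the array fills up. */
static int append_fixed(PtrList *l, void *p, size_t step)
{
    assert(step > 0);
    if (l->count == l->capacity &&
        ptrlist_resize(l, l->capacity + step) != 0)
        return -1;
    l->items[l->count++] = p;
    return 0;
}

/* Doubling growth: start at `initial` slots, then double when full. */
static int append_doubling(PtrList *l, void *p, size_t initial)
{
    assert(initial > 0);
    if (l->count == l->capacity &&
        ptrlist_resize(l, l->capacity ? l->capacity * 2 : initial) != 0)
        return -1;
    l->items[l->count++] = p;
    return 0;
}

/* Release the array and reset the list to its empty state. */
static void ptrlist_free(PtrList *l)
{
    free(l->items);
    l->items = NULL;
    l->count = l->capacity = l->reallocs = 0;
}
```

Starting from a zero-initialized PtrList and appending 250,000 entries (roughly the page count mentioned above), a step of 100 takes 2,500 reallocations, while doubling from an initial size of 100 takes 13. The step and initial size of 100 are assumed values, since the thread does not state N. The sketch only counts calls; how much space each strategy actually wastes depends on the allocator underneath.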