Thread

Re: Use BumpContext contexts for TupleHashTables' tablecxt

Jeff Davis <pgsql@j-davis.com> — 2025-10-30T00:08:47Z
On Sun, 2025-10-26 at 18:11 -0400, Tom Lane wrote:
> Related to this, while I was chasing Jeff's complaint I realized that
> the none-too-small simplehash table for this is getting made in the
> query's ExecutorState.  That's pretty awful from the standpoint of
> being able to blame memory consumption on the hash node.  I'm not
> sure though if we want to go so far as to make another context just
> for the simplehash table.  We could keep it in that same "tablectx"
> at the price of destroying and rebuilding the simplehash table, not
> just resetting it, at each node rescan.  But that's not ideal either.

I had investigated the idea of destroying/rebuilding the simplehash
table regardless, because in some cases it can crowd out space for the
transition states, and we end up with a mostly-empty bucket array that
causes recursive spilling.

I'm not sure if that's a practical problem -- all the heuristics are
designed to avoid that situation -- but I can create the situation with
injection points.
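
As a rough illustration of the crowding effect described above, here is
a toy model. It is not PostgreSQL code: the function names, the flat
per-entry and per-state sizes, and the single memory limit are all
assumptions made for the sketch. It computes how many groups fit once
the bucket array has taken its share of the limit, and how full the
bucket array is when spilling begins.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not PostgreSQL code: given a memory limit, a bucket
 * array of nbuckets entries of entry_size bytes each, and a per-group
 * transition state of state_size bytes, return how many groups fit in
 * memory before a spill is needed.
 */
static size_t
groups_before_spill(size_t mem_limit, size_t nbuckets,
                    size_t entry_size, size_t state_size)
{
    size_t      bucket_bytes = nbuckets * entry_size;

    assert(state_size > 0);
    if (bucket_bytes >= mem_limit)
        return 0;               /* bucket array alone exhausts the limit */
    return (mem_limit - bucket_bytes) / state_size;
}

/*
 * Fraction of buckets in use when the spill starts, capped at 1.0
 * (the model does not grow the bucket array).
 */
static double
fill_at_spill(size_t mem_limit, size_t nbuckets,
              size_t entry_size, size_t state_size)
{
    size_t      ngroups;

    assert(nbuckets > 0);
    ngroups = groups_before_spill(mem_limit, nbuckets,
                                  entry_size, state_size);
    if (ngroups > nbuckets)
        ngroups = nbuckets;
    return (double) ngroups / (double) nbuckets;
}
```

With made-up inputs of a 4MB limit, 131072 buckets of 24 bytes each, and
1kB transition states, the bucket array takes 3MB, which leaves room for
1024 groups, so fewer than 1% of the buckets are in use when the spill
starts.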

Also, in the case of spilled grouping sets, once one group is
completely done with all spilled data it would be good to destroy that
hash table. By adjusting the order we process the spilled data, we
could minimize the creation/destruction of the bucket array.

However, rebuilding it has a cost, so we should only do that when there
is really a problem. I concluded that it would require significant
refactoring to make that work well, and I haven't gotten around to it
yet.

Regards,
	Jeff Davis