Thread

  1. Re: Use BumpContext contexts for TupleHashTables' tablecxt

    Jeff Davis <pgsql@j-davis.com> — 2025-10-30T00:08:47Z

    On Sun, 2025-10-26 at 18:11 -0400, Tom Lane wrote:
    > Related to this, while I was chasing Jeff's complaint I realized that
    > the none-too-small simplehash table for this is getting made in the
    > query's ExecutorState.  That's pretty awful from the standpoint of
    > being able to blame memory consumption on the hash node.  I'm not
    > sure though if we want to go so far as to make another context just
    > for the simplehash table.  We could keep it in that same "tablectx"
    > at the price of destroying and rebuilding the simplehash table, not
    > just resetting it, at each node rescan.  But that's not ideal either.
    
    I had investigated the idea of destroying/rebuilding the simplehash
    table regardless because, in some cases, it can crowd out space for the
    transition states, and we end up with a mostly-empty bucket array that
    causes recursive spilling.
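    
    A rough sketch of that crowd-out effect, as plain arithmetic. Nothing
    here is PostgreSQL code: groups_that_fit(), its parameters, and the
    sizes used below are all hypothetical, and the model assumes one
    fixed-size transition state per group charged against a single
    memory limit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model (not PostgreSQL code): how many per-group transition
 * states still fit under mem_limit once the bucket array has been paid for.
 */
static size_t
groups_that_fit(size_t mem_limit, size_t nbuckets, size_t bucket_size,
                size_t state_size)
{
    size_t bucket_array_bytes;
    size_t nfit;

    assert(state_size > 0);

    bucket_array_bytes = nbuckets * bucket_size;
    if (bucket_array_bytes >= mem_limit)
        return 0;               /* the bucket array alone uses the budget */

    nfit = (mem_limit - bucket_array_bytes) / state_size;

    /* Each group occupies one bucket, so nbuckets is an upper bound. */
    return nfit < nbuckets ? nfit : nbuckets;
}
```

    With a 4MB limit, 24-byte buckets and 1kB states, a leftover array of
    131072 buckets leaves room for only 1024 groups (under 1% full) before
    the next spill, while a 4096-bucket array would leave room for 4000.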
    
    I'm not sure if that's a practical problem -- all the heuristics are
    designed to avoid that situation -- but I can create the situation with
    injection points.
    
    Also, in the case of spilled grouping sets, once one group is
    completely done with all spilled data it would be good to destroy that
    hash table. By adjusting the order in which we process the spilled
    data, we could minimize the creation/destruction of the bucket array.
    
    However, rebuilding it has a cost, so we should only do that when there
    is really a problem. I came to the conclusion it would require
    significant refactoring to make that work well, and I haven't gotten
    around to it yet.
    
    Regards,
    	Jeff Davis