Thread

  1. Re: Add the ability to limit the amount of memory that can be allocated to backends.

    Tomas Vondra <tomas@vondra.me> — 2025-12-27T14:52:21Z

    Hi,
    
    Let me bump this dormant thread after about a year. I don't have any
    patch to share or anything like that, but I happened to read two quite
    interesting papers relevant to the topic of setting a per-query memory
    limit, and distributing the memory between operators in a query plan:
    
    1) Exploiting pipeline interruptions for efficient memory allocation
       Authors: Josep Aguilar-Saborit, Mohammad Jalali, Dave Sharpe, Victor
       Muntés MuleroAuthors Info & Claims
       Published: 2008
       PDF: https://dl.acm.org/doi/epdf/10.1145/1458082.1458169
    
    2) Saving Private Hash Join
       Authors: Laurens Kuiper, Paul Groß, Peter Boncz, Hannes Mühleisen
       Published: 2024-2025
       PDF: https://www.vldb.org/pvldb/vol18/p2748-kuiper.pdf
    
    I read the (2) paper first, expecting to learn some new stuff about hash
    joins, but to my surprise it's focusing on how to distribute memory
    between multiple hash joins in a single query. The hash joins serve
    mostly as an example of an actual/common operator in a query.
    
    The (1) paper establishes the theoretical framework and algorithms, and
    (2) presents a more precise/complete model for hash joins.
    
    I think both papers are worth reading, and the framework seems quite
    helpful. Even if there some parts may need tweaks, as Postgres does
    certain things differently for whatever reason.
    
    I'm not going to explain all the details (the papers are not that long
    anyway), but the proposed approach combines a couple basic parts:
    
    
    1) leverage "pipeline interruption" operations
    
    Some operators materialize the intermediate results. This splits the
    plan into parts that don't need memory at the same time. It's enough to
    enforce the limit for these parts, not for the whole plan. Which means
    the optimization problems are smaller, and the operators can get more
    memory than when assuming the whole query runs at the same time.
    
    Of course, the reality is more complicated. Some operators are only
    partial pipeline interruptions (e.g. hashjoin breaks for the inner
    subplan, not the outer).
    
    And Postgres currently delays the Hash build until the first outer
    tuple, unlike DuckDB used in the (2) paper.
    
    
    2) cost model for memory distribution
    
    The memory is distributed between operators based on a simple cost
    model, instead of using a simple scheme where operators get 1/N of
    memory. The (2) paper presents a hash join cost model combining
    "materialization cost" and "throughput".
    
    But it's a pretty generic approach, and should not be too difficult to
    do for other operators. Most operators don't even need to allocate that
    much memory, so the cost model can ignore those, I think. Only nodes
    that "break the pipeline" would matter.
    
    The important part is that this is a second optimization phase. The
    optimizer picks a query plan (assuming some default memory sizes), and
    then the cost model determines how to distribute memory within that
    particular plan.
    
    The paper does that repeatedly during execution, but I suspect these
    adjustments are easier in DuckDB. In Postgres we'd probably do that only
    once right after planning? Or maybe not, not sure.
    
    
    The (1) paper does things a bit differently, so that it works on DB2 (so
    less dynamic), but it also focuses on plans with hash joins etc. The
    general framework seems to be mostly the same.
    
    
    Anyway, that's all I have. I don't expect to work on this anytime soon,
    but I found those papers interesting.
    
    
    regards
    
    -- 
    Tomas Vondra