Thread

  1. Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring

    Naga Appani <nagnrik@gmail.com> — 2025-11-05T01:13:09Z

    Thank you for the feedback, Tomas! I agree with the goal you outlined:
    providing a user-friendly “how much space is left” signal would make
    monitoring far more actionable.
    
    On Sat, Oct 18, 2025 at 6:18 AM Tomas Vondra <tomas@vondra.me> wrote:
    >
    > Knowing num_mxids / num_members or members_size is nice, but how would
    > I judge how far the system is from hitting some threshold or hard limit?
    > Is there some maximum number of mxids/members that we could return? Or
    > something like that?
    
    Based on this, I experimented with calculating a num_remaining_members value to
    estimate how close the system is to MultiXact member-space exhaustion. I tested
    two approaches and validated their behavior through repeated exhaustion cycles.
    The results are below.
    
    At the same time, both you and Ashutosh pointed out that oldest_offset exposes
    internal implementation details and is not particularly useful on its own, so I
    removed oldest_offset in v11.
    
    What I tried for remaining space
    ================================
    
    Approach 1: (offsetStopLimit - nextOffset)
    ------------------------------------------
    I exposed offsetStopLimit from GetMultiXactInfo() and computed:
    
        remainingMembers = offsetStopLimit - nextOffset;
    
    Behavior at exhaustion:
    
        postgres=# SELECT num_mxids,num_members,remaining_members
                   FROM pg_get_multixact_stats();
         num_mxids | num_members | remaining_members
        -----------+-------------+-------------------
         115409471 |  4294914940 |                 1
        (1 row)
    
    After wraparound cleanup:
    
        postgres=# SELECT num_mxids,num_members,remaining_members
                   FROM pg_get_multixact_stats();
         num_mxids | num_members | remaining_members
        -----------+-------------+-------------------
                 0 |           0 |                 0
        (1 row)
    
    The value stayed at 0 until roughly 100k new members had been allocated. My
    reading is that nextOffset wraps to a small value, while offsetStopLimit
    remains large (derived from the oldestOffset at the moment of truncation).
    Without the backend’s wrap-aware comparison logic (MultiXactOffsetWouldWrap()),
    plain subtraction crosses the wrap boundary and becomes misleading.
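
    To illustrate why plain subtraction is fragile here, a minimal Python sketch
    follows. It assumes MultiXactOffset is an unsigned 32-bit counter; the
    function names and sample values are hypothetical and are not taken from the
    patch or from multixact.c.

```python
# Sketch only: assumes offsets are unsigned 32-bit counters that wrap at 2**32.
OFFSET_MOD = 2**32


def naive_remaining(stop_limit: int, next_offset: int) -> int:
    """Plain subtraction clamped at zero; ignores wraparound entirely."""
    return max(stop_limit - next_offset, 0)


def wrap_aware_remaining(stop_limit: int, next_offset: int) -> int:
    """Forward distance from next_offset to stop_limit, modulo 2**32."""
    return (stop_limit - next_offset) % OFFSET_MOD


if __name__ == "__main__":
    # Hypothetical values on opposite sides of the wrap boundary.
    stop_limit, next_offset = 1000, 4294960000
    print(naive_remaining(stop_limit, next_offset))       # 0
    print(wrap_aware_remaining(stop_limit, next_offset))  # 8296
```

    When both values sit on the same side of the wrap boundary the two functions
    agree; they diverge only once one of them has wrapped, which is the case a
    SQL-level formula cannot detect on its own.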
    
    Approach 2: (MaxMultiXactOffset - members)
    ------------------------------------------
    I also tested:
    
        remainingMembers = MaxMultiXactOffset - members;
    
    Across three exhaustion cycles:
    
    1st attempt:
    
        postgres=# SELECT num_mxids,num_members,remaining_members
                   FROM pg_get_multixact_stats();
         num_mxids | num_members | remaining_members
        -----------+-------------+-------------------
         125098473 |  4294914940 |             52355
        (1 row)
    
    2nd attempt:
    
        postgres=# SELECT num_mxids,num_members,remaining_members
                   FROM pg_get_multixact_stats();
         num_mxids | num_members | remaining_members
        -----------+-------------+-------------------
         116285530 |  4294905729 |             61566
        (1 row)
    
    3rd attempt:
    
        postgres=# SELECT num_mxids,num_members,remaining_members
                   FROM pg_get_multixact_stats();
         num_mxids | num_members | remaining_members
        -----------+-------------+-------------------
         111973488 |  4294862592 |            104703
        (1 row)
    
    The system correctly rejected inserts in each cycle, but the computed
    “remaining” value increased between cycles. This seems to match the dynamic
    nature of offsetStopLimit, which appears to be recomputed after truncation:
    - based on the new oldestOffset
    - aligned back to the start of its segment
    - with one safety segment subtracted
    
    Because the stop boundary shifts depending on segment boundaries, the plain
    (Max − members) formula reflects alignment effects rather than actual remaining
    capacity.
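
    The three recomputation steps listed above can be sketched as follows. This
    is an illustration under stated assumptions, not the backend code: it
    assumes 32-bit offsets and 1636 members per page with 32 pages per SLRU
    segment (default 8 kB block size), and the function names are hypothetical.

```python
# Sketch only: segment geometry below is an assumption for default builds.
OFFSET_MOD = 2**32
MEMBERS_PER_SEGMENT = 1636 * 32  # assumed members per page * pages per segment


def stop_limit(oldest_offset: int, per_segment: int = MEMBERS_PER_SEGMENT) -> int:
    """Align oldest_offset down to its segment start, then back off one segment."""
    segment_start = oldest_offset - (oldest_offset % per_segment)
    return (segment_start - per_segment) % OFFSET_MOD


def members_until_stop(oldest_offset: int,
                       per_segment: int = MEMBERS_PER_SEGMENT) -> int:
    """Members that fit between oldest_offset and the stop limit (wrap-aware)."""
    return (stop_limit(oldest_offset, per_segment) - oldest_offset) % OFFSET_MOD


if __name__ == "__main__":
    # Same segment, different position inside it: usable capacity differs.
    print(members_until_stop(0))      # 4294914944
    print(members_until_stop(52351))  # 4294862593
```

    Under these assumptions the usable capacity varies by almost one full
    segment depending on where oldestOffset falls inside its segment, which is
    the kind of alignment effect a fixed (Max − members) formula cannot see.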
    
    Understanding
    =============
    Based on reading the relevant parts of multixact.c and observing the runtime
    behavior, both approaches seem to run into limitations when trying to derive a
    “remaining members” value outside the backend. I may be missing details, but the
    behavior I observed suggests that a reliable computation might require
    duplicating several internal mechanisms, including:
    - wrap-aware offset comparison
    - SLRU page and segment alignment rules
    - SetOffsetVacuumLimit’s segment recalculation
    
    Without accounting for those, the derived numbers behaved inconsistently across
    tests, sometimes staying at 0 until a large jump, and in other cases increasing
    between exhaustion cycles. This seems broadly consistent with your concern that
    simple arithmetic on these counters does not match how the backend determines
    wraparound risk.
    
    To be clear, this interpretation is based only on what I could infer from the
    code and testing, and I may not be capturing the entire picture. But from what I
    observed, a user-visible “remaining members” metric does not seem
    straightforward without exposing or replicating backend logic.
    
    My thoughts
    ===========
    Given all this, the cleanest approach appears to be not exposing a “remaining
    members” counter directly. PostgreSQL has historically avoided exposing
    remaining-capacity counters for wraparound-limited resources such as:
    - transaction IDs
    - MultiXact IDs
    - OIDs
    
    Instead, PostgreSQL exposes current usage and relies on documented thresholds
    for monitoring. Following that established pattern avoids tying a SQL-visible
    interface to backend internals that may evolve over time.
    
    Self-monitoring based on documented limits
    ==========================================
    Monitoring then follows the same pattern PostgreSQL already uses for XIDs and
    other wraparound-limited values:
    - track num_members growth over time
    - warn when it exceeds roughly 2^31
    - treat values approaching 2^32 as exhaustion-risk territory
    - observe the growth rate to estimate when intervention may be needed
    
    This keeps the interface simple, stable, and aligned with existing PostgreSQL
    behavior.
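
    A monitoring script built on these rules might look like the sketch below.
    The 2^31 and 2^32 thresholds come from the list above; the 90% cutoff for
    “approaching 2^32”, the function names, and the sampling interval are
    hypothetical choices made only for illustration.

```python
# Sketch only: thresholds other than 2**31 and 2**32 are illustrative assumptions.
from typing import Optional

WARN_THRESHOLD = 2**31
HARD_LIMIT = 2**32
CRITICAL_THRESHOLD = HARD_LIMIT * 9 // 10  # hypothetical "approaching 2^32" cutoff


def classify(num_members: int) -> str:
    """Map a num_members reading to a coarse alert level."""
    if num_members >= CRITICAL_THRESHOLD:
        return "critical"
    if num_members >= WARN_THRESHOLD:
        return "warn"
    return "ok"


def seconds_until_limit(prev_members: int, curr_members: int,
                        interval_seconds: float) -> Optional[float]:
    """Linear estimate of time until 2**32 from two samples; None if not growing."""
    growth = curr_members - prev_members
    if growth <= 0:
        return None
    return (HARD_LIMIT - curr_members) * interval_seconds / growth


if __name__ == "__main__":
    # Two hypothetical num_members samples taken 60 seconds apart.
    print(classify(2**31 + 2**20))
    print(seconds_until_limit(2**31, 2**31 + 2**20, 60.0))
```

    The estimate is a straight-line extrapolation, so it is only as good as the
    assumption that member growth stays steady between samples.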
    
    Why oldest_offset was removed
    =============================
    Both you and Ashutosh pointed out that oldest_offset reflects internal SLRU
    geometry and is not actionable without reproducing backend logic. Combined with
    the behavior seen in the experiments above, it made sense not to expose this
    field in the user-visible API. It is removed in v11.
    
    Final shape of the function (v11)
    =================================
    The function now returns:
    - num_mxids
    - num_members
    - members_size
    - oldest_multixact
    
    These fields are stable, directly interpretable, and do not depend on SLRU
    internals or wrap-aware arithmetic.
    
    On Thu, Oct 16, 2025 at 9:10 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
    > Here’s the updated v10 patch, now including access/htup_details.h in
    > src/backend/utils/adt/multixactfuncs.c.
    
    Thank you!
    
    On Thu, Oct 16, 2025 at 7:28 PM torikoshia <torikoshia@oss.nttdata.com> wrote:
    >
    > Could you please update the patch to fix this?
    
    Thank you for raising this and bringing it to my attention!
    
    Attached is v11.
    
    Best regards,
    Naga