Thread

  1. Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring

    Naga Appani <nagnrik@gmail.com> — 2025-12-06T17:52:57Z

    Hi Ashutosh,
    
    Thanks for the review!
    
    I agree - comparing the exposed members_size against the documented
    thresholds is sufficient for monitoring purposes.
    
    This aligns with the approach taken in v11: exposing the current usage in
    a way consistent with other PostgreSQL counters (e.g., XIDs, OIDs), without
    introducing user-visible remaining-capacity calculations whose behavior is
    inconsistent and difficult to interpret externally. In the same spirit, I
    removed oldest_offset: as we discussed, it is internal and does not
    provide an actionable signal to users.
    
    If this addresses the concerns raised so far, I would appreciate
    consideration in moving v11 forward for commit.
    
    On Mon, Nov 10, 2025 at 12:13 AM Ashutosh Bapat
    <ashutosh.bapat.oss@gmail.com> wrote:
    >
    > On Wed, Nov 5, 2025 at 6:43 AM Naga Appani <nagnrik@gmail.com> wrote:
    > >
    > > Understanding
    > > ============
    > > Based on reading the relevant parts of multixact.c and observing the runtime
    > > behavior, both approaches seem to run into limitations when trying to derive a
    > > “remaining members” value outside the backend. I may be missing details, but the
    > > behavior I observed suggests that a reliable computation might require
    > > duplicating several internal mechanisms, including:
    > > - wrap-aware offset comparison
    > > - SLRU page and segment alignment rules
    > > - SetOffsetVacuumLimit’s segment recalculation
    > >
    > > Without accounting for those, the derived numbers behaved inconsistently across
    > > tests, sometimes staying at 0 until a large jump, and in other cases increasing
    > > between exhaustion cycles. This seems broadly consistent with your concern that
    > > simple arithmetic on these counters does not match how the backend determines
    > > wraparound risk.
    > >
    > > To be clear, this interpretation is based only on what I could infer from the
    > > code and testing, and I may not be capturing the entire picture. But from what I
    > > observed, a user-visible “remaining members” metric does not seem
    > > straightforward without exposing or replicating backend logic.
    >
    > Right now MultiXactOffsetWouldWrap() assesses if the given distance is
    > higher than the permitted distance between start and boundary. I think
    > we could instead change it to report the permitted distance based on
    > start and boundary; use it to report remaining space (after
    > multiplying it with bytes per member) and also use it to assess
    > whether the required distance is within that boundary or whether we
    > need a warning. But ...
    > On Sat, Oct 18, 2025 at 4:48 PM Tomas Vondra <tomas@vondra.me> wrote:
    > >
    > > Thanks for working on this. I'm wondering if this is expected / could
    > > help with monitoring for "space exhaustion" issues, which we currently
    > > can't do easily, as it's not exposed anywhere.
    > >
    > > This is in multixact.c at line ~1177, where we do this:
    > >
    > >     if (MultiXactState->oldestOffsetKnown &&
    > >         MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
    > >                                  nextOffset, nmembers))
    > >     {
    > >         ereport(ERROR, ...
    > >     }
    > >
    > > But I'm not sure the current patch exposes enough information to
    > > calculate how much space remains - calculating that requires
    > > offsetStopLimit and nextOffset.
    >
    > The function exposes the number of existing members and the amount of
    > space they consume (members_size). The documentation mentions
    > space-related thresholds of 10GB and 20GB. Isn't comparing members_size to
    > these thresholds enough to take appropriate action? If so, we could
    > report the difference between these respective thresholds and
    > members_size as a metric of space remaining before a given threshold
    > is triggered.
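
    As a rough illustration of the comparison described above, here is a
    minimal sketch in C of what a monitoring tool could do with the reported
    members_size. The helper name, the macro names, and the reading of "GB"
    as 2^30 bytes are assumptions made for this example; none of them come
    from the patch or from multixact.c.

```c
#include <stdint.h>

/* Assumption: 1 GB is treated as 2^30 bytes for this sketch. */
#define BYTES_PER_GB            (1024LL * 1024LL * 1024LL)

/* The 10GB and 20GB figures are the documented thresholds cited above. */
#define MEMBERS_LOWER_THRESHOLD (10LL * BYTES_PER_GB)
#define MEMBERS_UPPER_THRESHOLD (20LL * BYTES_PER_GB)

/*
 * Hypothetical helper: bytes left before members_size reaches the given
 * threshold. Returns 0 once the threshold has been reached or exceeded,
 * so the result never goes negative.
 */
static int64_t
members_headroom(int64_t members_size, int64_t threshold)
{
    if (members_size >= threshold)
        return 0;
    return threshold - members_size;
}
```

    A monitoring query would feed the reported members_size into
    members_headroom() once per threshold and alert when the result gets
    small. This is plain subtraction against fixed thresholds, so it does
    not attempt to reproduce the wrap-aware logic in the backend.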
    
    Best regards,
    Naga