Thread

  1. Re: Row pattern recognition

    Henson Choi <assam258@gmail.com> — 2026-05-29T01:28:29Z

    Hi Jian,
    
    Thanks for the careful read of execRPR.c and the NFA -- several of these
    landed on real gaps in our comments/docs. Going through your points
    inline below, with a summary of decisions at the end.
    
    
    > In src/include/nodes/execnodes.h, we're adding quite a few fields to
    > WindowAggState that are only used for RPR queries.
    > Should we consolidate these fields behind a single pointer (named
    > RPRContext) to keep the WindowAggState size smaller for non-RPR
    > queries?
    
    The size win is real -- it's roughly 450-500 bytes per WindowAggState
    that every non-RPR window query carries today. But it takes a wide
    code change to get there: it reshapes every RPR access path, a
    sizable (if mostly mechanical) diff across nodeWindowAgg.c,
    execRPR.c, and the explain side.
    
    Tatsuo, as co-author -- do you want this in v48? If you do, I'll
    prepare the RPRContext consolidation as an incremental patch for you to
    fold into v48. It isn't blocking either way.
    
    
    > In function get_reduced_frame_status, I made the following changes:
    > [conditionA..D probe elided]
    > I re-ran the regression tests and noticed it's entirely possible for
    > multiple conditions to be true at once. Because of this, all the IF
    > blocks in get_reduced_frame_status (after the initial one) are
    > order-dependent. If I move one IF statement above another,
    > get_reduced_frame_status may return a different result.
    > Is this the expected behavior? It seems like a potential logic flaw.
    
    You're right that the conditions aren't mutually exclusive and that
    reordering the branches changes the result -- but that's the
    cascade-with-early-return idiom, standard in C and used throughout the
    backend, not a logic flaw. Each branch is the minimal test given the
    negations the earlier returns have already established.
    
    heapam_visibility.c is the canonical example: the HeapTupleSatisfies*
    family cascades through ordered early returns, with the later XMAX
    branches correct only because the XMIN-committed invariant above holds.
    
    The order is correct, not just idiomatic: update_reduced_frame() only
    ever records (matched, length) as (false, 1), (true, 0), or (true, >=1),
    and A->B->C->D is the order that classifies exactly those three cases --
    reorder it and one of them gets misclassified.
    
    I'll add short "by here, ..." comments between the branches to make the
    invariants visible, but keep the structure. If you feel strongly it
    should be restructured, I'd defer to Tatsuo on that.
    
    
    > "boundary chk" should be "boundary check".
    > column "match_start dep.", I don't understand the meaning of "direct"
    > and "boundary check".
    > "count-dominance" seems like an invented word, but I could not find
    > the explanation.
    > "match_start-dependent" occurred many times, it would also deserve an
    > explanation. We can also rename it as match_start_dependent.
    
    All fair. v48 fixes the typo and adds a terminology block to README.rpr
    defining count-dominance (the VIII-3 cover condition, named there),
    match_start_dependent (renamed to the underscore spelling, to match the
    defineMatchStartDependent identifier), and the "direct" vs "boundary
    check" navigation distinction.
    
    
    > Attached is a minor refactoring of ExecRPRProcessRow, no need for
    > boolean hasLimitedFrame.
    > ``if (hasLimitedFrame && winstate->endOffsetValue != 0)`` seems wrong
    > to me, endOffsetValue is a Datum, and you directly compare it with 0.
    > see also calculate_frame_offsets.
    
    Good eye on the Datum comparison, and I agree your version reads more
    clearly -- gating on the frame-option flag and going through
    DatumGetInt64() rather than comparing a Datum to 0, plus the two new
    Asserts. (The existing code is correct, since hasLimitedFrame carries
    the limited/unbounded distinction separately, but the refactor is the
    clearer shape.) I'll take it.
    
    One thing to watch when dropping hasLimitedFrame, though: a zero offset
    is still a valid limited frame -- ROWS BETWEEN CURRENT ROW AND CURRENT
    ROW, and ... AND 0 FOLLOWING -- so the new check must not treat offset 0
    as unbounded, otherwise the boundary check is skipped and a quantified
    pattern (A+) silently absorbs the whole partition. The limited/unbounded
    distinction lives in the frame-option flag, not the offset value, so the
    refactor has to keep that flag in the picture.
    
    We don't currently cover the offset-0 quantified case -- the two
    zero-offset frame tests in rpr_base both use a plain PATTERN (A) -- so
    I'll add an A+ regression test for it.
    
    
    > [nfa_evaluate_row's window_gettupleslot returning false]
    > indicates that it's not possible for (currentPos > ctxFrameEnd).
    > therefore [if (currentPos >= ctxFrameEnd) ...]
    > should change to ``if (currentPos == ctxFrameEnd)``.
    > With that, the comment wording "exceeded" seems wrong too.
    
    Agreed. `>` is unreachable by the loop invariant -- currentPos advances
    by exactly one per row, and once a context is finalized the
    states == NULL guard skips it -- so >= and == behave identically here,
    and == states the intent better. The `>=` was a defensive guard against
    an overshoot that can't actually happen; v48 tightens it to ==, changes
    the comment from "exceeded" to "reached", and moves that defense into an
    Assert(currentPos <= ctxFrameEnd) so a future change that breaks the
    invariant trips immediately instead of silently.
    
    
    > Since I couldn't generate a coverage report, I am using elog(INFO) to
    > test reachability. Rerun the regress tests, you will find out that the
    > ELSE IF branch below is not reached by current regress tests.
    >     else if (targetCtx->states == NULL)
    >     {
    >         /* Context already completed - skip to result registration */
    >         goto register_result;
    >         elog(INFO, "XXXX reached");
    >     }
    
    The elog in your snippet sits *after* the goto, so it can never print
    even though the branch runs. Moving it above the goto shows it firing
    across the suite. Easy mistake; I've put a probe right after a
    goto/return more times than I'd like to admit.
    
    The branch is correct as written: under SKIP TO NEXT ROW an overlapping
    context can finish (and be preserved by cleanup, since
    matchEndRow >= matchStartRow) before the next update_reduced_frame()
    call lands on its start row. So no new test is needed -- I'll just add a
    comment above it explaining why the head context can already be complete
    there.
    
    
    Summary of decisions, in the order above:
    
    
      RPRContext consolidation         Tatsuo's call -- non-blocking for v48
      get_reduced_frame_status order   Keep -- cascade idiom; add comments
      README terminology + typo        Accept -- definitions block + rename
      ExecRPRProcessRow refactor       Accept -- but fix frameOffset
      currentPos >= -> ==              Accept -- plus comment + Assert
      states == NULL branch coverage   Already reached (155x) -- add comment
    
    
    I'll post an incremental patch shortly with the accepted changes and the
    comment additions; the RPRContext question is for Tatsuo.
    
    Thanks again,
    Henson