Thread
-
Re: Row pattern recognition
Henson Choi <assam258@gmail.com> — 2026-05-27T05:49:44Z
Hi jian, Tatsuo, Thanks for the thorough first read of execRPR.c and the NFA. The depth of understanding you reached on a fairly intricate body of code in a short window is impressive -- and several of the open questions in your review pointed straight at documentation and comment debt on our side that needed exposing. First-reviewer feedback of that kind is especially valuable, because the gaps it surfaces are exactly the ones the next reader would otherwise have to re-discover; closing them now puts everyone reading the code afterwards on firmer ground. Would be glad to keep working together on this area. > Disclaimer: I had some private off-list discussions with Henson Choi > regarding execRPR.c, and the NFA. > The following are more formal comments regarding execRPR.c and the NFA > algorithm, based on v47-0001 to v47-0009, I haven't applied the > incremental diff yet. > (Since the incremental diffs are scattered across several threads, a > unified v48 would be better). > (I'm still wrapping my head around the NFA in execRPR.c, so take the > comments below with a grain of salt.) Agreed on the unified v48 -- both the scattered incrementals and the responses to your review will be part of that series. The responses below will land alongside their corresponding change (README, comment, test, or commit) in that series rather than as a free-standing reply that defers the actual fixes. > PATTERN (A B C D) > We can short-circuit and exit early if any of the evaluations (A, B, > or C) fail in nfa_match. > This is necessary since the chance of a pattern element evaluation > returning false is not rare, I think. Agreed on the optimization intent. A slightly different shape worth considering: rather than eager short-circuit in nfa_match, defer DEFINE predicate evaluation to first use -- varMatched[i] computed on demand and cached per row, so pruned paths never pay for predicates they didn't reach. The code change is small; the part needing care is whether "not evaluating" is safe for every predicate (navigation state, slot setup, side effects). The current code or the standard may already enforce constraints that make lazy evaluation naturally safe, but I'd rather not act on that assumption without concrete grounding in what the code actually enforces or in test coverage that pins it down. Would you be interested in driving the discussion in this thread? You've been deep in execRPR.c recently, so the trade-offs sit closest to you, and I think Tatsuo will land a good conclusion once they're on the table. Happy to support the discussion in whatever way is useful as it converges. > For src/backend/executor/README.rpr: > We should explicitly explain 'absorbable' and 'absorption' somewhere in > README.rpr, as the text currently just assumes the reader knows what they mean. > Using some example illustrate "absorption" meaning, put it on > README.rpr would be great. > We can also mention that 'DFS' refers to Depth-First Search". Acknowledged, and the request surfaced an underlying problem in the README's terminology. "Absorption" is currently used for two distinct things: an AST-level rewrite in Phase 1 that pulls identical sequences around a group inside it, and the runtime context-equivalence collapse that drives the O(n^2) -> O(n) optimization. Sharing the word leaves a reader encountering "absorbable" early on without an anchor. Rather than disambiguate by qualifier ("prefix/suffix absorption" vs "context absorption"), I'd lean toward renaming the AST-level case so "absorption" stays reserved for the runtime concept. The README then only needs to explain absorption in one place, in detail, without the disambig preamble. For the rename, "prefix/suffix merging" feels like the natural fit -- the other AST-level optimizations in the same Phase 1 are already named "consecutive variable / group / ALT merging", so it slots in cleanly. "Prefix/suffix factoring" is another candidate if a more descriptive verb is preferred. Tatsuo, curious what you think of this direction and naming. Happy to take any name you prefer for the AST-level operation, or to keep the original "absorption" wording with stronger forward-references if you'd rather not rename. For the absorption explanation itself in README.rpr, the diagnosis I'd offer is that Chapter VIII already carries the necessary content -- the issue is narrative order. VIII-1 leads with the O(n^2) problem framing, so a reader meets the cost shape before meeting the intuition for why absorption is possible, and has to carry the problem until VIII-2's monotonicity argument finally lands. Beyond that, VIII-2 stays abstract; there is no row-by-row trace showing two states being judged equivalent. Two small additions seem to close most of the gap: (A) A 4-5 line intuition summary at the top of Chapter VIII, before VIII-1, naming what absorption is (collapsing contexts that have converged on identical future behavior) and the monotonicity principle that makes it safe. This gives the reader an anchor before the problem framing. (B) A short worked example at the end of VIII-2: a PATTERN (A+) trace over a few rows showing each new context being absorbed by Context_1 once its (elemIdx, depth-0 count) is dominated. Concrete state/count comparisons make the abstract solution land. Curious if this read of the gap matches what tripped you up, and whether (A) + (B) feel sufficient. Happy to draft both as part of the v48 README changes. For DFS, will expand it to "Depth-First Search (DFS)" at the first occurrence. > `````` > (4) Call nfa_advance(initialAdvance=true) > `````` > In V47, the variable `initialAdvance` does not exist. Leftover from an earlier patch version -- the boolean parameter was refactored away and the README notation wasn't updated. I'll bring it in line with the current signature. > In nfa_advance_var, after the first Assert, we can add: > Assert(elem->next < pattern->numElements); Agreed. Will add it right after Assert(canLoop || canExit) in nfa_advance_var, with a >= 0 lower bound tacked on while there (RPRElemIdx is signed int16, INVALID = -1): Assert(elem->next >= 0 && elem->next < pattern->numElements); > ExecRPRFinalizeAllContexts seems unnecessary; I commented it out, > rerun the regress tests > (TESTS='test_setup rpr_base rpr_nfa rpr_explain rpr_integration rpr' > meson test -C $BUILD3 --num-processes 20 --suite regress --verbose) > Only two SQL tests in rpr_explain.sql failed. Reproduced this. You're right on correctness and memory: data rows are identical with the call removed, and release_partition's MemoryContextReset reclaims memory anyway. Finalize isn't really about handling matches. By the time the partition ends, all genuine FIN reaches have already been recorded in-flight. Its job is to kill any VAR states still pursuing when rows run out, so cleanup sees a uniform ctx->states == NULL across every context. Three shapes survive there: - Pure pursuit (no matchedState, e.g., A+ B mid-pattern). - Empty-match candidate + pursuit (matchedState set with matchEndRow < matchStartRow -- e.g., greedy A* with no successful matches yet, while VAR is still chasing a longer non-empty one). - Real match + pursuit (matchedState set with matchEndRow >= matchStartRow -- e.g., greedy A* with some matches recorded and still looping for a longer one). The first two get reclassified as failures by cleanup; without Finalize they linger without contributing to stats. The third is stat-neutral -- cleanup skips it either way -- but goes through the same uniform path so partition-end classification stays centralized. The classification surfaces today only via rpr_explain stats, but becomes user-visible once we extend the R020 surface or move into R010 -- MEASURES and eventual R010 hooks count matches based on this classification. Worth locking in now, and an explicit partition-end stage is structurally cleaner than scattering the logic. Plan: keep the call and reframe it as the partition-end classification policy holder. Strategically, that gives the future partition-end hooks a single anchor to extend, instead of growing scattered end-of-partition paths. > SELECT first_value(id) OVER w AS match_start FROM stock_ticks > WINDOW w AS ( ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED > FOLLOWING AFTER MATCH SKIP PAST LAST ROW PATTERN ((A B) {2}) DEFINE A > AS price < 100, B AS price < 100); > > The query above invokes the following code. Since the PATTERN above is > not greedy, is the comment below incorrect? > `````` > else > { > /* Greedy: enter first, skip second */ > ... > } > `````` The comment is misleading; the code is correct. Two pieces: First, greedy vs reluctant: '?' plays two distinct roles in our grammar. As a quantifier on its own (A?, (A B)?) it means "optional" -- equivalent to {0,1}. As a suffix on another quantifier (A+?, (A B)*?, {n}?, etc.) it makes that quantifier reluctant. So {n} without a trailing '?' is greedy; the {n}? form is reluctant. PATTERN ((A B){2}) is in fact greedy. Second, why the branch is entered: nfa_advance_begin's else arm handles two cases at once: (a) Greedy with optional group: skipState != NULL, not reluctant. "Enter the group; also create the skip path." (b) Non-nullable group (min > 0, regardless of greedy/reluctant): skipState stays NULL, the outer guard "if (skipState != NULL && RPRElemIsReluctant(elem))" falls through, and the inner "if (skipState != NULL)" prevents the skip-path action from running. (A B){2} has min = max = 2, so it lands in (b) -- the action that actually runs is "enter the group", no skip path. The current label only describes (a), which is why it reads wrong for your test query. Plan is to rewrite the comment along the lines of "Greedy-or-non-nullable: route to the first child; for optional groups (skipState != NULL), additionally create the skip path." > nfa_advance_var > ``` > else if (canExit) > { > ... > } > ``` > The above ELSE IF overrides all RPRNFAState field values except > RPRNFAState->next. > Should we set RPRNFAState->next to NULL? > (If I add ``state->next = NULL;`` in the above ELSE IF branch, all the > regress tests still pass) Good catch on the asymmetry. Tracing it through, state->next is actually already NULL at every branch you flagged: nfa_advance resets it just before crossing into nfa_advance_state, and the intermediate branches don't disturb it. Your experiment passing with an added "state->next = NULL" is consistent with that -- the assignment is redundant rather than load-bearing. The contract that keeps state->next sane lives at two concentrated points (nfa_advance entry, nfa_add_state_unique linking), and the branches in between are pass-through. Sprinkling the same reset at every branch would be defensive noise rather than a real safety net, so I'd leave the branches alone. Happy to add a short comment near nfa_advance's reset marking it as the boundary contract, so the next reader doesn't trip on the same question. > For function nfa_advance_var, I don't understand the meaning of the > variable "count", after the first Assert I have added below: > ... > Rerunning the regress tests shows that count >= 3 occurs very infrequently. > ... > Can we add more complex queries (more count >= 3) to check if the > "count" variable is working correctly? The "count" semantics will read more cleanly once the absorption README work above lands (counts[d] = iteration count at nesting depth d). For coverage I'll add a nested reluctant quantifier to rpr_nfa (e.g., PATTERN ((A B){3,5}? C)) to drive count through the 3..5 band repeatedly. (rpr_nfa is the suite that already targets Quantifier Runtime Behavior and Absorption Optimization.) > In function nfa_add_state_unique: > /* Mark VAR in visited before duplicate check to prevent DFS loops */ > ... > I honestly don't understand the purpose of the code block above. But it doesn't > seem to influence the subsequent FOR LOOP; > ... > Could we add some comments explaining which external functions rely on > this code and why it belongs in nfa_add_state_unique? The code is correct, but the contract is split across two functions and currently only one side points to the other. The visited marking scheme is asymmetric on purpose: - Non-VAR elements (END/ALT/BEGIN/FIN) are marked on entry to nfa_advance_state because epsilon cycles must be prevented immediately. - VAR elements are marked later, in nfa_add_state_unique, only when added to the state list. That delay is intentional: it keeps legitimate quantifier loop-back to the same VAR across iterations possible. The paired cycle check sits in nfa_advance_state, and the asymmetric-marking rationale is documented there. What's missing is the back-reference from nfa_add_state_unique. I'll add a single line at the marking site pointing back to nfa_advance_state. > nfa_states_equal > compareDepth = elem->depth + 1; /* depth 0 needs 1 count, etc. */ > The comment above isn't helpful, IMHO, and I don't understand it. > We should focus on why compareDepth should be ```elem->depth + 1```. Agree the trailing comment is too terse. Two pieces are missing: (a) The +1 arithmetic: to compare counts up to depth N, we need slots counts[0..N], which is N+1 entries. (b) Why deeper slots are excluded: counts[d > elem->depth] are scratch state from deeper groups and get reset on re-entry, so they must not participate in equivalence judgment. Two states sharing elemIdx are equivalent iff all enclosing-or-current depth counts match. I'll replace the trailing comment with a small block covering both pieces. > function nfa_add_state_unique return value is not being used? > Do we need to do something with the return value, or is this expected? > (I don't have an opinion on it, I guess it would be better to raise this issue) Leftover from an earlier design -- the duplicate case is fully handled inside the function (the state is freed and nfaStatesMerged is incremented), so callers have nothing to branch on, and indeed none of them do. Will change the signature from bool to void and drop the return statements. > In nfa_advance_alt, during the main WHILE loop, I think altElem->depth > must be larger than elem->depth. > Therefore we can do > `````` > if (altElem->depth == elem->depth) > elog(ERROR, "nfa_advance_alt altElem->depth should not be > the same as elem->depth reached"); > if (altElem->depth < elem->depth) > break; > `````` I had to push back on this one. Tracing the depth bookkeeping: - For an ALT at depth D, branches sit at depth D+1, and each branch's first element has .jump pointing to the next branch's first (set in fillRPRPatternAlt). So the walk normally terminates when the last branch's .jump = INVALID -- the depth check doesn't fire at all. - But when the last branch is a quantified group, its first element is a BEGIN whose .jump = past-END (set by fillRPRPatternGroup and not overridden for the last branch). The walk then steps to a post-ALT element, and the depth check is what stops it from creating a stray state out there. That post-ALT element has depth <= D: * D-1 if the ALT is inside an enclosing group with a non-trivial quantifier, e.g., PATTERN ((A | (B C)+){2}) -- post-ALT lands on the outer END at depth 0, ALT at depth 1. (A {1,1} outer wrap gets removed by single-child unwrap, so it has to be a real quantifier.) * D if the ALT has a sibling at the same level, e.g., PATTERN (A | (B C)+) at top level -- post-ALT is FIN at depth 0, matching the ALT's depth 0. So "altElem->depth == elem->depth" is a legitimate end-of-walk signal for the quantified-group-last-branch case, not an invariant violation. Treating it as an error would misfire on patterns like A | (B C)+. The current "if (altElem->depth <= elem->depth) break;" in nfa_advance_alt is intentionally <= and not <, and the looser comparison is correct. Happy to add a brief comment there noting the trigger condition, if it would help future readers. Summary of decisions, in the order above: Short-circuit optimization Separate series -- invite you to drive Absorption README narrative Accept -- Chapter VIII summary + example AST-level "absorption" rename Pending Tatsuo's call -- prefix/suffix merging? DFS expansion Accept initialAdvance README mismatch Accept -- align with current signature Defensive Assert in advance_var Accept -- also add lower bound Finalize unnecessary? Keep -- partition-end policy holder Greedy comment label Accept -- rewrite to cover both cases state->next reset Decline -- boundary contract covers it count >= 3 test coverage Accept -- add to rpr_nfa visited marking purpose Accept -- add back-reference comment compareDepth comment Accept -- rewrite with intent Unused bool return Accept -- change to void ALT depth invariant Assert Decline -- end-of-walk signal, not invariant That's the full pass. The actual patches (nocfbot-0016 onward) will follow shortly as a separate submission, for another review round. Thanks, Henson