nocfbot-0038-Move-RPR-design-notes-to-README.rpr.txt

text/plain

Filename: nocfbot-0038-Move-RPR-design-notes-to-README.rpr.txt
Type: text/plain
Part: 37
Message: Re: Row pattern recognition
From f086f06ab0fda30d98dd0435c4decad30dc400f4 Mon Sep 17 00:00:00 2001
From: Henson Choi <assam258@gmail.com>
Date: Sat, 2 May 2026 01:05:04 +0900
Subject: [PATCH 38/40] Move RPR design notes from execRPR.c to README.rpr

The "Flat-Array Stream NFA Guide" (Chapters I-XII plus Appendix A-C)
was carried as a single 1580-line C comment block at the top of
execRPR.c, accounting for nearly half the file's length.  Move it
verbatim to a new src/backend/executor/README.rpr, following the
same plain-tex convention as the existing executor/README, and
leave a four-line pointer comment in execRPR.c.

All chapter cross-references in the document are intra-document
("Chapter IV", "Chapter VIII", "Chapter XII", etc.) and no other
source or comment refers to them, so extracting the block whole
introduces no dangling references.

Code-organization change; no functional or test impact.
---
 src/backend/executor/README.rpr | 1578 ++++++++++++++++++++++++++++++
 src/backend/executor/execRPR.c  | 1580 +------------------------------
 2 files changed, 1580 insertions(+), 1578 deletions(-)
 create mode 100644 src/backend/executor/README.rpr

diff --git a/src/backend/executor/README.rpr b/src/backend/executor/README.rpr
new file mode 100644
index 00000000000..52bcd77390c
--- /dev/null
+++ b/src/backend/executor/README.rpr
@@ -0,0 +1,1578 @@
+============================================================================
+  PostgreSQL Row Pattern Recognition: Flat-Array Stream NFA Guide
+============================================================================
+
+  Target audience: Developers with a basic understanding of the PostgreSQL
+                   executor and planner architecture
+
+  Scope: The entire process from PATTERN/DEFINE clause parsing to NFA
+         runtime execution
+
+  Related code:
+    - src/backend/parser/parse_rpr.c          (parser phase)
+    - src/backend/optimizer/plan/rpr.c        (optimizer phase)
+    - src/backend/executor/nodeWindowAgg.c    (executor phase, window agg)
+    - src/backend/executor/execRPR.c          (executor phase, NFA engine)
+    - src/include/executor/execRPR.h          (NFA public API)
+    - src/include/nodes/plannodes.h           (plan node definitions)
+    - src/include/nodes/execnodes.h           (execution state definitions)
+    - src/include/optimizer/rpr.h             (types and constants)
+    - src/backend/optimizer/plan/createplan.c (nav offset computation)
+
+============================================================================
+
+What is a Flat-Array Stream NFA?
+
+  The NFA in this implementation is not a traditional state-transition graph
+  but a flat array of fixed-size 16-byte elements. At runtime, it processes
+  the row stream in a forward-only manner, expanding epsilon transitions
+  eagerly without backtracking.
+
+  - Flat-Array: Pattern compiled into a flat array,
+                not a graph (Chapter IV)
+  - Stream:     Rows consumed sequentially in one direction,
+                never revisited (Chapter XII)
+  - NFA:        Nondeterministic execution where multiple states
+                coexist within a single context (Chapter VI)
+
+Chapter I  Row Pattern Recognition Overview
+============================================================================
+
+Row Pattern Recognition (hereafter RPR) is a feature introduced in SQL:2016
+that matches regex-based patterns against ordered row sets.
+
+The SQL standard defines two forms:
+
+  Feature R010: MATCH_RECOGNIZE (FROM clause)
+    - Dedicated table operator
+    - Provides dedicated functions such as MATCH_NUMBER(), CLASSIFIER()
+    - Supports ONE ROW PER MATCH / ALL ROWS PER MATCH
+
+  Feature R020: RPR in a window (WINDOW clause)
+    - Integrated into the existing window function framework
+    - Supports ALL ROWS PER MATCH only
+    - No MATCH_NUMBER()
+
+This implementation targets Feature R020.
+
+The basic syntax is as follows:
+
+  SELECT ...
+  OVER (
+    PARTITION BY ...
+    ORDER BY ...
+    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+    [INITIAL | SEEK]   -- SEEK is defined in the standard but not implemented
+    AFTER MATCH SKIP TO NEXT ROW | SKIP PAST LAST ROW
+    PATTERN ( <regex> )
+    DEFINE <variable> AS <condition>, ...
+  )
+
+The PATTERN clause is a regular expression over row pattern variables.
+The DEFINE clause specifies boolean conditions that determine whether each
+variable evaluates to true for the current row.
+
+Example:
+
+  PATTERN (A+ B)
+  DEFINE A AS price > PREV(price),
+         B AS price < PREV(price)
+
+This pattern matches "a span where prices rise consecutively then drop."
+
+Chapter II  Overall Processing Pipeline
+============================================================================
+
+RPR processing is divided into three phases:
+
+  +------------------------------------------------------------+
+  |  1. Parsing (Parser)                                       |
+  |     SQL text -> PATTERN AST + DEFINE expression tree       |
+  |                                                            |
+  |  2. Compilation (Optimizer/Planner)                        |
+  |     PATTERN AST -> optimization -> flat NFA element array  |
+  |                                                            |
+  |  3. Execution (Executor)                                   |
+  |     Row-by-row matching via NFA simulation                 |
+  +------------------------------------------------------------+
+
+Each phase uses independent data structures, and the interfaces between
+phases are well-defined:
+
+  Parser -> Planner:    WindowClause.rpPattern (RPRPatternNode tree)
+                        WindowClause.defineClause (TargetEntry list)
+
+  Planner -> Executor:  WindowAgg.rpPattern (RPRPattern struct)
+                        WindowAgg.defineClause (TargetEntry list)
+
+Chapter III  Parsing Phase
+============================================================================
+
+III-1. Entry Point
+
+  transformWindowDefinitions() (parse_clause.c)
+    +-- transformRPR() (parse_rpr.c)
+
+transformRPR() is invoked when RPCommonSyntax is present and performs the
+following:
+
+  (1) Frame option validation
+      - Only ROWS is allowed (RANGE, GROUPS are not)
+      - The start boundary must be CURRENT ROW
+      - EXCLUDE option is not allowed
+
+  (2) Transcription to WindowClause
+      - Copies rpPattern, rpSkipTo, initial fields
+
+  (3) DEFINE clause transformation (transformDefineClause)
+
+III-2. PATTERN AST
+
+The parser transforms the PATTERN clause into an RPRPatternNode tree.
+Each node has one of the following four types:
+
+  RPR_PATTERN_VAR    Variable reference. Name stored in varName field.
+  RPR_PATTERN_SEQ    Concatenation. Children node list in children.
+  RPR_PATTERN_ALT    Alternation. Branch node list in children.
+  RPR_PATTERN_GROUP  Group (parentheses). Body node list in children.
+
+All nodes have min/max fields to express quantifiers:
+
+  A       -> VAR(A, min=1, max=1)
+  A+      -> VAR(A, min=1, max=INF)
+  A*      -> VAR(A, min=0, max=INF)
+  A?      -> VAR(A, min=0, max=1)
+  A{3,5}  -> VAR(A, min=3, max=5)
+
+If the reluctant field is true, the quantifier is reluctant (non-greedy).
+(RPRPatternNode.reluctant is bool; reluctant_location is the separate
+ParseLoc field holding the '?' token position, or -1 if absent.)
+
+Example: PATTERN ((A+ B) | C*)
+
+  ALT
+  +-- SEQ
+  |   +-- VAR(A, 1, INF)
+  |   +-- VAR(B, 1, 1)
+  +-- VAR(C, 0, INF)
+
+III-3. DEFINE Clause Transformation
+
+transformDefineClause() processes each DEFINE variable as follows:
+
+  (1) Checks for duplicate variable names
+  (2) Transforms the expression via transformExpr()
+  (3) Extracts Var nodes via pull_var_clause() and ensures each is
+      present in the query targetlist, so the planner propagates the
+      referenced columns through the plan tree
+  (4) Wraps in a TargetEntry with the variable name set in resname
+
+After all variables are processed:
+  (5) Coerces each expression to Boolean type (coerce_to_boolean)
+
+Variables that are used in PATTERN but not defined in DEFINE are implicitly
+evaluated as TRUE (matching all rows).
+
+Chapter IV  Compilation Phase
+============================================================================
+
+IV-1. Entry Point
+
+  create_windowagg_plan() (createplan.c)
+    +-- buildDefineVariableList()    Build variable name list from DEFINE
+    +-- buildRPRPattern()           NFA compilation (6 phases)
+
+IV-2. The 6 Phases of buildRPRPattern()
+
+  Phase 1: AST optimization (optimizeRPRPattern)
+  Phase 2: Statistics collection (scanRPRPattern)
+  Phase 3: Memory allocation (allocateRPRPattern)
+  Phase 4: NFA element fill (fillRPRPattern)
+  Phase 5: Finalization (finalizeRPRPattern)
+  Phase 6: Absorbability analysis (computeAbsorbability)
+
+IV-3. Phase 1: AST Optimization
+
+After copying the parser-generated AST, the following optimizations are
+applied:
+
+  (a) SEQ flattening: Unwrap nested SEQ nodes
+      SEQ(A, SEQ(B, C)) -> SEQ(A, B, C)
+
+  (b) Consecutive variable merging: Merge consecutive occurrences of the
+      same variable into a single quantifier
+      A A -> A{2}
+      A{2,3} A{1,2} -> A{3,5}
+
+  (c) Consecutive group merging: Merge repeated identical groups
+      (A B)+ (A B)+ -> (A B){2,INF}
+
+  (d) Consecutive ALT merging: Merge repeated identical ALT nodes
+      (A | B) (A | B) (A | B) -> (A | B){3}
+
+  (e) Prefix/suffix absorption: Absorb identical sequences before/after
+      a group
+      A B (A B)+ -> (A B){2,INF}
+
+  (f) ALT flattening and deduplication
+      (A | (B | C)) -> (A | B | C)
+      (A | B | A) -> (A | B)
+
+  (g) Quantifier multiplication: Collapse nested quantifiers when safe
+      (A+)+ -> A+
+      (A{2,3}){5} -> A{10,15}
+
+  (h) Single-child unwrap
+      SEQ(A) -> A,  (A){1,1} -> A
+
+IV-4. Phase 4: NFA Element Array Generation
+
+Transforms the optimized AST into a flat array of RPRPatternElement.
+This is the core data structure used for NFA simulation at runtime.
+
+RPRPatternElement struct (16 bytes):
+
+  Field      Size     Description
+  ---------------------------------------------------------
+  varId      1B      Variable ID (0-251) or control code (252-255)
+  depth      1B      Group nesting depth
+  flags      1B      Bit flags (see below)
+  reserved   1B      Padding
+  min        4B      Quantifier lower bound
+  max        4B      Quantifier upper bound
+  next       2B      Next element index (sequential flow)
+  jump       2B      Branch target index (for ALT/GROUP)
+
+Control codes:
+
+  RPR_VARID_BEGIN (252)  Group start marker
+  RPR_VARID_END   (253)  Group end marker
+  RPR_VARID_ALT   (254)  Alternation start marker
+  RPR_VARID_FIN   (255)  Pattern completion marker
+
+Element flags (1 byte, bitmask):
+
+  0x01  RPR_ELEM_RELUCTANT          (VAR, BEGIN, END)
+        Non-greedy quantifier.  Prefers shorter match: try exit-loop
+        first, then repeat.  Set on VAR for simple (A+?),
+        on BEGIN+END for group ((...)+?).
+
+  0x02  RPR_ELEM_EMPTY_LOOP         (END)
+        Group body can produce empty match (all children nullable).
+        Creates a fast-forward exit clone alongside the normal
+        loop-back so cycle detection doesn't kill legitimate
+        matches. (IV-4b)
+
+  0x04  RPR_ELEM_ABSORBABLE_BRANCH  (VAR, BEGIN, END, ALT)
+        Element lies within an absorbable region.  Used at runtime
+        to track whether the current NFA state is in an absorbable
+        context.
+
+  0x08  RPR_ELEM_ABSORBABLE         (VAR, END)
+        Absorption judgment point.  Where to compare consecutive
+        iterations for absorption.
+          - Simple unbounded VAR (A+): set on the VAR itself
+          - Unbounded GROUP ((A B)+): set on the END element only
+
+  Accessor macros:
+    RPRElemIsReluctant(e)        (e)->flags & 0x01
+    RPRElemCanEmptyLoop(e)       (e)->flags & 0x02
+    RPRElemIsAbsorbableBranch(e) (e)->flags & 0x04
+    RPRElemIsAbsorbable(e)       (e)->flags & 0x08
+
+Example: PATTERN (A+ B | C)
+
+  AST: ALT(SEQ(VAR(A,1,INF), VAR(B,1,1)), VAR(C,1,1))
+
+  Compilation result:
+
+  idx  varId  depth  min  max  next  jump  Description
+  ------------------------------------------------------------
+   0   ALT    0      1    1    1     -1    Alternation start
+   1   A(0)   1      1    INF  2     3     Branch 1: A+
+   2   B(1)   1      1    1    4     -1    Branch 1: B -> FIN
+   3   C(2)   1      1    1    4     -1    Branch 2: C -> FIN
+   4   FIN    0      1    1    -1    -1    Pattern completion
+
+  - idx 0: ALT marker. next(=1) is the start of the first branch
+  - idx 1: Variable A. next(=2) is B, jump(=3) is the start of the second
+           branch
+  - idx 2: Variable B. next(=4) is FIN
+  - idx 3: Variable C. next(=4) is FIN
+  - idx 4: FIN marker. Match completion signal
+
+Roles of next and jump:
+
+  - next: The next element to move to "after consuming" the current element.
+          For VAR, the next position after a successful match.
+          For BEGIN/END, the next position inside/outside the group.
+
+  - jump: The element to "skip to."
+          In ALT, a jump from one branch to the next branch.
+          In BEGIN, a skip path to END+1 (for groups with min=0).
+          In END, a loop-back to the start of the group body.
+
+Example: PATTERN ((A B)+)
+
+  idx  varId    depth  min  max  next  jump  Description
+  --------------------------------------------------------------
+   0   BEGIN    0      1    INF  1     4     Group start
+   1   A(0)     1      1    1    2     -1    A
+   2   B(1)     1      1    1    3     -1    B
+   3   END      0      1    INF  4     1     Group end
+   4   FIN      0      1    1    -1    -1    Pattern completion
+
+  - idx 0: BEGIN. next(=1) enters the group body.
+           jump(=4) skips to after END = FIN (used when min=0).
+  - idx 3: END. next(=4) exits the group.
+           jump(=1) loops back to the start of the group body.
+
+IV-4a. Reluctant Flag (RPR_ELEM_RELUCTANT)
+
+The reluctant flag is set during Phase 4 (fillRPRPattern) when the AST node
+has reluctant == true. It reverses the priority of quantifier expansion at
+runtime:
+
+  Greedy (default):  try loop-back first, then exit  (prefer longer match)
+  Reluctant:         try exit first, then loop-back   (prefer shorter match)
+
+The flag is set on all elements that carry the quantifier:
+
+  Simple VAR (A+?):     RPR_ELEM_RELUCTANT on the VAR element
+  Group ((...)+?):      RPR_ELEM_RELUCTANT on BEGIN and END elements
+
+At runtime (nfa_advance), the flag controls DFS exploration order:
+
+  VAR with quantifier:
+    Greedy:    primary path = next (continue), clone = jump (skip)
+    Reluctant: primary path = jump (skip), clone = next (continue)
+
+  END element:
+    Greedy:    primary path = jump (loop-back), clone = next (exit)
+    Reluctant: primary path = next (exit), clone = jump (loop-back)
+
+  BEGIN with min=0:
+    Greedy:    primary path = next (enter group), clone = jump (skip)
+    Reluctant: primary path = jump (skip), clone = next (enter group)
+
+The absorption optimization requires greedy quantifiers. Reluctant
+quantifiers are excluded from absorbability analysis (see IV-5).
+
+IV-4b. Empty Loop Flag (RPR_ELEM_EMPTY_LOOP)
+
+The empty-loop flag is set during Phase 4 (fillRPRPatternGroup) on the END
+element when the group body is nullable -- i.e., every path through the
+body can match zero rows (all children are nullable).
+
+Example patterns that trigger this flag:
+
+  (A?)*    A is nullable (min=0), so group body is nullable -> END gets flag
+  (A? B?)+ Both children nullable -> body nullable -> END gets flag
+  (A | B*) B* is nullable, making the ALT nullable -> END gets flag
+
+The flag works in conjunction with the empty match cycle detection
+(elemIdx visited bitmap). Without this flag, cycle detection alone would
+cause legitimate matches to fail.
+
+Problem example: (A*){2,3}
+  - Iteration 1: A* consumes all available rows -> count=1, END reached
+  - Loop-back for iteration 2: A* matches zero rows -> END reached again
+  - Cycle detection sees the same elemIdx on the same row -> state killed
+  - count never reaches min(2) -> match fails (incorrect)
+
+With the RPR_ELEM_EMPTY_LOOP flag, nfa_advance_end creates two paths:
+the normal loop-back (which cycle detection will eventually kill) and
+a fast-forward exit clone that bypasses the loop entirely.
+(See IX-4(c) for detailed runtime behavior.)
+
+IV-5. Absorbability Analysis (RPR_ELEM_ABSORBABLE)
+
+Context absorption is an optimization technique that reduces O(n^2) to O(n).
+(Runtime behavior is described in Chapter VIII.)
+
+This phase determines whether the pattern has a structure suitable for the
+absorption optimization and sets flags on the relevant elements:
+
+  RPR_ELEM_ABSORBABLE         Absorption comparison point
+  RPR_ELEM_ABSORBABLE_BRANCH  Element within an absorbable region
+
+Eligibility conditions:
+
+  (1) SKIP PAST LAST ROW (not NEXT ROW)
+  (2) Frame end is UNBOUNDED FOLLOWING
+
+Structural conditions (isUnboundedStart + computeAbsorbabilityRecursive):
+
+  Case 1: Simple VAR+ (e.g., A+)
+          -> ABSORBABLE | ABSORBABLE_BRANCH set on the VAR
+  Case 2: GROUP+ with fixed-length children (min == max, recursively)
+          e.g., (A B)+, (A B{2})+, ((A (B C){2}){2})+
+          -> ABSORBABLE_BRANCH on all elements within the group,
+             ABSORBABLE | ABSORBABLE_BRANCH on END
+
+          Why this is safe: when every child has min == max, the group
+          is semantically equivalent to unrolling its body into {1,1}
+          elements.  E.g., (A B{2})+ behaves like (A B B)+.  Each
+          iteration consumes a fixed number of rows, so an earlier
+          context's count always dominates a later one's (monotonicity).
+
+  Case 3: GROUP+ whose body starts with VAR+ (e.g., (A+ B)+)
+          -> Recurses from BEGIN into the body, applying Case 1.
+            ABSORBABLE | ABSORBABLE_BRANCH set on A.
+            B and END get no flags -> absorption stops once past A.
+
+Absorbability is determined per-element, not per-pattern.
+Absorption comparison is performed only when a state resides at an
+element with the RPR_ELEM_ABSORBABLE flag. Once a state leaves the
+flagged region, absorption is permanently disabled for that state.
+
+Through this mechanism, the runtime guarantees monotonicity:
+"a context that started earlier always subsumes a context that
+started later."
+
+Chapter V  NFA Runtime Data Structures
+============================================================================
+
+V-1. RPRNFAState -- NFA State
+
+A single NFA state represents "how far the pattern has progressed."
+
+  Field         Description
+  -----------------------------------------------------------
+  elemIdx       Index of the current pattern element
+  counts[]      Repetition count per group depth
+  isAbsorbable  Whether the state is in an absorbable region
+  next          Next state in the linked list
+
+The size of the counts array is rpPattern->maxDepth (= maximum group
+nesting depth + 1), allocated as a flexible array member at the end of
+the struct.
+
+Example: In PATTERN ((A B)+ C), a state waiting for B in the 3rd iteration
+
+  Element array: [0:BEGIN(d0) 1:A(d1) 2:B(d1) 3:END(d0) 4:C(d0) 5:FIN]
+
+  elemIdx = 2 (B, depth 1)
+  counts[0] = 2 (depth 0: depth of END. Group completed 2 iterations)
+  counts[1] = 1 (depth 1: depth of B. A matched in current iteration)
+
+  Counts are indexed by depth, not by elemIdx.
+  counts[0] is incremented when passing through END(depth 0),
+  and the group repetition count is preserved even when
+  the state is at B(depth 1).
+
+Definition of two states being "equal":
+
+  Two states are equal if they have the same elemIdx and the same counts
+  up to the depth of that element.
+  nfa_states_equal() compares counts[0..elem->depth] using memcmp.
+  Only counts at or below the depth of the current element are meaningful.
+
+V-2. RPRNFAContext -- Matching Context
+
+A single context represents "a matching attempt started from a specific
+start row."
+
+  Field                 Description
+  ---------------------------------------------------------------------
+  states                Linked list of active NFA states
+  matchStartRow         Row number where matching started
+  matchEndRow           Row number where matching completed
+                        (-1 if incomplete)
+  lastProcessedRow      Last row processed
+  matchedState          State that reached FIN (for greedy fallback)
+  hasAbsorbableState    Whether this context can absorb other contexts
+  allStatesAbsorbable   Whether this context can be absorbed
+  next, prev            Doubly-linked list
+
+Since the NFA is nondeterministic, multiple states can coexist
+simultaneously within a single context.
+
+Example: In PATTERN (A | B) C, if the first row matches both A and B,
+two states coexist within the context:
+
+  State 1: elemIdx=3 (waiting for C, via branch A)
+  State 2: elemIdx=3 (waiting for C, via branch B)
+
+In this case, since the (elemIdx, counts) of the two states are equal,
+nfa_add_state_unique() retains only State 1 (branch A), which was
+added first.
+Because DFS processes the first branch of ALT first, the state via A
+is registered first, and the state via B is discarded as a duplicate.
+This is the preferment guarantee.
+
+V-3. RPR Fields of WindowAggState
+
+  nfaContext / nfaContextTail   Doubly-linked list of active contexts
+  nfaContextFree                Reuse pool for contexts
+  nfaStateFree                  Reuse pool for states
+  nfaVarMatched                 Per-row cache: varMatched[varId]
+  nfaVisitedElems               Bitmap for cycle detection
+  nfaStateSize                  Precomputed size of RPRNFAState
+
+Memory management:
+
+  States and contexts are managed through their own free lists.
+  Instead of palloc, they are obtained from the reuse pool, and
+  returned to the pool upon deallocation.
+  This reduces the overhead of frequent allocation/deallocation.
+
+Chapter VI  NFA Execution: 3-Phase Model
+============================================================================
+
+VI-1. Entry Point and Overall Flow
+
+When the window function processes each row, row_is_in_reduced_frame()
+is called. This function determines whether the current row belongs to
+a matched frame, and if necessary, calls update_reduced_frame() to
+drive the NFA.
+
+Flow of update_reduced_frame():
+
+  (1) Find or create a context for the target row
+  (2) Enter the row processing loop
+  (3) After the loop ends, record the match result
+
+Pseudocode of the row processing loop:
+
+  targetCtx = ExecRPRGetHeadContext(pos)
+  if targetCtx == NULL:
+      targetCtx = ExecRPRStartContext(pos)
+
+  for currentPos = startPos; targetCtx->states != NULL; currentPos++:
+      if not nfa_evaluate_row(currentPos):  -- row does not exist
+          ExecRPRFinalizeAllContexts()      -- finalize all contexts
+          ExecRPRCleanupDeadContexts()      -- clean up after finalization
+          break
+
+      ExecRPRProcessRow(currentPos)         -- 3-phase processing
+      ExecRPRStartContext(currentPos + 1)   -- pre-create next start point
+      ExecRPRCleanupDeadContexts()          -- remove dead contexts
+
+Key point: Processing a single row may require processing multiple rows
+ahead. Due to the nature of window functions, determining the frame for
+row N requires looking at rows beyond N.
+
+VI-2. Context Creation: ExecRPRStartContext()
+
+Creates a new context and performs the initial advance.
+
+  (1) Allocate context via nfa_context_alloc()
+  (2) Set matchStartRow = pos
+  (3) Create initial state: elemIdx=0 (first pattern element),
+      counts=all zero
+  (4) Call nfa_advance(initialAdvance=true)
+
+The initial advance expands epsilon transitions at the beginning of
+the pattern. For example, the initial advance for PATTERN ((A | B) C):
+
+  Start: elemIdx=0 (ALT)
+    -> Expand ALT branches
+      -> elemIdx=1 (A) -- VAR, so add state; stop here
+      -> elemIdx=2 (B) -- VAR, so add state; stop here
+
+  Result: Two states in the context {waiting for A, waiting for B}
+
+During the initial advance, reaching FIN is not recorded as a match.
+This is to prevent empty matches.
+
+VI-3. Row Evaluation: nfa_evaluate_row()
+
+Evaluates all variable conditions in the DEFINE clause at once for
+the current row.
+
+  for each defineClause[i]:
+      result = ExecEvalExpr(defineClause[i])
+      varMatched[i] = (not null and true)
+
+To support row navigation operators (PREV, NEXT, FIRST, LAST),
+a 1-slot model is used: only ecxt_outertuple is set to the current
+row.  Navigation is handled by EEOP_RPR_NAV_SET/RESTORE opcodes
+emitted during DEFINE expression compilation:
+
+  NAV_SET:     save ecxt_outertuple, swap in target row via nav_slot
+  (evaluate):  argument expression reads from swapped slot
+  NAV_RESTORE: restore original ecxt_outertuple
+
+Compound navigation (PREV(FIRST()), NEXT(FIRST()), PREV(LAST()),
+NEXT(LAST())) is flattened by the parser into a single RPRNavExpr
+with a compound kind (RPR_NAV_PREV_FIRST, etc.).  The executor
+computes the target position in two steps: first the inner reference
+point (match_start + N or currentpos - N) with match-range validation,
+then the outer adjustment (+/- M) with partition-range validation.
+If either step is out of range, the result is NULL.
+
+nav_slot caches the last fetched position (nav_slot_pos) to avoid
+redundant tuplestore lookups when multiple navigation calls target
+the same row.
+
+The varMatched array is referenced later in Phase 1 (Match).
+
+VI-4. Per-Context Re-evaluation (match_start-dependent variables)
+
+DEFINE variables that depend on match_start (those containing FIRST,
+LAST-with-offset, or compound PREV_FIRST/NEXT_FIRST/PREV_LAST/NEXT_LAST)
+are identified at plan time via defineMatchStartDependent.  The shared
+evaluation in nfa_evaluate_row() uses the head context's matchStartRow
+for FIRST/LAST base position.
+
+When processing a context whose matchStartRow differs from the shared
+value, nfa_reevaluate_dependent_vars() temporarily sets nav_match_start
+to that context's matchStartRow and re-evaluates only the dependent
+variables.  The original nav_match_start and currentpos are saved and
+restored after re-evaluation.
+
+Summary of evaluation strategy by navigation content:
+
+  Navigation content               evaluation
+  -------------------------------------------------------
+  No navigation                    shared (once per row)
+  PREV/NEXT only                   shared (once per row)
+  LAST (no offset)                 shared (once per row)
+  LAST (with offset)               per-context
+  FIRST (any)                      per-context
+  Compound (inner FIRST)           per-context
+  Compound (inner LAST, no off.)   shared (once per row)
+  Compound (inner LAST, w/off.)    per-context
+
+VI-5. Tuplestore Mark and Trim (nodeWindowAgg.c)
+
+Navigation functions require access to past rows via the tuplestore.
+To allow tuplestore_trim() to free rows that are no longer reachable,
+the planner computes two offsets (see compute_nav_offsets):
+
+  navMaxOffset (Nav Mark Lookback):
+    Maximum backward reach from currentpos.  Contributed by PREV,
+    LAST-with-offset, and compound PREV_LAST/NEXT_LAST.
+    Mark position: currentpos - navMaxOffset.
+
+  navFirstOffset (Nav Mark Lookahead):
+    Minimum forward offset from match_start.  Contributed by FIRST
+    and compound PREV_FIRST/NEXT_FIRST.  Can be negative when
+    compound PREV_FIRST looks before match_start.
+    Mark position: oldest_context->matchStartRow + navFirstOffset.
+
+The actual mark is set to: min(lookback_mark, lookahead_mark).
+This ensures all rows reachable by any navigation function are retained.
+
+When offsets contain non-constant expressions (Param), the planner sets
+navMaxOffsetKind/navFirstOffsetKind to RPR_NAV_OFFSET_NEEDS_EVAL and the
+executor evaluates them at init time.  On overflow, the kind is set to
+RPR_NAV_OFFSET_RETAIN_ALL, disabling trim for that dimension.
+
+VI-6. ExecRPRProcessRow(): 3-Phase Processing
+
+NFA processing for a single row is divided into three phases:
+
+  +--------------------------------------------+
+  |  Phase 1: MATCH (convergence)              |
+  |  Compare the current row against each VAR  |
+  |  state. Remove states that fail to match.  |
+  |                                            |
+  |  Phase 2: ABSORB (absorption)              |
+  |  Merge duplicate contexts to prevent       |
+  |  state explosion.                          |
+  |                                            |
+  |  Phase 3: ADVANCE (expansion)              |
+  |  Expand epsilon transitions to prepare     |
+  |  for the next row.                         |
+  +--------------------------------------------+
+
+This ordering is important:
+
+  - Match executes first to "consume the current row."
+  - Absorb executes immediately after Match, when states have been updated.
+  - Advance executes last to prepare "states waiting for the next row."
+
+Chapter VII  Phase 1: Match
+============================================================================
+
+nfa_match() iterates through each state in the context:
+
+  (1) Check whether the state's elemIdx is a VAR element
+  (2) Compare against the current row using nfa_eval_var_match()
+  (3) Match success: increment repetition count, retain state
+  (4) Match failure: remove state
+
+Match determination (nfa_eval_var_match):
+
+  If varId is within the range of defineVariableList:
+      Use the value of varMatched[varId]
+
+  If varId exceeds the range (variable not defined in DEFINE):
+      Unconditionally true (matches all rows)
+
+Immediate advance for simple VARs:
+
+  For a VAR with min=1, max=1 where the next element is END,
+  the Match phase processes through END immediately.
+  This is necessary for accurate state comparison in Phase 2 (Absorb).
+
+  Example: In PATTERN ((A B)+), when A matches, it immediately advances
+  to B, and when B matches, it immediately advances through END to
+  complete the group count. This enables absorption comparison with
+  other contexts.
+
+Chapter VIII  Phase 2: Absorb (Context Absorption)
+============================================================================
+
+VIII-1. Problem
+
+In the current implementation, a new context is started for each row
+processed.
+Applying PATTERN (A+) to 10 rows produces 10 contexts,
+each of which tracks state independently.
+
+If there are N rows, the total number of states becomes O(N^2):
+
+  Context 1 (started at row 1): can match A up to N times
+  Context 2 (started at row 2): can match A up to N-1 times
+  ...
+  Context N (started at row N): can match A 1 time
+
+VIII-2. Solution: Context Absorption
+
+Key observation: a context started earlier contains
+all matches of a later-started context (monotonicity principle).
+
+If Context 1 started at row 1 and matched A 5 times,
+the state where Context 2 (started at row 2) matched A 4 times
+is already contained within Context 1.
+
+Therefore Context 2 can be "absorbed" into Context 1.
+
+VIII-3. Absorption Conditions
+
+Planner-time prerequisites (all must hold for absorption to be enabled):
+
+  (a) SKIP PAST LAST ROW.  SKIP TO NEXT ROW creates overlapping
+      contexts that cannot be safely absorbed.
+  (b) Unbounded frame (ROWS BETWEEN CURRENT ROW AND UNBOUNDED
+      FOLLOWING).  Limited frames apply differently to each context,
+      breaking the monotonicity principle.
+  (c) No match_start-dependent navigation in DEFINE.
+
+      Mechanism: each context has a different matchStartRow, so FIRST
+      resolves to a different row for each context at the same
+      currentpos.  An earlier context's DEFINE result no longer
+      subsumes a later one's, making count-dominance comparison
+      invalid.  Rather than comparing matchStartRow at runtime
+      (which would complicate the absorb path), any match_start
+      dependency disables absorption entirely.
+
+      Navigation content              match_start dep.  absorption
+      ------------------------------------------------------------
+      No navigation                   none              safe
+      PREV/NEXT only                  none              safe
+      LAST (no offset)                none              safe
+      LAST (with offset)              boundary check    unsafe
+      FIRST (any)                     direct            unsafe
+      Compound (inner FIRST)          direct            unsafe
+      Compound (inner LAST, no off.)  none              safe
+      Compound (inner LAST, w/off.)   boundary chk      unsafe
+
+Runtime conditions (evaluated per context pair):
+
+  (1) The pattern is marked as isAbsorbable (see IV-5)
+  (2) allStatesAbsorbable of the target context is true
+  (3) An earlier context "covers" all states of the target
+
+Cover condition (nfa_states_covered):
+
+  A state with the same elemIdx exists in the earlier context,
+  and the count at that depth is greater than or equal -- then it is covered.
+
+VIII-4. Dual-Flag Design
+
+Two boolean flags make the absorption decision efficient:
+
+  hasAbsorbableState (monotonic: only true->false transition possible)
+    "Does this context have the ability to absorb other contexts?"
+    true if at least one absorbable state exists.
+    Transitions to false when states are removed leaving no absorbable
+    states.
+    Once false, it never becomes true again.
+
+  allStatesAbsorbable (dynamic: can fluctuate)
+    "Can this context be absorbed?"
+    true if all states are in an absorbable region.
+    Becomes false when a non-absorbable state is added; reverts to true
+    when it is removed.
+
+VIII-5. Absorption Order
+
+nfa_absorb_contexts() traverses from tail (newest) to head (oldest).
+
+  for ctx = tail to head:
+      if ctx.allStatesAbsorbable:
+          for older = ctx.prev to head:
+              if older.hasAbsorbableState:
+                  if nfa_states_covered(older, ctx):
+                      free(ctx)  -- absorbed
+                      break
+
+Since inspection starts from the newest context, the most recently started
+(= having the shortest match) context is absorbed first.
+
+Chapter IX  Phase 3: Advance (Epsilon Transition Expansion)
+============================================================================
+
+IX-1. Overview
+
+nfa_advance() expands epsilon transitions from each state after Match,
+generating "new states waiting for the next row."
+
+An epsilon transition is a transition that moves without consuming a row:
+
+  - ALT: branch to each alternative
+  - BEGIN: enter group (or skip if min=0)
+  - END: loop-back within group (or exit when condition is met)
+  - FIN: record match completion
+  - VAR loop/exit: repeat/exit according to the quantifier
+
+Expansion stops upon reaching a VAR element, and the state is added.
+This is because VAR is the element that "will consume the next row."
+
+IX-2. Processing Order: DFS and Preferment
+
+advance processes states in lexicographic order,
+performing Depth-First Search (DFS) on each state.
+
+This DFS order is what guarantees the SQL standard's "preferment":
+
+  The branch that appears first in the PATTERN text takes precedence.
+
+Example: PATTERN (A | B) C
+
+  The first branch A of the ALT takes precedence over the second branch B.
+  When both A and B can match, the match via A is selected.
+
+nfa_add_state_unique() prevents duplicate addition of the same state,
+so the state added first (= from the preferred branch) is retained.
+
+IX-3. Routing Function: nfa_route_to_elem()
+
+All inter-element transitions in the advance phase go through
+nfa_route_to_elem().
+This function branches its behavior based on the type of the next element:
+
+  If the next element is VAR:
+    (1) Add the state to the context (nfa_add_state_unique)
+    (2) If the VAR has min=0, also add a skip path (recurse via next)
+    -> Expansion stops here (VAR is the element that "will consume the next
+       row")
+
+  If the next element is non-VAR (ALT, BEGIN, END, FIN):
+    -> Recursively call nfa_advance_state() to continue expansion
+
+With this structure, advance recursively follows epsilon transitions
+until reaching a VAR, consistently stopping only at VAR elements.
+
+IX-4. Per-Element advance Behavior
+
+(a) ALT (nfa_advance_alt)
+
+  Upon encountering an ALT element, all branches are expanded in order.
+  The first element of each branch is connected via a jump pointer.
+
+  idx=0 (ALT) -> branch 1 start (next) -> branch 2 start (jump) -> ...
+
+  nfa_advance_state() is recursively called for each branch.
+
+(b) BEGIN (nfa_advance_begin)
+
+  Handles group entry.
+  jump points to the element after END (= first element outside the group).
+
+  Greedy (default):
+    (1) Enter the group body (move via next, reset the count at that depth)
+    (2) If min=0, also add a group skip path (move via jump)
+
+  Reluctant:
+    Order reversed -- skip path first, group entry second.
+    If the skip path reaches FIN, the group entry path is not generated
+    (shortest match preferred).
+
+(c) END (nfa_advance_end)
+
+  Handles group termination. This is the core of the repetition logic.
+
+  Let count be the count at the current depth:
+
+  count < min:
+    Loop-back (move via jump, repeat the group body)
+
+    If the RPR_ELEM_EMPTY_LOOP flag is set:
+      In addition to loop-back, also add a fast-forward exit path.
+      This is because the body may produce an empty match, causing count
+      to never reach min. fast-forward resets counts[depth] to 0
+      and exits via next (treating the remaining required iterations
+      as empty matches).
+
+  min <= count < max:
+    Greedy: loop-back first, exit second
+    Reluctant: exit first, loop-back second
+               If the exit path reaches FIN, loop-back is omitted.
+
+  count >= max:
+    Unconditional exit (move via next)
+
+  On exit: reset counts[depth] = 0, and if the next element is an outer END,
+  increment the count at the outer depth.
+
+(d) VAR (nfa_advance_var)
+
+  Handles repeat/exit for a VAR element with a quantifier.
+
+  Let count be the count at the current depth:
+
+  count < min:
+    Unconditional loop (stay at the same elemIdx, wait for the next row)
+
+  min <= count < max:
+    Greedy: loop first, exit (next) second
+    Reluctant: exit first, loop second
+               If the exit path reaches FIN, loop is omitted.
+
+  count >= max:
+    Unconditional exit (move via next)
+
+  On exit: reset counts[depth] = 0.
+
+(e) FIN
+
+  Match success. The current state is moved to matchedState for recording,
+  and matchEndRow is set to the current row.
+
+  Upon reaching FIN, all remaining unprocessed states are removed
+  (early termination). By DFS order, the path that reached FIN first
+  has the highest preferment, so the rest are inferior paths.
+  This is the core mechanism that guarantees preferment.
+
+  In SKIP PAST LAST ROW mode, upon reaching FIN, subsequent contexts
+  that started within the match range are immediately pruned.
+
+IX-5. State Deduplication: nfa_add_state_unique()
+
+When adding a new state to a context, it is compared against existing
+states;
+if an identical state already exists, it is not added.
+
+Comparison criteria: elemIdx + counts[0..elem->depth] (see V-1)
+
+This deduplication is the core mechanism that suppresses NFA state
+explosion.
+Because DFS order causes preferred-branch states to be added first,
+identical states from lower-priority branches are automatically discarded.
+
+IX-6. Cycle Detection: nfaVisitedElems
+
+When a group body can produce an empty match,
+looping back from END may cause an infinite loop.
+
+Example: PATTERN ((A?)*)
+
+  A? has min=0, so it can pass through without matching.
+  If the outer group repeats: BEGIN -> A? skip -> END -> BEGIN -> ...
+
+To prevent this:
+
+  (1) At compile time: set the RPR_ELEM_EMPTY_LOOP flag on the END
+      of groups whose body is nullable.
+      The runtime effect of this flag is described in IX-4(c):
+      when count < min, a fast-forward exit path is added,
+      resolving the deadlock where count cannot increase due to empty
+      matches.
+
+  (2) At runtime: initialize the nfaVisitedElems bitmap immediately before
+      DFS expansion of each state within advance (once per state).
+      During DFS, epsilon elements (END, ALT, BEGIN) are marked in the
+      bitmap at nfa_advance_state entry.  VAR elements are marked later
+      when added to the state list (nfa_add_state_unique), so that
+      legitimate loop-back to the same VAR in a new group iteration
+      (e.g., END -> ALT -> same VAR) is not blocked.
+      If a previously visited elemIdx is revisited, that path is terminated.
+
+  Note: the bitmap tracks only elemIdx and does not consider counts.
+  Therefore, legitimate revisits to the same elemIdx but with different
+  counts may also be blocked.  This only occurs when the group body is
+  nullable (all paths can match empty), causing END -> loop-back ->
+  skip -> END within a single DFS.  In such cases the END element has
+  the RPR_ELEM_EMPTY_LOOP flag, so the fast-forward exit (IX-4(c))
+  provides an alternative path that bypasses the cycle.
+
+Chapter X  Match Result Processing
+============================================================================
+
+X-1. Match Result
+
+RPR tracks the current match result as a single entry in WindowAggState
+with four fields: rpr_match_valid, rpr_match_matched, rpr_match_start,
+and rpr_match_length.  When rpr_match_valid is true, the entry describes
+the match result for the position at rpr_match_start: rpr_match_matched
+indicates success or failure, and rpr_match_length gives the number of
+rows consumed.  A match with rpr_match_length 0 represents an empty match
+(pattern matched but consumed no rows).  When rpr_match_valid is false,
+the position has not been evaluated yet (RF_NOT_DETERMINED).
+
+A row's status against the current match result can be obtained by
+calling get_reduced_frame_status().
+
+X-2. AFTER MATCH SKIP
+
+Determines the starting point for the next match attempt after a successful
+match:
+
+  SKIP TO NEXT ROW:
+    New match attempt begins from the row after the match start row.
+    Overlapping matches are possible.
+
+  SKIP PAST LAST ROW:
+    New match attempt begins from the row after the match end row.
+    Only non-overlapping matches are possible.
+
+X-3. INITIAL vs SEEK
+
+  Standard definition (section 6.12):
+  INITIAL: "is used to look for a match whose first row is R."
+  SEEK:    "is used to permit a search for the first match anywhere
+           from R through the end of the full window frame."
+  In either case, if there is no match, the reduced window frame is empty.
+  The default is INITIAL.
+
+  Current implementation:
+  SEEK is not supported (the parser raises an error).
+  Only INITIAL is supported, searching only for matches starting at each
+  row position pos.
+
+X-4. Bounded Frame Handling
+
+  When the frame is bounded (e.g., ROWS BETWEEN CURRENT ROW AND 5
+  FOLLOWING), ExecRPRProcessRow receives hasLimitedFrame=true and
+  frameOffset indicating the upper bound.  Before the match phase,
+  any context whose match has exceeded the frame boundary
+  (currentPos >= matchStartRow + frameOffset + 1) is finalized early
+  by forcing a mismatch.  This prevents matches from extending beyond
+  the window frame.  The sum is clamped to PG_INT64_MAX on overflow.
+
+  Note that bounded frames also disable context absorption at the
+  planner level (see VIII-3(b)), since the frame boundary breaks the
+  monotonicity assumption required for correct absorption.
+
+Chapter XI  Worked Example: Full Execution Trace
+============================================================================
+
+XI-1. Query
+
+  SELECT company, tdate, price,
+         first_value(price) OVER w AS start_price,
+         last_value(price) OVER w AS end_price
+  FROM stock
+  WINDOW w AS (
+    PARTITION BY company
+    ORDER BY tdate
+    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+    AFTER MATCH SKIP PAST LAST ROW
+    PATTERN (A+ B)
+    DEFINE A AS price > PREV(price),
+           B AS price < PREV(price)
+  );
+
+XI-2. Data
+
+  Row#    tdate       price
+  --------------------------
+  0       2024-01-01  100
+  1       2024-01-02  110
+  2       2024-01-03  120
+  3       2024-01-04  115
+  4       2024-01-05  130
+
+XI-3. Compilation Result
+
+  PATTERN (A+ B) -> unchanged after optimization
+
+  idx  varId  depth  min  max  next  jump
+  -----------------------------------------
+   0   A(0)   0      1    INF  1     -1     A+
+   1   B(1)   0      1    1    2     -1     B
+   2   FIN    0      1    1    -1    -1
+
+  DEFINE: A -> "price > PREV(price)", B -> "price < PREV(price)"
+  isAbsorbable = true (A+ is a simple unbounded VAR)
+
+XI-4. Execution Trace
+
+--- Row 0 (price=100) ---
+
+  update_reduced_frame(0) called.
+
+  Context C0 created (matchStartRow=0).
+  Initial advance: elemIdx=0(A) -> VAR, so state is added.
+  C0.states = [{elemIdx=0, counts=[0]}]
+
+  nfa_evaluate_row(0):
+    A: price(100) > PREV(price) -> no PREV -> false
+    B: price(100) < PREV(price) -> no PREV -> false
+    varMatched = [false, false]
+
+  ExecRPRProcessRow(0):
+    Phase 1 (Match): A(0) state vs varMatched[0]=false -> state removed
+    C0.states = [] (empty)
+
+    Phase 2 (Absorb): skipped (no states)
+    Phase 3 (Advance): skipped (no states)
+
+  C0.states is empty, so the loop terminates.
+  matchEndRow < matchStartRow -> unmatched.
+
+--- Row 1 (price=110) ---
+
+  update_reduced_frame(1) called.
+
+  Context C1 created (matchStartRow=1).
+  Initial advance: C1.states = [{elemIdx=0, counts=[0]}]
+
+  nfa_evaluate_row(1):
+    A: 110 > PREV(100) -> true
+    B: 110 < PREV(100) -> false
+    varMatched = [true, false]
+
+  ExecRPRProcessRow(1):
+    Phase 1 (Match): A(0) match succeeds -> counts[0]++ -> counts=[1]
+    C1.states = [{elemIdx=0, counts=[1]}]
+
+    Phase 3 (Advance):
+      State {elemIdx=0, counts=[1]}: A+ (min=1, count=1, max=INF)
+        count >= min, so:
+        Greedy -> loop first: keep {elemIdx=0, counts=[1]}
+                  exit: reset counts[0]=0, next(=1) -> {elemIdx=1,
+                        counts=[0]}
+    C1.states = [{elemIdx=0, counts=[1]}, {elemIdx=1, counts=[0]}]
+
+--- Row 2 (price=120) ---
+
+  Context C2 created (matchStartRow=2).
+  Initial advance: C2.states = [{elemIdx=0, counts=[0]}]
+
+  nfa_evaluate_row(2):
+    A: 120 > PREV(110) -> true
+    B: 120 < PREV(110) -> false
+    varMatched = [true, false]
+
+  C1 ExecRPRProcessRow(2):
+    Phase 1 (Match):
+      {elemIdx=0, counts=[1]}: A matches -> counts=[2]
+      {elemIdx=1, counts=[0]}: B does not match -> removed
+    C1.states = [{elemIdx=0, counts=[2]}]
+
+  C2 ExecRPRProcessRow(2):
+    Phase 1 (Match):
+      {elemIdx=0, counts=[0]}: A matches -> counts=[1]
+    C2.states = [{elemIdx=0, counts=[1]}]
+
+    Phase 2 (Absorb):
+      Does C1 (started earlier) cover C2?
+        C1: {elemIdx=0, counts=[2]}, C2: {elemIdx=0, counts=[1]}
+        Same elemIdx, C1.counts >= C2.counts -> covered
+      C2 absorbed. -> removed.
+
+    Phase 3 (Advance):
+      {elemIdx=0, counts=[2]}: Greedy -> loop + exit
+        Loop: {elemIdx=0, counts=[2]}
+        Exit: reset counts[0]=0, next(=1) -> {elemIdx=1, counts=[0]}
+    C1.states = [{elemIdx=0, counts=[2]}, {elemIdx=1, counts=[0]}]
+
+  Context C3 created (matchStartRow=3).
+
+--- Row 3 (price=115) ---
+
+  nfa_evaluate_row(3):
+    A: 115 > PREV(120) -> false
+    B: 115 < PREV(120) -> true
+    varMatched = [false, true]
+
+  ExecRPRProcessRow(3):
+    Phase 1 (Match):
+      {elemIdx=0, counts=[2]}: A does not match -> removed
+      {elemIdx=1, counts=[0]}: B matches -> counts=[1]
+    C1.states = [{elemIdx=1, counts=[1]}]
+
+    Phase 3 (Advance):
+      {elemIdx=1, counts=[1]}: B (min=1, max=1)
+        count(1) >= max(1) -> unconditional exit
+        Reset counts[0]=0, next = 2 (FIN)
+      FIN reached -> matchEndRow = 3, matchedState recorded.
+      Early termination: no remaining states, so completed immediately.
+    C1.states = [] (empty after reaching FIN)
+
+  C1.states is empty and matchEndRow=3 >= matchStartRow=1 -> match succeeds.
+
+  rpr_match_start = 1, rpr_match_length = 3
+
+--- Row 4 (price=130) ---
+
+  update_reduced_frame(4) called.
+  C3 was already created but matchStartRow=3, so it is not applicable.
+  New context C4 created (matchStartRow=4).
+
+  nfa_evaluate_row(4):
+    A: 130 > PREV(115) -> true
+    B: 130 < PREV(115) -> false
+
+  ... No subsequent rows, so ExecRPRFinalizeAllContexts() is called.
+  Match incomplete -> unmatched.
+
+XI-5. Final Result
+
+  Row 0: unmatched     -> frame = the row itself
+  Row 1: match head    -> frame = rows 1 through 3
+  Row 2: inside match  -> skipped
+  Row 3: inside match  -> skipped
+  Row 4: unmatched     -> frame = the row itself
+
+Chapter XII  Summary of Key Design Decisions
+============================================================================
+
+XII-1. Flat Array vs Tree-Based NFA
+
+  Choice: Flat array (RPRPatternElement[])
+
+  Rationale:
+  - Cache-friendly: 16-byte fixed size, contiguous memory
+  - Index-based references: 2-byte indices instead of pointers
+  - Easy to serialize: can use memcpy when passing to plan nodes
+
+XII-2. Forward-only Execution vs Backtracking
+
+  Choice: Forward-only (state set tracking)
+
+  Rationale:
+  - Backtracking takes exponential time in the worst case
+  - NFA simulation guarantees polynomial time
+  - DFS order naturally guarantees preferment.
+    Greedy/reluctant per quantifier requires only reversing the DFS order
+  - Window functions receive sorted rows sequentially.
+    Forward-only fits directly into this pipeline,
+    whereas backtracking requires re-fetching previous rows
+  - DEFINE conditions are SQL expressions (PREV, RUNNING aggregates, etc.)
+    with high re-evaluation cost. Forward-only requires only one evaluation
+    per row
+
+XII-3. Per-Context Management
+
+  Choice: Independent context per start row
+
+  Rationale:
+  - Supports overlapping matches under SKIP TO NEXT ROW
+  - Determines the frame for each row independently
+  - Absorption optimization can eliminate redundant contexts in O(n)
+
+XII-4. Memory Pool Management
+
+  Choice: Custom free list
+
+  Rationale:
+  - NFA states are created and destroyed in large numbers per row
+  - Avoids palloc/pfree overhead
+  - State size is variable (counts[] array), but within a single query
+    maxDepth is fixed, so all states have the same size
+
+XII-5. Execution Optimization Summary
+
+  The following optimizations make the NFA simulation practical.
+
+  -- Compile-time --
+
+  (1) AST Optimization (IV-3)
+
+    Simplifies the AST before converting the pattern to an NFA.
+    Reduces the number of NFA elements through consecutive variable
+    merging (A A -> A{2}), SEQ flattening, quantifier multiplication,
+    and other transformations.
+
+    Significance: Reducing the element count directly shrinks the state
+    space, decreasing the cost of all subsequent runtime phases (match,
+    absorb, advance).
+
+  -- Runtime: advance phase --
+
+  (2) Group Skip (IX-4(b))
+
+    At the BEGIN of a group with min=0, uses jump to skip the entire
+    group. Moves directly to the first element outside the group without
+    exploring the group body. Greedy enters then skips; Reluctant skips
+    then enters.
+
+    Significance: For optional groups (min=0), immediately generates
+    a skip path without exploring the body, avoiding unnecessary DFS
+    expansion.
+
+  (3) State Deduplication (IX-5)
+
+    During advance, DFS may generate states with the same (elemIdx,
+    counts) combination through multiple paths. Additionally, for
+    group absorption, nfa_match performs inline advance from bounded
+    VARs (count >= max) within the absorbable region (ABSORBABLE_BRANCH)
+    through END chains to reach the judgment point (ABSORBABLE END).
+    This process can also produce duplicate states reaching the same END.
+    nfa_add_state_unique() blocks duplicate addition of identical states
+    in both cases.
+
+    Significance: Prevents exponential growth of the state count in
+    ALT branches and quantifier expansion. Since DFS order causes the
+    preferred branch's state to be registered first, identical states
+    from lower-priority branches are automatically discarded, thereby
+    also guaranteeing preferment.
+
+  (4) Cycle Detection and Fast-Forward (IX-6, IX-4(c))
+
+    When a nullable group body (e.g., A?) repeats empty matches,
+    the END -> BEGIN loop-back can continue indefinitely.
+
+    Two mechanisms resolve this:
+    - A visited bitmap (nfaVisitedElems) blocks revisitation of the
+      same element, preventing infinite loops (safety)
+    - At an END with the RPR_ELEM_EMPTY_LOOP flag set, when
+      count < min, the remaining required iterations are treated as
+      empty matches and a fast-forward exit path out of the group is
+      added (correctness)
+
+    Significance: Cycle detection guarantees termination, and
+    fast-forward guarantees that the min condition is satisfied.
+    Without these, patterns containing nullable groups would fall
+    into infinite loops or fail to match.
+
+  (5) Match Pruning (IX-4(e))
+
+    When a state reaches FIN during advance, all remaining unprocessed
+    states of that context are removed. Because of DFS order, the path
+    that reaches FIN first has the highest preferment, so the remaining
+    paths are inferior.
+
+    Significance: Once the best match is determined, exploration of
+    inferior paths is immediately terminated. This mechanism achieves
+    both preferment guarantees and performance optimization.
+
+  -- Runtime: inter-context --
+
+  (6) Early Termination (SKIP PAST LAST ROW)
+
+    In SKIP PAST LAST ROW mode, when a match is found, subsequent
+    contexts whose start rows fall within the match range are pruned
+    immediately without further processing.
+    In SKIP TO NEXT ROW mode, overlapping contexts are preserved
+    because each row requires its own independent match.
+
+    Significance: Prunes subsequent contexts whose start rows overlap
+    with a prior match range, avoiding unnecessary processing.
+
+  (7) Context Absorption (Chapter VIII)
+
+    If an independent context is created for each row, O(n^2) states
+    accumulate. By exploiting the monotonicity that an earlier-started
+    context subsumes the states of a later-started context, redundant
+    contexts are eliminated early.
+
+    Absorbability is determined per-element; comparison is performed
+    only at elements with the RPR_ELEM_ABSORBABLE flag (see IV-5).
+
+    Significance: Keeps the number of active contexts at a constant
+    level, achieving O(n^2) -> O(n) time complexity. Without this,
+    performance degrades sharply on long partitions.
+
+Appendix A. Key Function Index
+============================================================================
+
+  Function                      File                  Role
+  --------------------------------------------------------------------------
+  transformRPR                  parse_rpr.c           Parser entry point
+  transformDefineClause         parse_rpr.c           DEFINE transformation
+  collectPatternVariables       rpr.c                 Variable collection
+  buildDefineVariableList       rpr.c                 DEFINE variable list
+  buildRPRPattern               rpr.c                 NFA compilation main
+  optimizeRPRPattern            rpr.c                 AST optimization
+  fillRPRPattern                rpr.c                 NFA element generation
+  finalizeRPRPattern            rpr.c                 Finalization
+  computeAbsorbability          rpr.c                 Absorption analysis
+  update_reduced_frame          nodeWindowAgg.c       Execution main loop
+  nfa_evaluate_row              nodeWindowAgg.c       DEFINE evaluation
+  ExecRPRStartContext           execRPR.c             Context creation
+  ExecRPRProcessRow             execRPR.c             3-phase processing
+  nfa_match                     execRPR.c             Phase 1
+  nfa_absorb_contexts           execRPR.c             Phase 2
+  nfa_advance                   execRPR.c             Phase 3
+  nfa_advance_state             execRPR.c             Per-state branching
+  nfa_route_to_elem             execRPR.c             Element routing
+  nfa_advance_alt               execRPR.c             ALT handling
+  nfa_advance_begin             execRPR.c             BEGIN handling
+  nfa_advance_end               execRPR.c             END handling
+  nfa_advance_var               execRPR.c             VAR handling
+  nfa_add_state_unique          execRPR.c             Deduplication
+  nfa_states_covered            execRPR.c             Absorption check
+  nfa_reevaluate_dependent_vars execRPR.c             Per-context re-eval
+  ExecRPRGetHeadContext         execRPR.c             Context lookup
+  ExecRPRFreeContext            execRPR.c             Context deallocation
+  ExecRPRCleanupDeadContexts    execRPR.c             Dead context cleanup
+  ExecRPRFinalizeAllContexts    execRPR.c             Partition-end finalize
+  ExecRPRRecordContextSuccess   execRPR.c             Stats: match success
+  ExecRPRRecordContextFailure   execRPR.c             Stats: match failure
+  compute_nav_offsets           createplan.c          Trim offset computation
+
+Appendix B. Data Structure Relationship Diagram
+============================================================================
+
+  Parser Layer
+  --------
+  RPCommonSyntax
+    |--- rpSkipTo: RPSkipTo
+    |--- initial: bool
+    +--- rpPattern: RPRPatternNode* (tree)
+         |--- nodeType: VAR | SEQ | ALT | GROUP
+         |--- min, max: quantifier
+         |--- varName: variable name (VAR only)
+         +--- children: List* (SEQ/ALT/GROUP only)
+
+  Planner Layer
+  ----------
+  WindowAgg (plan node)
+    |--- rpSkipTo: RPSkipTo
+    |--- defineClause: List<TargetEntry>
+    +--- rpPattern: RPRPattern*
+         |--- numVars: int
+         |--- varNames: char**
+         |--- maxDepth: RPRDepth
+         |--- isAbsorbable: bool
+         |--- numElements: int
+         +--- elements: RPRPatternElement[]  (flat array)
+              |--- varId      (1B)
+              |--- depth      (1B)
+              |--- flags      (1B)
+              |--- reserved   (1B)
+              |--- min, max   (4B + 4B)
+              +--- next, jump (2B + 2B)
+
+  Executor Layer
+  ----------
+  WindowAggState
+    |--- rpSkipTo: RPSkipTo (AFTER MATCH SKIP mode)
+    |--- rpPattern: RPRPattern* (copied from plan)
+    |--- defineVariableList: List<String> (variable names, DEFINE order)
+    |--- defineClauseList: List<ExprState>
+    |--- nfaVarMatched: bool[] (per-row cache)
+    |--- nfaVisitedElems: bitmapword* (cycle detection)
+    |--- nfaStateSize: Size (pre-calculated RPRNFAState allocation size)
+    |--- nfaContext <-> nfaContextTail (doubly-linked list)
+    |   +--- RPRNFAContext
+    |       |--- states: RPRNFAState* (linked list)
+    |       |   |--- elemIdx
+    |       |   |--- counts[]
+    |       |   +--- isAbsorbable
+    |       |--- matchStartRow, matchEndRow
+    |       |--- lastProcessedRow
+    |       |--- matchedState (cloned on FIN arrival)
+    |       |--- hasAbsorbableState
+    |       +--- allStatesAbsorbable
+    |--- nfaContextFree (recycling pool)
+    +--- nfaStateFree (recycling pool)
+
+Appendix C. NFA Element Array Examples
+============================================================================
+
+C-1. PATTERN (A B C)
+
+  idx  varId  depth  min  max  next  jump
+  ------------------------------------------
+   0   A      0      1    1    1     -1
+   1   B      0      1    1    2     -1
+   2   C      0      1    1    3     -1
+   3   FIN    0      1    1    -1    -1
+
+C-2. PATTERN (A+ B*)
+
+  idx  varId  depth  min  max  next  jump  flags
+  ------------------------------------------------------------------------
+   0   A      0      1    INF  1     -1    ABSORBABLE | ABSORBABLE_BRANCH
+   1   B      0      0    INF  2     -1
+   2   FIN    0      1    1    -1    -1
+
+  Only A+ is the absorption point (Case 1). Once past A,
+  absorption is permanently disabled for that state.
+
+C-3. PATTERN (A | B | C)
+
+  idx  varId  depth  min  max  next  jump
+  ----------------------------------------
+   0   ALT    0      1    1    1     -1    alternation start
+   1   A      1      1    1    4     2     branch 1 -> FIN, jump -> branch 2
+   2   B      1      1    1    4     3     branch 2 -> FIN, jump -> branch 3
+   3   C      1      1    1    4     -1    branch 3 -> FIN
+   4   FIN    0      1    1    -1    -1
+
+C-4. PATTERN ((A B)+ C)
+
+  idx  varId    depth  min  max  next  jump  flags
+  --------------------------------------------------------------------------
+   0   BEGIN    0      1    INF  1     4     ABSORBABLE_BRANCH
+   1   A        1      1    1    2     -1    ABSORBABLE_BRANCH
+   2   B        1      1    1    3     -1    ABSORBABLE_BRANCH
+   3   END      0      1    INF  4     1     ABSORBABLE | ABSORBABLE_BRANCH
+   4   C        0      1    1    5     -1
+   5   FIN      0      1    1    -1    -1
+
+  Case 2: GROUP+ with {1,1} body VARs. A, B are branches;
+  END is the absorption point. Compare with C-6 (Case 3).
+
+C-5. PATTERN ((A | B)+? C)
+
+  idx  varId    depth  min  max   next  jump  flags
+  -------------------------------------------------------------------
+   0   BEGIN    0      1    INF   1     5     RELUCTANT, group start
+   1   ALT      1      1    1     2     -1    alternation start
+   2   A        2      1    1     4     3     branch 1
+   3   B        2      1    1     4     -1    branch 2
+   4   END      0      1    INF   5     1     RELUCTANT, group end
+   5   C        0      1    1     6     -1
+   6   FIN      0      1    1     -1    -1
+
+C-6. PATTERN ((A+ B)+ C)  -- Absorbability flag example
+
+  idx  varId    depth  min  max   next  jump  flags
+  ---------------------------------------------------------------------------
+   0   BEGIN    0      1    INF   1     4     ABSORBABLE_BRANCH, group start
+   1   A        1      1    INF   2     -1    ABSORBABLE | ABSORBABLE_BRANCH
+   2   B        1      1    1     3     -1
+   3   END      0      1    INF   4     1     group end
+   4   C        0      1    1     5     -1
+   5   FIN      0      1    1     -1    -1
+
+  Recurses from BEGIN into the body -> A matches Case 1 (simple VAR+).
+  A gets ABSORBABLE | ABSORBABLE_BRANCH, BEGIN gets ABSORBABLE_BRANCH.
+  B and END get no flags -> absorption stops once the state advances to B.
+  (See IV-5 Case 3)
+
+C-7. PATTERN ((A+ B | C*)+ D)  -- Per-branch absorption in ALT
+
+  idx  varId    depth  min  max   next  jump  flags
+  ---------------------------------------------------------------------------
+   0   BEGIN    0      1    INF   1     6     ABSORBABLE_BRANCH
+   1   ALT      1      1    1     2     -1    ABSORBABLE_BRANCH
+   2   A        2      1    INF   3     4     ABSORBABLE | ABSORBABLE_BRANCH
+   3   B        2      1    1     5     -1
+   4   C        2      0    INF   5     -1    ABSORBABLE | ABSORBABLE_BRANCH
+   5   END      0      1    INF   6     1     EMPTY_LOOP
+   6   D        0      1    1     7     -1
+   7   FIN      0      1    1     -1    -1
+
+  ALT branches are checked independently for absorbability.
+  Branch 1: A+ matches Case 1 -> A gets ABSORBABLE. B has no flag.
+  Branch 2: C* matches Case 1 -> C gets ABSORBABLE.
+  Both A and C get ABSORBABLE_BRANCH as part of their respective branch
+  paths.
+  END has EMPTY_LOOP: branch 2 (C*) is nullable, making the group body
+  nullable.
+  BEGIN and ALT get ABSORBABLE_BRANCH (on the path to absorbable elements).
+
+============================================================================
+  End of document
+============================================================================
diff --git a/src/backend/executor/execRPR.c b/src/backend/executor/execRPR.c
index 15e439daaae..96cfbfbc0a2 100644
--- a/src/backend/executor/execRPR.c
+++ b/src/backend/executor/execRPR.c
@@ -30,1584 +30,8 @@
 #include "utils/memutils.h"
 
 /*
- * ============================================================================
- *   PostgreSQL Row Pattern Recognition: Flat-Array Stream NFA Guide
- * ============================================================================
- *
- *   Target audience: Developers with a basic understanding of the PostgreSQL
- *                    executor and planner architecture
- *
- *   Scope: The entire process from PATTERN/DEFINE clause parsing to NFA
- *          runtime execution
- *
- *   Related code:
- *     - src/backend/parser/parse_rpr.c          (parser phase)
- *     - src/backend/optimizer/plan/rpr.c        (optimizer phase)
- *     - src/backend/executor/nodeWindowAgg.c    (executor phase, window agg)
- *     - src/backend/executor/execRPR.c          (executor phase, NFA engine)
- *     - src/include/executor/execRPR.h          (NFA public API)
- *     - src/include/nodes/plannodes.h           (plan node definitions)
- *     - src/include/nodes/execnodes.h           (execution state definitions)
- *     - src/include/optimizer/rpr.h             (types and constants)
- *     - src/backend/optimizer/plan/createplan.c (nav offset computation)
- *
- * ============================================================================
- *
- * What is a Flat-Array Stream NFA?
- *
- *   The NFA in this implementation is not a traditional state-transition graph
- *   but a flat array of fixed-size 16-byte elements. At runtime, it processes
- *   the row stream in a forward-only manner, expanding epsilon transitions
- *   eagerly without backtracking.
- *
- *   - Flat-Array: Pattern compiled into a flat array,
- *                 not a graph (Chapter IV)
- *   - Stream:     Rows consumed sequentially in one direction,
- *                 never revisited (Chapter XII)
- *   - NFA:        Nondeterministic execution where multiple states
- *                 coexist within a single context (Chapter VI)
- *
- * Chapter I  Row Pattern Recognition Overview
- * ============================================================================
- *
- * Row Pattern Recognition (hereafter RPR) is a feature introduced in SQL:2016
- * that matches regex-based patterns against ordered row sets.
- *
- * The SQL standard defines two forms:
- *
- *   Feature R010: MATCH_RECOGNIZE (FROM clause)
- *     - Dedicated table operator
- *     - Provides dedicated functions such as MATCH_NUMBER(), CLASSIFIER()
- *     - Supports ONE ROW PER MATCH / ALL ROWS PER MATCH
- *
- *   Feature R020: RPR in a window (WINDOW clause)
- *     - Integrated into the existing window function framework
- *     - Supports ALL ROWS PER MATCH only
- *     - No MATCH_NUMBER()
- *
- * This implementation targets Feature R020.
- *
- * The basic syntax is as follows:
- *
- *   SELECT ...
- *   OVER (
- *     PARTITION BY ...
- *     ORDER BY ...
- *     ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- *     [INITIAL | SEEK]   -- SEEK is defined in the standard but not implemented
- *     AFTER MATCH SKIP TO NEXT ROW | SKIP PAST LAST ROW
- *     PATTERN ( <regex> )
- *     DEFINE <variable> AS <condition>, ...
- *   )
- *
- * The PATTERN clause is a regular expression over row pattern variables.
- * The DEFINE clause specifies boolean conditions that determine whether each
- * variable evaluates to true for the current row.
- *
- * Example:
- *
- *   PATTERN (A+ B)
- *   DEFINE A AS price > PREV(price),
- *          B AS price < PREV(price)
- *
- * This pattern matches "a span where prices rise consecutively then drop."
- *
- * Chapter II  Overall Processing Pipeline
- * ============================================================================
- *
- * RPR processing is divided into three phases:
- *
- *   +------------------------------------------------------------+
- *   |  1. Parsing (Parser)                                       |
- *   |     SQL text -> PATTERN AST + DEFINE expression tree       |
- *   |                                                            |
- *   |  2. Compilation (Optimizer/Planner)                        |
- *   |     PATTERN AST -> optimization -> flat NFA element array  |
- *   |                                                            |
- *   |  3. Execution (Executor)                                   |
- *   |     Row-by-row matching via NFA simulation                 |
- *   +------------------------------------------------------------+
- *
- * Each phase uses independent data structures, and the interfaces between
- * phases are well-defined:
- *
- *   Parser -> Planner:    WindowClause.rpPattern (RPRPatternNode tree)
- *                         WindowClause.defineClause (TargetEntry list)
- *
- *   Planner -> Executor:  WindowAgg.rpPattern (RPRPattern struct)
- *                         WindowAgg.defineClause (TargetEntry list)
- *
- * Chapter III  Parsing Phase
- * ============================================================================
- *
- * III-1. Entry Point
- *
- *   transformWindowDefinitions() (parse_clause.c)
- *     +-- transformRPR() (parse_rpr.c)
- *
- * transformRPR() is invoked when RPCommonSyntax is present and performs the
- * following:
- *
- *   (1) Frame option validation
- *       - Only ROWS is allowed (RANGE, GROUPS are not)
- *       - The start boundary must be CURRENT ROW
- *       - EXCLUDE option is not allowed
- *
- *   (2) Transcription to WindowClause
- *       - Copies rpPattern, rpSkipTo, initial fields
- *
- *   (3) DEFINE clause transformation (transformDefineClause)
- *
- * III-2. PATTERN AST
- *
- * The parser transforms the PATTERN clause into an RPRPatternNode tree.
- * Each node has one of the following four types:
- *
- *   RPR_PATTERN_VAR    Variable reference. Name stored in varName field.
- *   RPR_PATTERN_SEQ    Concatenation. Children node list in children.
- *   RPR_PATTERN_ALT    Alternation. Branch node list in children.
- *   RPR_PATTERN_GROUP  Group (parentheses). Body node list in children.
- *
- * All nodes have min/max fields to express quantifiers:
- *
- *   A       -> VAR(A, min=1, max=1)
- *   A+      -> VAR(A, min=1, max=INF)
- *   A*      -> VAR(A, min=0, max=INF)
- *   A?      -> VAR(A, min=0, max=1)
- *   A{3,5}  -> VAR(A, min=3, max=5)
- *
- * If the reluctant field is true, the quantifier is reluctant (non-greedy).
- * (RPRPatternNode.reluctant is bool; reluctant_location is the separate
- * ParseLoc field holding the '?' token position, or -1 if absent.)
- *
- * Example: PATTERN ((A+ B) | C*)
- *
- *   ALT
- *   +-- SEQ
- *   |   +-- VAR(A, 1, INF)
- *   |   +-- VAR(B, 1, 1)
- *   +-- VAR(C, 0, INF)
- *
- * III-3. DEFINE Clause Transformation
- *
- * transformDefineClause() processes each DEFINE variable as follows:
- *
- *   (1) Checks for duplicate variable names
- *   (2) Transforms the expression via transformExpr()
- *   (3) Extracts Var nodes via pull_var_clause() and ensures each is
- *       present in the query targetlist, so the planner propagates the
- *       referenced columns through the plan tree
- *   (4) Wraps in a TargetEntry with the variable name set in resname
- *
- * After all variables are processed:
- *   (5) Coerces each expression to Boolean type (coerce_to_boolean)
- *
- * Variables that are used in PATTERN but not defined in DEFINE are implicitly
- * evaluated as TRUE (matching all rows).
- *
- * Chapter IV  Compilation Phase
- * ============================================================================
- *
- * IV-1. Entry Point
- *
- *   create_windowagg_plan() (createplan.c)
- *     +-- buildDefineVariableList()    Build variable name list from DEFINE
- *     +-- buildRPRPattern()           NFA compilation (6 phases)
- *
- * IV-2. The 6 Phases of buildRPRPattern()
- *
- *   Phase 1: AST optimization (optimizeRPRPattern)
- *   Phase 2: Statistics collection (scanRPRPattern)
- *   Phase 3: Memory allocation (allocateRPRPattern)
- *   Phase 4: NFA element fill (fillRPRPattern)
- *   Phase 5: Finalization (finalizeRPRPattern)
- *   Phase 6: Absorbability analysis (computeAbsorbability)
- *
- * IV-3. Phase 1: AST Optimization
- *
- * After copying the parser-generated AST, the following optimizations are
- * applied:
- *
- *   (a) SEQ flattening: Unwrap nested SEQ nodes
- *       SEQ(A, SEQ(B, C)) -> SEQ(A, B, C)
- *
- *   (b) Consecutive variable merging: Merge consecutive occurrences of the
- *       same variable into a single quantifier
- *       A A -> A{2}
- *       A{2,3} A{1,2} -> A{3,5}
- *
- *   (c) Consecutive group merging: Merge repeated identical groups
- *       (A B)+ (A B)+ -> (A B){2,INF}
- *
- *   (d) Consecutive ALT merging: Merge repeated identical ALT nodes
- *       (A | B) (A | B) (A | B) -> (A | B){3}
- *
- *   (e) Prefix/suffix absorption: Absorb identical sequences before/after
- *       a group
- *       A B (A B)+ -> (A B){2,INF}
- *
- *   (f) ALT flattening and deduplication
- *       (A | (B | C)) -> (A | B | C)
- *       (A | B | A) -> (A | B)
- *
- *   (g) Quantifier multiplication: Collapse nested quantifiers when safe
- *       (A+)+ -> A+
- *       (A{2,3}){5} -> A{10,15}
- *
- *   (h) Single-child unwrap
- *       SEQ(A) -> A,  (A){1,1} -> A
- *
- * IV-4. Phase 4: NFA Element Array Generation
- *
- * Transforms the optimized AST into a flat array of RPRPatternElement.
- * This is the core data structure used for NFA simulation at runtime.
- *
- * RPRPatternElement struct (16 bytes):
- *
- *   Field      Size     Description
- *   ---------------------------------------------------------
- *   varId      1B      Variable ID (0-251) or control code (252-255)
- *   depth      1B      Group nesting depth
- *   flags      1B      Bit flags (see below)
- *   reserved   1B      Padding
- *   min        4B      Quantifier lower bound
- *   max        4B      Quantifier upper bound
- *   next       2B      Next element index (sequential flow)
- *   jump       2B      Branch target index (for ALT/GROUP)
- *
- * Control codes:
- *
- *   RPR_VARID_BEGIN (252)  Group start marker
- *   RPR_VARID_END   (253)  Group end marker
- *   RPR_VARID_ALT   (254)  Alternation start marker
- *   RPR_VARID_FIN   (255)  Pattern completion marker
- *
- * Element flags (1 byte, bitmask):
- *
- *   0x01  RPR_ELEM_RELUCTANT          (VAR, BEGIN, END)
- *         Non-greedy quantifier.  Prefers shorter match: try exit-loop
- *         first, then repeat.  Set on VAR for simple (A+?),
- *         on BEGIN+END for group ((...)+?).
- *
- *   0x02  RPR_ELEM_EMPTY_LOOP         (END)
- *         Group body can produce empty match (all children nullable).
- *         Creates a fast-forward exit clone alongside the normal
- *         loop-back so cycle detection doesn't kill legitimate
- *         matches. (IV-4b)
- *
- *   0x04  RPR_ELEM_ABSORBABLE_BRANCH  (VAR, BEGIN, END, ALT)
- *         Element lies within an absorbable region.  Used at runtime
- *         to track whether the current NFA state is in an absorbable
- *         context.
- *
- *   0x08  RPR_ELEM_ABSORBABLE         (VAR, END)
- *         Absorption judgment point.  Where to compare consecutive
- *         iterations for absorption.
- *           - Simple unbounded VAR (A+): set on the VAR itself
- *           - Unbounded GROUP ((A B)+): set on the END element only
- *
- *   Accessor macros:
- *     RPRElemIsReluctant(e)        (e)->flags & 0x01
- *     RPRElemCanEmptyLoop(e)       (e)->flags & 0x02
- *     RPRElemIsAbsorbableBranch(e) (e)->flags & 0x04
- *     RPRElemIsAbsorbable(e)       (e)->flags & 0x08
- *
- * Example: PATTERN (A+ B | C)
- *
- *   AST: ALT(SEQ(VAR(A,1,INF), VAR(B,1,1)), VAR(C,1,1))
- *
- *   Compilation result:
- *
- *   idx  varId  depth  min  max  next  jump  Description
- *   ------------------------------------------------------------
- *    0   ALT    0      1    1    1     -1    Alternation start
- *    1   A(0)   1      1    INF  2     3     Branch 1: A+
- *    2   B(1)   1      1    1    4     -1    Branch 1: B -> FIN
- *    3   C(2)   1      1    1    4     -1    Branch 2: C -> FIN
- *    4   FIN    0      1    1    -1    -1    Pattern completion
- *
- *   - idx 0: ALT marker. next(=1) is the start of the first branch
- *   - idx 1: Variable A. next(=2) is B, jump(=3) is the start of the second
- *            branch
- *   - idx 2: Variable B. next(=4) is FIN
- *   - idx 3: Variable C. next(=4) is FIN
- *   - idx 4: FIN marker. Match completion signal
- *
- * Roles of next and jump:
- *
- *   - next: The next element to move to "after consuming" the current element.
- *           For VAR, the next position after a successful match.
- *           For BEGIN/END, the next position inside/outside the group.
- *
- *   - jump: The element to "skip to."
- *           In ALT, a jump from one branch to the next branch.
- *           In BEGIN, a skip path to END+1 (for groups with min=0).
- *           In END, a loop-back to the start of the group body.
- *
- * Example: PATTERN ((A B)+)
- *
- *   idx  varId    depth  min  max  next  jump  Description
- *   --------------------------------------------------------------
- *    0   BEGIN    0      1    INF  1     4     Group start
- *    1   A(0)     1      1    1    2     -1    A
- *    2   B(1)     1      1    1    3     -1    B
- *    3   END      0      1    INF  4     1     Group end
- *    4   FIN      0      1    1    -1    -1    Pattern completion
- *
- *   - idx 0: BEGIN. next(=1) enters the group body.
- *            jump(=4) skips to after END = FIN (used when min=0).
- *   - idx 3: END. next(=4) exits the group.
- *            jump(=1) loops back to the start of the group body.
- *
- * IV-4a. Reluctant Flag (RPR_ELEM_RELUCTANT)
- *
- * The reluctant flag is set during Phase 4 (fillRPRPattern) when the AST node
- * has reluctant == true. It reverses the priority of quantifier expansion at
- * runtime:
- *
- *   Greedy (default):  try loop-back first, then exit  (prefer longer match)
- *   Reluctant:         try exit first, then loop-back   (prefer shorter match)
- *
- * The flag is set on all elements that carry the quantifier:
- *
- *   Simple VAR (A+?):     RPR_ELEM_RELUCTANT on the VAR element
- *   Group ((...)+?):      RPR_ELEM_RELUCTANT on BEGIN and END elements
- *
- * At runtime (nfa_advance), the flag controls DFS exploration order:
- *
- *   VAR with quantifier:
- *     Greedy:    primary path = next (continue), clone = jump (skip)
- *     Reluctant: primary path = jump (skip), clone = next (continue)
- *
- *   END element:
- *     Greedy:    primary path = jump (loop-back), clone = next (exit)
- *     Reluctant: primary path = next (exit), clone = jump (loop-back)
- *
- *   BEGIN with min=0:
- *     Greedy:    primary path = next (enter group), clone = jump (skip)
- *     Reluctant: primary path = jump (skip), clone = next (enter group)
- *
- * The absorption optimization requires greedy quantifiers. Reluctant
- * quantifiers are excluded from absorbability analysis (see IV-5).
- *
- * IV-4b. Empty Loop Flag (RPR_ELEM_EMPTY_LOOP)
- *
- * The empty-loop flag is set during Phase 4 (fillRPRPatternGroup) on the END
- * element when the group body is nullable -- i.e., every path through the
- * body can match zero rows (all children are nullable).
- *
- * Example patterns that trigger this flag:
- *
- *   (A?)*    A is nullable (min=0), so group body is nullable -> END gets flag
- *   (A? B?)+ Both children nullable -> body nullable -> END gets flag
- *   (A | B*) B* is nullable, making the ALT nullable -> END gets flag
- *
- * The flag works in conjunction with the empty match cycle detection
- * (elemIdx visited bitmap). Without this flag, cycle detection alone would
- * cause legitimate matches to fail.
- *
- * Problem example: (A*){2,3}
- *   - Iteration 1: A* consumes all available rows -> count=1, END reached
- *   - Loop-back for iteration 2: A* matches zero rows -> END reached again
- *   - Cycle detection sees the same elemIdx on the same row -> state killed
- *   - count never reaches min(2) -> match fails (incorrect)
- *
- * With the RPR_ELEM_EMPTY_LOOP flag, nfa_advance_end creates two paths:
- * the normal loop-back (which cycle detection will eventually kill) and
- * a fast-forward exit clone that bypasses the loop entirely.
- * (See IX-4(c) for detailed runtime behavior.)
- *
- * IV-5. Absorbability Analysis (RPR_ELEM_ABSORBABLE)
- *
- * Context absorption is an optimization technique that reduces O(n^2) to O(n).
- * (Runtime behavior is described in Chapter VIII.)
- *
- * This phase determines whether the pattern has a structure suitable for the
- * absorption optimization and sets flags on the relevant elements:
- *
- *   RPR_ELEM_ABSORBABLE         Absorption comparison point
- *   RPR_ELEM_ABSORBABLE_BRANCH  Element within an absorbable region
- *
- * Eligibility conditions:
- *
- *   (1) SKIP PAST LAST ROW (not NEXT ROW)
- *   (2) Frame end is UNBOUNDED FOLLOWING
- *
- * Structural conditions (isUnboundedStart + computeAbsorbabilityRecursive):
- *
- *   Case 1: Simple VAR+ (e.g., A+)
- *           -> ABSORBABLE | ABSORBABLE_BRANCH set on the VAR
- *   Case 2: GROUP+ with fixed-length children (min == max, recursively)
- *           e.g., (A B)+, (A B{2})+, ((A (B C){2}){2})+
- *           -> ABSORBABLE_BRANCH on all elements within the group,
- *              ABSORBABLE | ABSORBABLE_BRANCH on END
- *
- *           Why this is safe: when every child has min == max, the group
- *           is semantically equivalent to unrolling its body into {1,1}
- *           elements.  E.g., (A B{2})+ behaves like (A B B)+.  Each
- *           iteration consumes a fixed number of rows, so an earlier
- *           context's count always dominates a later one's (monotonicity).
- *
- *   Case 3: GROUP+ whose body starts with VAR+ (e.g., (A+ B)+)
- *           -> Recurses from BEGIN into the body, applying Case 1.
- *             ABSORBABLE | ABSORBABLE_BRANCH set on A.
- *             B and END get no flags -> absorption stops once past A.
- *
- * Absorbability is determined per-element, not per-pattern.
- * Absorption comparison is performed only when a state resides at an
- * element with the RPR_ELEM_ABSORBABLE flag. Once a state leaves the
- * flagged region, absorption is permanently disabled for that state.
- *
- * Through this mechanism, the runtime guarantees monotonicity:
- * "a context that started earlier always subsumes a context that
- * started later."
- *
- * Chapter V  NFA Runtime Data Structures
- * ============================================================================
- *
- * V-1. RPRNFAState -- NFA State
- *
- * A single NFA state represents "how far the pattern has progressed."
- *
- *   Field         Description
- *   -----------------------------------------------------------
- *   elemIdx       Index of the current pattern element
- *   counts[]      Repetition count per group depth
- *   isAbsorbable  Whether the state is in an absorbable region
- *   next          Next state in the linked list
- *
- * The size of the counts array is rpPattern->maxDepth (= maximum group
- * nesting depth + 1), allocated as a flexible array member at the end of
- * the struct.
- *
- * Example: In PATTERN ((A B)+ C), a state waiting for B in the 3rd iteration
- *
- *   Element array: [0:BEGIN(d0) 1:A(d1) 2:B(d1) 3:END(d0) 4:C(d0) 5:FIN]
- *
- *   elemIdx = 2 (B, depth 1)
- *   counts[0] = 2 (depth 0: depth of END. Group completed 2 iterations)
- *   counts[1] = 1 (depth 1: depth of B. A matched in current iteration)
- *
- *   Counts are indexed by depth, not by elemIdx.
- *   counts[0] is incremented when passing through END(depth 0),
- *   and the group repetition count is preserved even when
- *   the state is at B(depth 1).
- *
- * Definition of two states being "equal":
- *
- *   Two states are equal if they have the same elemIdx and the same counts
- *   up to the depth of that element.
- *   nfa_states_equal() compares counts[0..elem->depth] using memcmp.
- *   Only counts at or below the depth of the current element are meaningful.
- *
- * V-2. RPRNFAContext -- Matching Context
- *
- * A single context represents "a matching attempt started from a specific
- * start row."
- *
- *   Field                 Description
- *   ---------------------------------------------------------------------
- *   states                Linked list of active NFA states
- *   matchStartRow         Row number where matching started
- *   matchEndRow           Row number where matching completed
- *                         (-1 if incomplete)
- *   lastProcessedRow      Last row processed
- *   matchedState          State that reached FIN (for greedy fallback)
- *   hasAbsorbableState    Whether this context can absorb other contexts
- *   allStatesAbsorbable   Whether this context can be absorbed
- *   next, prev            Doubly-linked list
- *
- * Since the NFA is nondeterministic, multiple states can coexist
- * simultaneously within a single context.
- *
- * Example: In PATTERN (A | B) C, if the first row matches both A and B,
- * two states coexist within the context:
- *
- *   State 1: elemIdx=3 (waiting for C, via branch A)
- *   State 2: elemIdx=3 (waiting for C, via branch B)
- *
- * In this case, since the (elemIdx, counts) of the two states are equal,
- * nfa_add_state_unique() retains only State 1 (branch A), which was
- * added first.
- * Because DFS processes the first branch of ALT first, the state via A
- * is registered first, and the state via B is discarded as a duplicate.
- * This is the preferment guarantee.
- *
- * V-3. RPR Fields of WindowAggState
- *
- *   nfaContext / nfaContextTail   Doubly-linked list of active contexts
- *   nfaContextFree                Reuse pool for contexts
- *   nfaStateFree                  Reuse pool for states
- *   nfaVarMatched                 Per-row cache: varMatched[varId]
- *   nfaVisitedElems               Bitmap for cycle detection
- *   nfaStateSize                  Precomputed size of RPRNFAState
- *
- * Memory management:
- *
- *   States and contexts are managed through their own free lists.
- *   Instead of palloc, they are obtained from the reuse pool, and
- *   returned to the pool upon deallocation.
- *   This reduces the overhead of frequent allocation/deallocation.
- *
- * Chapter VI  NFA Execution: 3-Phase Model
- * ============================================================================
- *
- * VI-1. Entry Point and Overall Flow
- *
- * When the window function processes each row, row_is_in_reduced_frame()
- * is called. This function determines whether the current row belongs to
- * a matched frame, and if necessary, calls update_reduced_frame() to
- * drive the NFA.
- *
- * Flow of update_reduced_frame():
- *
- *   (1) Find or create a context for the target row
- *   (2) Enter the row processing loop
- *   (3) After the loop ends, record the match result
- *
- * Pseudocode of the row processing loop:
- *
- *   targetCtx = ExecRPRGetHeadContext(pos)
- *   if targetCtx == NULL:
- *       targetCtx = ExecRPRStartContext(pos)
- *
- *   for currentPos = startPos; targetCtx->states != NULL; currentPos++:
- *       if not nfa_evaluate_row(currentPos):  -- row does not exist
- *           ExecRPRFinalizeAllContexts()      -- finalize all contexts
- *           ExecRPRCleanupDeadContexts()      -- clean up after finalization
- *           break
- *
- *       ExecRPRProcessRow(currentPos)         -- 3-phase processing
- *       ExecRPRStartContext(currentPos + 1)   -- pre-create next start point
- *       ExecRPRCleanupDeadContexts()          -- remove dead contexts
- *
- * Key point: Processing a single row may require processing multiple rows
- * ahead. Due to the nature of window functions, determining the frame for
- * row N requires looking at rows beyond N.
- *
- * VI-2. Context Creation: ExecRPRStartContext()
- *
- * Creates a new context and performs the initial advance.
- *
- *   (1) Allocate context via nfa_context_alloc()
- *   (2) Set matchStartRow = pos
- *   (3) Create initial state: elemIdx=0 (first pattern element),
- *       counts=all zero
- *   (4) Call nfa_advance(initialAdvance=true)
- *
- * The initial advance expands epsilon transitions at the beginning of
- * the pattern. For example, the initial advance for PATTERN ((A | B) C):
- *
- *   Start: elemIdx=0 (ALT)
- *     -> Expand ALT branches
- *       -> elemIdx=1 (A) -- VAR, so add state; stop here
- *       -> elemIdx=2 (B) -- VAR, so add state; stop here
- *
- *   Result: Two states in the context {waiting for A, waiting for B}
- *
- * During the initial advance, reaching FIN is not recorded as a match.
- * This is to prevent empty matches.
- *
- * VI-3. Row Evaluation: nfa_evaluate_row()
- *
- * Evaluates all variable conditions in the DEFINE clause at once for
- * the current row.
- *
- *   for each defineClause[i]:
- *       result = ExecEvalExpr(defineClause[i])
- *       varMatched[i] = (not null and true)
- *
- * To support row navigation operators (PREV, NEXT, FIRST, LAST),
- * a 1-slot model is used: only ecxt_outertuple is set to the current
- * row.  Navigation is handled by EEOP_RPR_NAV_SET/RESTORE opcodes
- * emitted during DEFINE expression compilation:
- *
- *   NAV_SET:     save ecxt_outertuple, swap in target row via nav_slot
- *   (evaluate):  argument expression reads from swapped slot
- *   NAV_RESTORE: restore original ecxt_outertuple
- *
- * Compound navigation (PREV(FIRST()), NEXT(FIRST()), PREV(LAST()),
- * NEXT(LAST())) is flattened by the parser into a single RPRNavExpr
- * with a compound kind (RPR_NAV_PREV_FIRST, etc.).  The executor
- * computes the target position in two steps: first the inner reference
- * point (match_start + N or currentpos - N) with match-range validation,
- * then the outer adjustment (+/- M) with partition-range validation.
- * If either step is out of range, the result is NULL.
- *
- * nav_slot caches the last fetched position (nav_slot_pos) to avoid
- * redundant tuplestore lookups when multiple navigation calls target
- * the same row.
- *
- * The varMatched array is referenced later in Phase 1 (Match).
- *
- * VI-4. Per-Context Re-evaluation (match_start-dependent variables)
- *
- * DEFINE variables that depend on match_start (those containing FIRST,
- * LAST-with-offset, or compound PREV_FIRST/NEXT_FIRST/PREV_LAST/NEXT_LAST)
- * are identified at plan time via defineMatchStartDependent.  The shared
- * evaluation in nfa_evaluate_row() uses the head context's matchStartRow
- * for FIRST/LAST base position.
- *
- * When processing a context whose matchStartRow differs from the shared
- * value, nfa_reevaluate_dependent_vars() temporarily sets nav_match_start
- * to that context's matchStartRow and re-evaluates only the dependent
- * variables.  The original nav_match_start and currentpos are saved and
- * restored after re-evaluation.
- *
- * Summary of evaluation strategy by navigation content:
- *
- *   Navigation content               evaluation
- *   -------------------------------------------------------
- *   No navigation                    shared (once per row)
- *   PREV/NEXT only                   shared (once per row)
- *   LAST (no offset)                 shared (once per row)
- *   LAST (with offset)               per-context
- *   FIRST (any)                      per-context
- *   Compound (inner FIRST)           per-context
- *   Compound (inner LAST, no off.)   shared (once per row)
- *   Compound (inner LAST, w/off.)    per-context
- *
- * VI-5. Tuplestore Mark and Trim (nodeWindowAgg.c)
- *
- * Navigation functions require access to past rows via the tuplestore.
- * To allow tuplestore_trim() to free rows that are no longer reachable,
- * the planner computes two offsets (see compute_nav_offsets):
- *
- *   navMaxOffset (Nav Mark Lookback):
- *     Maximum backward reach from currentpos.  Contributed by PREV,
- *     LAST-with-offset, and compound PREV_LAST/NEXT_LAST.
- *     Mark position: currentpos - navMaxOffset.
- *
- *   navFirstOffset (Nav Mark Lookahead):
- *     Minimum forward offset from match_start.  Contributed by FIRST
- *     and compound PREV_FIRST/NEXT_FIRST.  Can be negative when
- *     compound PREV_FIRST looks before match_start.
- *     Mark position: oldest_context->matchStartRow + navFirstOffset.
- *
- * The actual mark is set to: min(lookback_mark, lookahead_mark).
- * This ensures all rows reachable by any navigation function are retained.
- *
- * When offsets contain non-constant expressions (Param), the planner sets
- * navMaxOffsetKind/navFirstOffsetKind to RPR_NAV_OFFSET_NEEDS_EVAL and the
- * executor evaluates them at init time.  On overflow, the kind is set to
- * RPR_NAV_OFFSET_RETAIN_ALL, disabling trim for that dimension.
- *
- * VI-6. ExecRPRProcessRow(): 3-Phase Processing
- *
- * NFA processing for a single row is divided into three phases:
- *
- *   +--------------------------------------------+
- *   |  Phase 1: MATCH (convergence)              |
- *   |  Compare the current row against each VAR  |
- *   |  state. Remove states that fail to match.  |
- *   |                                            |
- *   |  Phase 2: ABSORB (absorption)              |
- *   |  Merge duplicate contexts to prevent       |
- *   |  state explosion.                          |
- *   |                                            |
- *   |  Phase 3: ADVANCE (expansion)              |
- *   |  Expand epsilon transitions to prepare     |
- *   |  for the next row.                         |
- *   +--------------------------------------------+
- *
- * This ordering is important:
- *
- *   - Match executes first to "consume the current row."
- *   - Absorb executes immediately after Match, when states have been updated.
- *   - Advance executes last to prepare "states waiting for the next row."
- *
- * Chapter VII  Phase 1: Match
- * ============================================================================
- *
- * nfa_match() iterates through each state in the context:
- *
- *   (1) Check whether the state's elemIdx is a VAR element
- *   (2) Compare against the current row using nfa_eval_var_match()
- *   (3) Match success: increment repetition count, retain state
- *   (4) Match failure: remove state
- *
- * Match determination (nfa_eval_var_match):
- *
- *   If varId is within the range of defineVariableList:
- *       Use the value of varMatched[varId]
- *
- *   If varId exceeds the range (variable not defined in DEFINE):
- *       Unconditionally true (matches all rows)
- *
- * Immediate advance for simple VARs:
- *
- *   For a VAR with min=1, max=1 where the next element is END,
- *   the Match phase processes through END immediately.
- *   This is necessary for accurate state comparison in Phase 2 (Absorb).
- *
- *   Example: In PATTERN ((A B)+), when A matches, it immediately advances
- *   to B, and when B matches, it immediately advances through END to
- *   complete the group count. This enables absorption comparison with
- *   other contexts.
- *
- * Chapter VIII  Phase 2: Absorb (Context Absorption)
- * ============================================================================
- *
- * VIII-1. Problem
- *
- * In the current implementation, a new context is started for each row
- * processed.
- * Applying PATTERN (A+) to 10 rows produces 10 contexts,
- * each of which tracks state independently.
- *
- * If there are N rows, the total number of states becomes O(N^2):
- *
- *   Context 1 (started at row 1): can match A up to N times
- *   Context 2 (started at row 2): can match A up to N-1 times
- *   ...
- *   Context N (started at row N): can match A 1 time
- *
- * VIII-2. Solution: Context Absorption
- *
- * Key observation: a context started earlier contains
- * all matches of a later-started context (monotonicity principle).
- *
- * If Context 1 started at row 1 and matched A 5 times,
- * the state where Context 2 (started at row 2) matched A 4 times
- * is already contained within Context 1.
- *
- * Therefore Context 2 can be "absorbed" into Context 1.
- *
- * VIII-3. Absorption Conditions
- *
- * Planner-time prerequisites (all must hold for absorption to be enabled):
- *
- *   (a) SKIP PAST LAST ROW.  SKIP TO NEXT ROW creates overlapping
- *       contexts that cannot be safely absorbed.
- *   (b) Unbounded frame (ROWS BETWEEN CURRENT ROW AND UNBOUNDED
- *       FOLLOWING).  Limited frames apply differently to each context,
- *       breaking the monotonicity principle.
- *   (c) No match_start-dependent navigation in DEFINE.
- *
- *       Mechanism: each context has a different matchStartRow, so FIRST
- *       resolves to a different row for each context at the same
- *       currentpos.  An earlier context's DEFINE result no longer
- *       subsumes a later one's, making count-dominance comparison
- *       invalid.  Rather than comparing matchStartRow at runtime
- *       (which would complicate the absorb path), any match_start
- *       dependency disables absorption entirely.
- *
- *       Navigation content              match_start dep.  absorption
- *       ------------------------------------------------------------
- *       No navigation                   none              safe
- *       PREV/NEXT only                  none              safe
- *       LAST (no offset)                none              safe
- *       LAST (with offset)              boundary check    unsafe
- *       FIRST (any)                     direct            unsafe
- *       Compound (inner FIRST)          direct            unsafe
- *       Compound (inner LAST, no off.)  none              safe
- *       Compound (inner LAST, w/off.)   boundary chk      unsafe
- *
- * Runtime conditions (evaluated per context pair):
- *
- *   (1) The pattern is marked as isAbsorbable (see IV-5)
- *   (2) allStatesAbsorbable of the target context is true
- *   (3) An earlier context "covers" all states of the target
- *
- * Cover condition (nfa_states_covered):
- *
- *   A state with the same elemIdx exists in the earlier context,
- *   and the count at that depth is greater than or equal -- then it is covered.
- *
- * VIII-4. Dual-Flag Design
- *
- * Two boolean flags make the absorption decision efficient:
- *
- *   hasAbsorbableState (monotonic: only true->false transition possible)
- *     "Does this context have the ability to absorb other contexts?"
- *     true if at least one absorbable state exists.
- *     Transitions to false when states are removed leaving no absorbable
- *     states.
- *     Once false, it never becomes true again.
- *
- *   allStatesAbsorbable (dynamic: can fluctuate)
- *     "Can this context be absorbed?"
- *     true if all states are in an absorbable region.
- *     Becomes false when a non-absorbable state is added; reverts to true
- *     when it is removed.
- *
- * VIII-5. Absorption Order
- *
- * nfa_absorb_contexts() traverses from tail (newest) to head (oldest).
- *
- *   for ctx = tail to head:
- *       if ctx.allStatesAbsorbable:
- *           for older = ctx.prev to head:
- *               if older.hasAbsorbableState:
- *                   if nfa_states_covered(older, ctx):
- *                       free(ctx)  -- absorbed
- *                       break
- *
- * Since inspection starts from the newest context, the most recently started
- * (= having the shortest match) context is absorbed first.
- *
- * Chapter IX  Phase 3: Advance (Epsilon Transition Expansion)
- * ============================================================================
- *
- * IX-1. Overview
- *
- * nfa_advance() expands epsilon transitions from each state after Match,
- * generating "new states waiting for the next row."
- *
- * An epsilon transition is a transition that moves without consuming a row:
- *
- *   - ALT: branch to each alternative
- *   - BEGIN: enter group (or skip if min=0)
- *   - END: loop-back within group (or exit when condition is met)
- *   - FIN: record match completion
- *   - VAR loop/exit: repeat/exit according to the quantifier
- *
- * Expansion stops upon reaching a VAR element, and the state is added.
- * This is because VAR is the element that "will consume the next row."
- *
- * IX-2. Processing Order: DFS and Preferment
- *
- * advance processes states in lexicographic order,
- * performing Depth-First Search (DFS) on each state.
- *
- * This DFS order is what guarantees the SQL standard's "preferment":
- *
- *   The branch that appears first in the PATTERN text takes precedence.
- *
- * Example: PATTERN (A | B) C
- *
- *   The first branch A of the ALT takes precedence over the second branch B.
- *   When both A and B can match, the match via A is selected.
- *
- * nfa_add_state_unique() prevents duplicate addition of the same state,
- * so the state added first (= from the preferred branch) is retained.
- *
- * IX-3. Routing Function: nfa_route_to_elem()
- *
- * All inter-element transitions in the advance phase go through
- * nfa_route_to_elem().
- * This function branches its behavior based on the type of the next element:
- *
- *   If the next element is VAR:
- *     (1) Add the state to the context (nfa_add_state_unique)
- *     (2) If the VAR has min=0, also add a skip path (recurse via next)
- *     -> Expansion stops here (VAR is the element that "will consume the next
- *        row")
- *
- *   If the next element is non-VAR (ALT, BEGIN, END, FIN):
- *     -> Recursively call nfa_advance_state() to continue expansion
- *
- * With this structure, advance recursively follows epsilon transitions
- * until reaching a VAR, consistently stopping only at VAR elements.
- *
- * IX-4. Per-Element advance Behavior
- *
- * (a) ALT (nfa_advance_alt)
- *
- *   Upon encountering an ALT element, all branches are expanded in order.
- *   The first element of each branch is connected via a jump pointer.
- *
- *   idx=0 (ALT) -> branch 1 start (next) -> branch 2 start (jump) -> ...
- *
- *   nfa_advance_state() is recursively called for each branch.
- *
- * (b) BEGIN (nfa_advance_begin)
- *
- *   Handles group entry.
- *   jump points to the element after END (= first element outside the group).
- *
- *   Greedy (default):
- *     (1) Enter the group body (move via next, reset the count at that depth)
- *     (2) If min=0, also add a group skip path (move via jump)
- *
- *   Reluctant:
- *     Order reversed -- skip path first, group entry second.
- *     If the skip path reaches FIN, the group entry path is not generated
- *     (shortest match preferred).
- *
- * (c) END (nfa_advance_end)
- *
- *   Handles group termination. This is the core of the repetition logic.
- *
- *   Let count be the count at the current depth:
- *
- *   count < min:
- *     Loop-back (move via jump, repeat the group body)
- *
- *     If the RPR_ELEM_EMPTY_LOOP flag is set:
- *       In addition to loop-back, also add a fast-forward exit path.
- *       This is because the body may produce an empty match, causing count
- *       to never reach min. fast-forward resets counts[depth] to 0
- *       and exits via next (treating the remaining required iterations
- *       as empty matches).
- *
- *   min <= count < max:
- *     Greedy: loop-back first, exit second
- *     Reluctant: exit first, loop-back second
- *                If the exit path reaches FIN, loop-back is omitted.
- *
- *   count >= max:
- *     Unconditional exit (move via next)
- *
- *   On exit: reset counts[depth] = 0, and if the next element is an outer END,
- *   increment the count at the outer depth.
- *
- * (d) VAR (nfa_advance_var)
- *
- *   Handles repeat/exit for a VAR element with a quantifier.
- *
- *   Let count be the count at the current depth:
- *
- *   count < min:
- *     Unconditional loop (stay at the same elemIdx, wait for the next row)
- *
- *   min <= count < max:
- *     Greedy: loop first, exit (next) second
- *     Reluctant: exit first, loop second
- *                If the exit path reaches FIN, loop is omitted.
- *
- *   count >= max:
- *     Unconditional exit (move via next)
- *
- *   On exit: reset counts[depth] = 0.
- *
- * (e) FIN
- *
- *   Match success. The current state is moved to matchedState for recording,
- *   and matchEndRow is set to the current row.
- *
- *   Upon reaching FIN, all remaining unprocessed states are removed
- *   (early termination). By DFS order, the path that reached FIN first
- *   has the highest preferment, so the rest are inferior paths.
- *   This is the core mechanism that guarantees preferment.
- *
- *   In SKIP PAST LAST ROW mode, upon reaching FIN, subsequent contexts
- *   that started within the match range are immediately pruned.
- *
- * IX-5. State Deduplication: nfa_add_state_unique()
- *
- * When adding a new state to a context, it is compared against existing
- * states;
- * if an identical state already exists, it is not added.
- *
- * Comparison criteria: elemIdx + counts[0..elem->depth] (see V-1)
- *
- * This deduplication is the core mechanism that suppresses NFA state
- * explosion.
- * Because DFS order causes preferred-branch states to be added first,
- * identical states from lower-priority branches are automatically discarded.
- *
- * IX-6. Cycle Detection: nfaVisitedElems
- *
- * When a group body can produce an empty match,
- * looping back from END may cause an infinite loop.
- *
- * Example: PATTERN ((A?)*)
- *
- *   A? has min=0, so it can pass through without matching.
- *   If the outer group repeats: BEGIN -> A? skip -> END -> BEGIN -> ...
- *
- * To prevent this:
- *
- *   (1) At compile time: set the RPR_ELEM_EMPTY_LOOP flag on the END
- *       of groups whose body is nullable.
- *       The runtime effect of this flag is described in IX-4(c):
- *       when count < min, a fast-forward exit path is added,
- *       resolving the deadlock where count cannot increase due to empty
- *       matches.
- *
- *   (2) At runtime: initialize the nfaVisitedElems bitmap immediately before
- *       DFS expansion of each state within advance (once per state).
- *       During DFS, epsilon elements (END, ALT, BEGIN) are marked in the
- *       bitmap at nfa_advance_state entry.  VAR elements are marked later
- *       when added to the state list (nfa_add_state_unique), so that
- *       legitimate loop-back to the same VAR in a new group iteration
- *       (e.g., END -> ALT -> same VAR) is not blocked.
- *       If a previously visited elemIdx is revisited, that path is terminated.
- *
- *   Note: the bitmap tracks only elemIdx and does not consider counts.
- *   Therefore, legitimate revisits to the same elemIdx but with different
- *   counts may also be blocked.  This only occurs when the group body is
- *   nullable (all paths can match empty), causing END -> loop-back ->
- *   skip -> END within a single DFS.  In such cases the END element has
- *   the RPR_ELEM_EMPTY_LOOP flag, so the fast-forward exit (IX-4(c))
- *   provides an alternative path that bypasses the cycle.
- *
- * Chapter X  Match Result Processing
- * ============================================================================
- *
- * X-1. Match Result
- *
- * RPR tracks the current match result as a single entry in WindowAggState
- * with four fields: rpr_match_valid, rpr_match_matched, rpr_match_start,
- * and rpr_match_length.  When rpr_match_valid is true, the entry describes
- * the match result for the position at rpr_match_start: rpr_match_matched
- * indicates success or failure, and rpr_match_length gives the number of
- * rows consumed.  A match with rpr_match_length 0 represents an empty match
- * (pattern matched but consumed no rows).  When rpr_match_valid is false,
- * the position has not been evaluated yet (RF_NOT_DETERMINED).
- *
- * A row's status against the current match result can be obtained by
- * calling get_reduced_frame_status().
- *
- * X-2. AFTER MATCH SKIP
- *
- * Determines the starting point for the next match attempt after a successful
- * match:
- *
- *   SKIP TO NEXT ROW:
- *     New match attempt begins from the row after the match start row.
- *     Overlapping matches are possible.
- *
- *   SKIP PAST LAST ROW:
- *     New match attempt begins from the row after the match end row.
- *     Only non-overlapping matches are possible.
- *
- * X-3. INITIAL vs SEEK
- *
- *   Standard definition (section 6.12):
- *   INITIAL: "is used to look for a match whose first row is R."
- *   SEEK:    "is used to permit a search for the first match anywhere
- *            from R through the end of the full window frame."
- *   In either case, if there is no match, the reduced window frame is empty.
- *   The default is INITIAL.
- *
- *   Current implementation:
- *   SEEK is not supported (the parser raises an error).
- *   Only INITIAL is supported, searching only for matches starting at each
- *   row position pos.
- *
- * X-4. Bounded Frame Handling
- *
- *   When the frame is bounded (e.g., ROWS BETWEEN CURRENT ROW AND 5
- *   FOLLOWING), ExecRPRProcessRow receives hasLimitedFrame=true and
- *   frameOffset indicating the upper bound.  Before the match phase,
- *   any context whose match has exceeded the frame boundary
- *   (currentPos >= matchStartRow + frameOffset + 1) is finalized early
- *   by forcing a mismatch.  This prevents matches from extending beyond
- *   the window frame.  The sum is clamped to PG_INT64_MAX on overflow.
- *
- *   Note that bounded frames also disable context absorption at the
- *   planner level (see VIII-3(b)), since the frame boundary breaks the
- *   monotonicity assumption required for correct absorption.
- *
- * Chapter XI  Worked Example: Full Execution Trace
- * ============================================================================
- *
- * XI-1. Query
- *
- *   SELECT company, tdate, price,
- *          first_value(price) OVER w AS start_price,
- *          last_value(price) OVER w AS end_price
- *   FROM stock
- *   WINDOW w AS (
- *     PARTITION BY company
- *     ORDER BY tdate
- *     ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- *     AFTER MATCH SKIP PAST LAST ROW
- *     PATTERN (A+ B)
- *     DEFINE A AS price > PREV(price),
- *            B AS price < PREV(price)
- *   );
- *
- * XI-2. Data
- *
- *   Row#    tdate       price
- *   --------------------------
- *   0       2024-01-01  100
- *   1       2024-01-02  110
- *   2       2024-01-03  120
- *   3       2024-01-04  115
- *   4       2024-01-05  130
- *
- * XI-3. Compilation Result
- *
- *   PATTERN (A+ B) -> unchanged after optimization
- *
- *   idx  varId  depth  min  max  next  jump
- *   -----------------------------------------
- *    0   A(0)   0      1    INF  1     -1     A+
- *    1   B(1)   0      1    1    2     -1     B
- *    2   FIN    0      1    1    -1    -1
- *
- *   DEFINE: A -> "price > PREV(price)", B -> "price < PREV(price)"
- *   isAbsorbable = true (A+ is a simple unbounded VAR)
- *
- * XI-4. Execution Trace
- *
- * --- Row 0 (price=100) ---
- *
- *   update_reduced_frame(0) called.
- *
- *   Context C0 created (matchStartRow=0).
- *   Initial advance: elemIdx=0(A) -> VAR, so state is added.
- *   C0.states = [{elemIdx=0, counts=[0]}]
- *
- *   nfa_evaluate_row(0):
- *     A: price(100) > PREV(price) -> no PREV -> false
- *     B: price(100) < PREV(price) -> no PREV -> false
- *     varMatched = [false, false]
- *
- *   ExecRPRProcessRow(0):
- *     Phase 1 (Match): A(0) state vs varMatched[0]=false -> state removed
- *     C0.states = [] (empty)
- *
- *     Phase 2 (Absorb): skipped (no states)
- *     Phase 3 (Advance): skipped (no states)
- *
- *   C0.states is empty, so the loop terminates.
- *   matchEndRow < matchStartRow -> unmatched.
- *
- * --- Row 1 (price=110) ---
- *
- *   update_reduced_frame(1) called.
- *
- *   Context C1 created (matchStartRow=1).
- *   Initial advance: C1.states = [{elemIdx=0, counts=[0]}]
- *
- *   nfa_evaluate_row(1):
- *     A: 110 > PREV(100) -> true
- *     B: 110 < PREV(100) -> false
- *     varMatched = [true, false]
- *
- *   ExecRPRProcessRow(1):
- *     Phase 1 (Match): A(0) match succeeds -> counts[0]++ -> counts=[1]
- *     C1.states = [{elemIdx=0, counts=[1]}]
- *
- *     Phase 3 (Advance):
- *       State {elemIdx=0, counts=[1]}: A+ (min=1, count=1, max=INF)
- *         count >= min, so:
- *         Greedy -> loop first: keep {elemIdx=0, counts=[1]}
- *                   exit: reset counts[0]=0, next(=1) -> {elemIdx=1,
- *                         counts=[0]}
- *     C1.states = [{elemIdx=0, counts=[1]}, {elemIdx=1, counts=[0]}]
- *
- * --- Row 2 (price=120) ---
- *
- *   Context C2 created (matchStartRow=2).
- *   Initial advance: C2.states = [{elemIdx=0, counts=[0]}]
- *
- *   nfa_evaluate_row(2):
- *     A: 120 > PREV(110) -> true
- *     B: 120 < PREV(110) -> false
- *     varMatched = [true, false]
- *
- *   C1 ExecRPRProcessRow(2):
- *     Phase 1 (Match):
- *       {elemIdx=0, counts=[1]}: A matches -> counts=[2]
- *       {elemIdx=1, counts=[0]}: B does not match -> removed
- *     C1.states = [{elemIdx=0, counts=[2]}]
- *
- *   C2 ExecRPRProcessRow(2):
- *     Phase 1 (Match):
- *       {elemIdx=0, counts=[0]}: A matches -> counts=[1]
- *     C2.states = [{elemIdx=0, counts=[1]}]
- *
- *     Phase 2 (Absorb):
- *       Does C1 (started earlier) cover C2?
- *         C1: {elemIdx=0, counts=[2]}, C2: {elemIdx=0, counts=[1]}
- *         Same elemIdx, C1.counts >= C2.counts -> covered
- *       C2 absorbed. -> removed.
- *
- *     Phase 3 (Advance):
- *       {elemIdx=0, counts=[2]}: Greedy -> loop + exit
- *         Loop: {elemIdx=0, counts=[2]}
- *         Exit: reset counts[0]=0, next(=1) -> {elemIdx=1, counts=[0]}
- *     C1.states = [{elemIdx=0, counts=[2]}, {elemIdx=1, counts=[0]}]
- *
- *   Context C3 created (matchStartRow=3).
- *
- * --- Row 3 (price=115) ---
- *
- *   nfa_evaluate_row(3):
- *     A: 115 > PREV(120) -> false
- *     B: 115 < PREV(120) -> true
- *     varMatched = [false, true]
- *
- *   ExecRPRProcessRow(3):
- *     Phase 1 (Match):
- *       {elemIdx=0, counts=[2]}: A does not match -> removed
- *       {elemIdx=1, counts=[0]}: B matches -> counts=[1]
- *     C1.states = [{elemIdx=1, counts=[1]}]
- *
- *     Phase 3 (Advance):
- *       {elemIdx=1, counts=[1]}: B (min=1, max=1)
- *         count(1) >= max(1) -> unconditional exit
- *         Reset counts[0]=0, next = 2 (FIN)
- *       FIN reached -> matchEndRow = 3, matchedState recorded.
- *       Early termination: no remaining states, so completed immediately.
- *     C1.states = [] (empty after reaching FIN)
- *
- *   C1.states is empty and matchEndRow=3 >= matchStartRow=1 -> match succeeds.
- *
- *   rpr_match_start = 1, rpr_match_length = 3
- *
- * --- Row 4 (price=130) ---
- *
- *   update_reduced_frame(4) called.
- *   C3 was already created but matchStartRow=3, so it is not applicable.
- *   New context C4 created (matchStartRow=4).
- *
- *   nfa_evaluate_row(4):
- *     A: 130 > PREV(115) -> true
- *     B: 130 < PREV(115) -> false
- *
- *   ... No subsequent rows, so ExecRPRFinalizeAllContexts() is called.
- *   Match incomplete -> unmatched.
- *
- * XI-5. Final Result
- *
- *   Row 0: unmatched     -> frame = the row itself
- *   Row 1: match head    -> frame = rows 1 through 3
- *   Row 2: inside match  -> skipped
- *   Row 3: inside match  -> skipped
- *   Row 4: unmatched     -> frame = the row itself
- *
- * Chapter XII  Summary of Key Design Decisions
- * ============================================================================
- *
- * XII-1. Flat Array vs Tree-Based NFA
- *
- *   Choice: Flat array (RPRPatternElement[])
- *
- *   Rationale:
- *   - Cache-friendly: 16-byte fixed size, contiguous memory
- *   - Index-based references: 2-byte indices instead of pointers
- *   - Easy to serialize: can use memcpy when passing to plan nodes
- *
- * XII-2. Forward-only Execution vs Backtracking
- *
- *   Choice: Forward-only (state set tracking)
- *
- *   Rationale:
- *   - Backtracking takes exponential time in the worst case
- *   - NFA simulation guarantees polynomial time
- *   - DFS order naturally guarantees preferment.
- *     Greedy/reluctant per quantifier requires only reversing the DFS order
- *   - Window functions receive sorted rows sequentially.
- *     Forward-only fits directly into this pipeline,
- *     whereas backtracking requires re-fetching previous rows
- *   - DEFINE conditions are SQL expressions (PREV, RUNNING aggregates, etc.)
- *     with high re-evaluation cost. Forward-only requires only one evaluation
- *     per row
- *
- * XII-3. Per-Context Management
- *
- *   Choice: Independent context per start row
- *
- *   Rationale:
- *   - Supports overlapping matches under SKIP TO NEXT ROW
- *   - Determines the frame for each row independently
- *   - Absorption optimization can eliminate redundant contexts in O(n)
- *
- * XII-4. Memory Pool Management
- *
- *   Choice: Custom free list
- *
- *   Rationale:
- *   - NFA states are created and destroyed in large numbers per row
- *   - Avoids palloc/pfree overhead
- *   - State size is variable (counts[] array), but within a single query
- *     maxDepth is fixed, so all states have the same size
- *
- * XII-5. Execution Optimization Summary
- *
- *   The following optimizations make the NFA simulation practical.
- *
- *   -- Compile-time --
- *
- *   (1) AST Optimization (IV-3)
- *
- *     Simplifies the AST before converting the pattern to an NFA.
- *     Reduces the number of NFA elements through consecutive variable
- *     merging (A A -> A{2}), SEQ flattening, quantifier multiplication,
- *     and other transformations.
- *
- *     Significance: Reducing the element count directly shrinks the state
- *     space, decreasing the cost of all subsequent runtime phases (match,
- *     absorb, advance).
- *
- *   -- Runtime: advance phase --
- *
- *   (2) Group Skip (IX-4(b))
- *
- *     At the BEGIN of a group with min=0, uses jump to skip the entire
- *     group. Moves directly to the first element outside the group without
- *     exploring the group body. Greedy enters then skips; Reluctant skips
- *     then enters.
- *
- *     Significance: For optional groups (min=0), immediately generates
- *     a skip path without exploring the body, avoiding unnecessary DFS
- *     expansion.
- *
- *   (3) State Deduplication (IX-5)
- *
- *     During advance, DFS may generate states with the same (elemIdx,
- *     counts) combination through multiple paths. Additionally, for
- *     group absorption, nfa_match performs inline advance from bounded
- *     VARs (count >= max) within the absorbable region (ABSORBABLE_BRANCH)
- *     through END chains to reach the judgment point (ABSORBABLE END).
- *     This process can also produce duplicate states reaching the same END.
- *     nfa_add_state_unique() blocks duplicate addition of identical states
- *     in both cases.
- *
- *     Significance: Prevents exponential growth of the state count in
- *     ALT branches and quantifier expansion. Since DFS order causes the
- *     preferred branch's state to be registered first, identical states
- *     from lower-priority branches are automatically discarded, thereby
- *     also guaranteeing preferment.
- *
- *   (4) Cycle Detection and Fast-Forward (IX-6, IX-4(c))
- *
- *     When a nullable group body (e.g., A?) repeats empty matches,
- *     the END -> BEGIN loop-back can continue indefinitely.
- *
- *     Two mechanisms resolve this:
- *     - A visited bitmap (nfaVisitedElems) blocks revisitation of the
- *       same element, preventing infinite loops (safety)
- *     - At an END with the RPR_ELEM_EMPTY_LOOP flag set, when
- *       count < min, the remaining required iterations are treated as
- *       empty matches and a fast-forward exit path out of the group is
- *       added (correctness)
- *
- *     Significance: Cycle detection guarantees termination, and
- *     fast-forward guarantees that the min condition is satisfied.
- *     Without these, patterns containing nullable groups would fall
- *     into infinite loops or fail to match.
- *
- *   (5) Match Pruning (IX-4(e))
- *
- *     When a state reaches FIN during advance, all remaining unprocessed
- *     states of that context are removed. Because of DFS order, the path
- *     that reaches FIN first has the highest preferment, so the remaining
- *     paths are inferior.
- *
- *     Significance: Once the best match is determined, exploration of
- *     inferior paths is immediately terminated. This mechanism achieves
- *     both preferment guarantees and performance optimization.
- *
- *   -- Runtime: inter-context --
- *
- *   (6) Early Termination (SKIP PAST LAST ROW)
- *
- *     In SKIP PAST LAST ROW mode, when a match is found, subsequent
- *     contexts whose start rows fall within the match range are pruned
- *     immediately without further processing.
- *     In SKIP TO NEXT ROW mode, overlapping contexts are preserved
- *     because each row requires its own independent match.
- *
- *     Significance: Prunes subsequent contexts whose start rows overlap
- *     with a prior match range, avoiding unnecessary processing.
- *
- *   (7) Context Absorption (Chapter VIII)
- *
- *     If an independent context is created for each row, O(n^2) states
- *     accumulate. By exploiting the monotonicity that an earlier-started
- *     context subsumes the states of a later-started context, redundant
- *     contexts are eliminated early.
- *
- *     Absorbability is determined per-element; comparison is performed
- *     only at elements with the RPR_ELEM_ABSORBABLE flag (see IV-5).
- *
- *     Significance: Keeps the number of active contexts at a constant
- *     level, achieving O(n^2) -> O(n) time complexity. Without this,
- *     performance degrades sharply on long partitions.
- *
- * Appendix A. Key Function Index
- * ============================================================================
- *
- *   Function                      File                  Role
- *   --------------------------------------------------------------------------
- *   transformRPR                  parse_rpr.c           Parser entry point
- *   transformDefineClause         parse_rpr.c           DEFINE transformation
- *   collectPatternVariables       rpr.c                 Variable collection
- *   buildDefineVariableList       rpr.c                 DEFINE variable list
- *   buildRPRPattern               rpr.c                 NFA compilation main
- *   optimizeRPRPattern            rpr.c                 AST optimization
- *   fillRPRPattern                rpr.c                 NFA element generation
- *   finalizeRPRPattern            rpr.c                 Finalization
- *   computeAbsorbability          rpr.c                 Absorption analysis
- *   update_reduced_frame          nodeWindowAgg.c       Execution main loop
- *   nfa_evaluate_row              nodeWindowAgg.c       DEFINE evaluation
- *   ExecRPRStartContext           execRPR.c             Context creation
- *   ExecRPRProcessRow             execRPR.c             3-phase processing
- *   nfa_match                     execRPR.c             Phase 1
- *   nfa_absorb_contexts           execRPR.c             Phase 2
- *   nfa_advance                   execRPR.c             Phase 3
- *   nfa_advance_state             execRPR.c             Per-state branching
- *   nfa_route_to_elem             execRPR.c             Element routing
- *   nfa_advance_alt               execRPR.c             ALT handling
- *   nfa_advance_begin             execRPR.c             BEGIN handling
- *   nfa_advance_end               execRPR.c             END handling
- *   nfa_advance_var               execRPR.c             VAR handling
- *   nfa_add_state_unique          execRPR.c             Deduplication
- *   nfa_states_covered            execRPR.c             Absorption check
- *   nfa_reevaluate_dependent_vars execRPR.c             Per-context re-eval
- *   ExecRPRGetHeadContext         execRPR.c             Context lookup
- *   ExecRPRFreeContext            execRPR.c             Context deallocation
- *   ExecRPRCleanupDeadContexts    execRPR.c             Dead context cleanup
- *   ExecRPRFinalizeAllContexts    execRPR.c             Partition-end finalize
- *   ExecRPRRecordContextSuccess   execRPR.c             Stats: match success
- *   ExecRPRRecordContextFailure   execRPR.c             Stats: match failure
- *   compute_nav_offsets           createplan.c          Trim offset computation
- *
- * Appendix B. Data Structure Relationship Diagram
- * ============================================================================
- *
- *   Parser Layer
- *   --------
- *   RPCommonSyntax
- *     |--- rpSkipTo: RPSkipTo
- *     |--- initial: bool
- *     +--- rpPattern: RPRPatternNode* (tree)
- *          |--- nodeType: VAR | SEQ | ALT | GROUP
- *          |--- min, max: quantifier
- *          |--- varName: variable name (VAR only)
- *          +--- children: List* (SEQ/ALT/GROUP only)
- *
- *   Planner Layer
- *   ----------
- *   WindowAgg (plan node)
- *     |--- rpSkipTo: RPSkipTo
- *     |--- defineClause: List<TargetEntry>
- *     +--- rpPattern: RPRPattern*
- *          |--- numVars: int
- *          |--- varNames: char**
- *          |--- maxDepth: RPRDepth
- *          |--- isAbsorbable: bool
- *          |--- numElements: int
- *          +--- elements: RPRPatternElement[]  (flat array)
- *               |--- varId      (1B)
- *               |--- depth      (1B)
- *               |--- flags      (1B)
- *               |--- reserved   (1B)
- *               |--- min, max   (4B + 4B)
- *               +--- next, jump (2B + 2B)
- *
- *   Executor Layer
- *   ----------
- *   WindowAggState
- *     |--- rpSkipTo: RPSkipTo (AFTER MATCH SKIP mode)
- *     |--- rpPattern: RPRPattern* (copied from plan)
- *     |--- defineVariableList: List<String> (variable names, DEFINE order)
- *     |--- defineClauseList: List<ExprState>
- *     |--- nfaVarMatched: bool[] (per-row cache)
- *     |--- nfaVisitedElems: bitmapword* (cycle detection)
- *     |--- nfaStateSize: Size (pre-calculated RPRNFAState allocation size)
- *     |--- nfaContext <-> nfaContextTail (doubly-linked list)
- *     |   +--- RPRNFAContext
- *     |       |--- states: RPRNFAState* (linked list)
- *     |       |   |--- elemIdx
- *     |       |   |--- counts[]
- *     |       |   +--- isAbsorbable
- *     |       |--- matchStartRow, matchEndRow
- *     |       |--- lastProcessedRow
- *     |       |--- matchedState (cloned on FIN arrival)
- *     |       |--- hasAbsorbableState
- *     |       +--- allStatesAbsorbable
- *     |--- nfaContextFree (recycling pool)
- *     +--- nfaStateFree (recycling pool)
- *
- * Appendix C. NFA Element Array Examples
- * ============================================================================
- *
- * C-1. PATTERN (A B C)
- *
- *   idx  varId  depth  min  max  next  jump
- *   ------------------------------------------
- *    0   A      0      1    1    1     -1
- *    1   B      0      1    1    2     -1
- *    2   C      0      1    1    3     -1
- *    3   FIN    0      1    1    -1    -1
- *
- * C-2. PATTERN (A+ B*)
- *
- *   idx  varId  depth  min  max  next  jump  flags
- *   ------------------------------------------------------------------------
- *    0   A      0      1    INF  1     -1    ABSORBABLE | ABSORBABLE_BRANCH
- *    1   B      0      0    INF  2     -1
- *    2   FIN    0      1    1    -1    -1
- *
- *   Only A+ is the absorption point (Case 1). Once past A,
- *   absorption is permanently disabled for that state.
- *
- * C-3. PATTERN (A | B | C)
- *
- *   idx  varId  depth  min  max  next  jump
- *   ----------------------------------------
- *    0   ALT    0      1    1    1     -1    alternation start
- *    1   A      1      1    1    4     2     branch 1 -> FIN, jump -> branch 2
- *    2   B      1      1    1    4     3     branch 2 -> FIN, jump -> branch 3
- *    3   C      1      1    1    4     -1    branch 3 -> FIN
- *    4   FIN    0      1    1    -1    -1
- *
- * C-4. PATTERN ((A B)+ C)
- *
- *   idx  varId    depth  min  max  next  jump  flags
- *   --------------------------------------------------------------------------
- *    0   BEGIN    0      1    INF  1     4     ABSORBABLE_BRANCH
- *    1   A        1      1    1    2     -1    ABSORBABLE_BRANCH
- *    2   B        1      1    1    3     -1    ABSORBABLE_BRANCH
- *    3   END      0      1    INF  4     1     ABSORBABLE | ABSORBABLE_BRANCH
- *    4   C        0      1    1    5     -1
- *    5   FIN      0      1    1    -1    -1
- *
- *   Case 2: GROUP+ with {1,1} body VARs. A, B are branches;
- *   END is the absorption point. Compare with C-6 (Case 3).
- *
- * C-5. PATTERN ((A | B)+? C)
- *
- *   idx  varId    depth  min  max   next  jump  flags
- *   -------------------------------------------------------------------
- *    0   BEGIN    0      1    INF   1     5     RELUCTANT, group start
- *    1   ALT      1      1    1     2     -1    alternation start
- *    2   A        2      1    1     4     3     branch 1
- *    3   B        2      1    1     4     -1    branch 2
- *    4   END      0      1    INF   5     1     RELUCTANT, group end
- *    5   C        0      1    1     6     -1
- *    6   FIN      0      1    1     -1    -1
- *
- * C-6. PATTERN ((A+ B)+ C)  -- Absorbability flag example
- *
- *   idx  varId    depth  min  max   next  jump  flags
- *   ---------------------------------------------------------------------------
- *    0   BEGIN    0      1    INF   1     4     ABSORBABLE_BRANCH, group start
- *    1   A        1      1    INF   2     -1    ABSORBABLE | ABSORBABLE_BRANCH
- *    2   B        1      1    1     3     -1
- *    3   END      0      1    INF   4     1     group end
- *    4   C        0      1    1     5     -1
- *    5   FIN      0      1    1     -1    -1
- *
- *   Recurses from BEGIN into the body -> A matches Case 1 (simple VAR+).
- *   A gets ABSORBABLE | ABSORBABLE_BRANCH, BEGIN gets ABSORBABLE_BRANCH.
- *   B and END get no flags -> absorption stops once the state advances to B.
- *   (See IV-5 Case 3)
- *
- * C-7. PATTERN ((A+ B | C*)+ D)  -- Per-branch absorption in ALT
- *
- *   idx  varId    depth  min  max   next  jump  flags
- *   ---------------------------------------------------------------------------
- *    0   BEGIN    0      1    INF   1     6     ABSORBABLE_BRANCH
- *    1   ALT      1      1    1     2     -1    ABSORBABLE_BRANCH
- *    2   A        2      1    INF   3     4     ABSORBABLE | ABSORBABLE_BRANCH
- *    3   B        2      1    1     5     -1
- *    4   C        2      0    INF   5     -1    ABSORBABLE | ABSORBABLE_BRANCH
- *    5   END      0      1    INF   6     1     EMPTY_LOOP
- *    6   D        0      1    1     7     -1
- *    7   FIN      0      1    1     -1    -1
- *
- *   ALT branches are checked independently for absorbability.
- *   Branch 1: A+ matches Case 1 -> A gets ABSORBABLE. B has no flag.
- *   Branch 2: C* matches Case 1 -> C gets ABSORBABLE.
- *   Both A and C get ABSORBABLE_BRANCH as part of their respective branch
- *   paths.
- *   END has EMPTY_LOOP: branch 2 (C*) is nullable, making the group body
- *   nullable.
- *   BEGIN and ALT get ABSORBABLE_BRANCH (on the path to absorbable elements).
- *
- * ============================================================================
- *   End of document
- * ============================================================================
+ * For the design and execution model of the NFA engine implemented
+ * in this file, see src/backend/executor/README.rpr.
  */
 
 /* Bitmap macros for NFA cycle detection (cf. bitmapset.c, tidbitmap.c) */
-- 
2.50.1 (Apple Git-155)