nocfbot-0020-Update-RPR-code-comments-for-1-slot-navigation.txt
text/plain
Filename: nocfbot-0020-Update-RPR-code-comments-for-1-slot-navigation.txt
Type: text/plain
Part: 19
Message:
Re: Row pattern recognition
From fea456180111bfdb10d4dc580fc4062fe0ae6b15 Mon Sep 17 00:00:00 2001
From: Henson Choi <assam258@gmail.com>
Date: Thu, 2 Apr 2026 16:06:06 +0900
Subject: [PATCH 20/40] Update RPR code comments to reflect 1-slot navigation
model
---
src/backend/executor/execRPR.c | 45 ++++++++++++++++++++++------------
src/backend/parser/parse_rpr.c | 3 ++-
2 files changed, 32 insertions(+), 16 deletions(-)
diff --git a/src/backend/executor/execRPR.c b/src/backend/executor/execRPR.c
index 4c429528b04..5428d0e8fc4 100644
--- a/src/backend/executor/execRPR.c
+++ b/src/backend/executor/execRPR.c
@@ -191,10 +191,15 @@
* transformDefineClause() processes each DEFINE variable as follows:
*
* (1) Checks for duplicate variable names
- * (2) Transforms the expression into a standard SQL expression
- * (3) Coerces to Boolean type (coerce_to_boolean)
+ * (2) Transforms the expression via transformExpr()
+ * (3) Extracts Var nodes via pull_var_clause() and ensures each is
+ * present in the query targetlist, so the planner propagates the
+ * referenced columns through the plan tree
* (4) Wraps in a TargetEntry with the variable name set in resname
*
+ * After all variables are processed:
+ * (5) Coerces each expression to Boolean type (coerce_to_boolean)
+ *
* Variables that are used in PATTERN but not defined in DEFINE are implicitly
* evaluated as TRUE (matching all rows).
*
@@ -431,8 +436,9 @@
*
* Case 1: Simple VAR+ (e.g., A+)
* -> ABSORBABLE | ABSORBABLE_BRANCH set on the VAR
- * Case 2: GROUP+ whose body consists only of {1,1} VARs (e.g., (A B)+)
- * -> ABSORBABLE_BRANCH on children,
+ * Case 2: GROUP+ with fixed-length children (min == max, recursively)
+ * e.g., (A B)+, (A B{2})+, ((A (B C){2}){2})+
+ * -> ABSORBABLE_BRANCH on all elements within the group,
* ABSORBABLE | ABSORBABLE_BRANCH on END
* Case 3: GROUP+ whose body starts with VAR+ (e.g., (A+ B)+)
* -> Recurses from BEGIN into the body, applying Case 1.
@@ -604,11 +610,17 @@
* varMatched[i] = (not null and true)
*
* To support row navigation operators such as PREV() and NEXT(),
- * the previous row, current row, and next row are set in separate slots:
+ * a 1-slot model is used: only ecxt_outertuple is set to the current
+ * row. PREV/NEXT navigation is handled by EEOP_RPR_NAV_SET/RESTORE
+ * opcodes emitted during DEFINE expression compilation:
+ *
+ * NAV_SET: save ecxt_outertuple, swap in target row via nav_slot
+ * (evaluate): argument expression reads from swapped slot
+ * NAV_RESTORE: restore original ecxt_outertuple
*
- * ecxt_scantuple = previous row (for PREV reference)
- * ecxt_outertuple = current row (default reference)
- * ecxt_innertuple = next row (for NEXT reference)
+ * nav_slot caches the last fetched position (nav_slot_pos) to avoid
+ * redundant tuplestore lookups when multiple PREV/NEXT calls target
+ * the same row.
*
* The varMatched array is referenced later in Phase 1 (Match).
*
@@ -908,8 +920,11 @@
*
* (2) At runtime: initialize the nfaVisitedElems bitmap immediately before
* DFS expansion of each state within advance (once per state).
- * During DFS, set the corresponding elemIdx bit when visiting each
- * element.
+ * During DFS, epsilon elements (END, ALT, BEGIN) are marked in the
+ * bitmap at nfa_advance_state entry. VAR elements are marked later
+ * when added to the state list (nfa_add_state_unique), so that
+ * legitimate loop-back to the same VAR in a new group iteration
+ * (e.g., END -> ALT -> same VAR) is not blocked.
* If a previously visited elemIdx is revisited, that path is terminated.
*
* Note: the bitmap tracks only elemIdx and does not consider counts.
@@ -1216,11 +1231,11 @@
* (3) State Deduplication (IX-5)
*
* During advance, DFS may generate states with the same (elemIdx,
- * counts) combination through multiple paths. Additionally, unlike
- * VAR repetition, group repetition cannot perform absorption
- * comparison using VAR states, so inline advance is performed from
- * after Phase 1 match through to END; this process can also produce
- * duplicate states reaching the same END.
+ * counts) combination through multiple paths. Additionally, for
+ * group absorption, nfa_match performs inline advance from bounded
+ * VARs (count >= max) within the absorbable region (ABSORBABLE_BRANCH)
+ * through END chains to reach the judgment point (ABSORBABLE END).
+ * This process can also produce duplicate states reaching the same END.
* nfa_add_state_unique() blocks duplicate addition of identical states
* in both cases.
*
diff --git a/src/backend/parser/parse_rpr.c b/src/backend/parser/parse_rpr.c
index 3fb5d94abe9..d1e02e52e53 100644
--- a/src/backend/parser/parse_rpr.c
+++ b/src/backend/parser/parse_rpr.c
@@ -265,7 +265,8 @@ validateRPRPatternVarCount(ParseState *pstate, RPRPatternNode *node,
*
* Then for each DEFINE variable:
* 2. Checks for duplicate variable names in DEFINE clause
- * 3. Transforms expressions and adds to targetlist via findTargetlistEntrySQL99
+ * 3. Transforms expression via transformExpr() and ensures referenced
+ * Var nodes are present in the query targetlist (via pull_var_clause)
* 4. Creates defineClause entry with proper resname (pattern variable name)
* 5. Coerces expressions to boolean type
* 6. Marks column origins and assigns collation information
--
2.50.1 (Apple Git-155)