Thread

  1. Re: Row pattern recognition

    Henson Choi <assam258@gmail.com> — 2026-05-30T14:08:38Z

    Hi all,
    
    Resolved since the last post:
    
      D1. Single-row frame conformance, Subclause 6.10.2 -- Tatsuo's call [6]
          was to reject. Both ROWS BETWEEN CURRENT ROW AND CURRENT ROW and
          AND 0 FOLLOWING are now rejected (nocfbot-0025), which in turn
          unblocks the held ExecRPRProcessRow change (nocfbot-0026).
    
    Open decisions (repeated with context at the end):
    
      D2. The RPRContext consolidation -- Tatsuo's call as co-author;
          non-blocking either way [4].
      D3. The AST "absorption" rename -- Tatsuo's call [2].
    
      [1] Jian's review, round 1 (2026-05-26):
    
    https://postgr.es/m/CACJufxH-DZePhbdJM=8nNYceQiSbbXXLTw54iLhxiynQ+4hbBA@mail.gmail.com
      [2] my round-1 reply -- AST "absorption" rename deferred to Tatsuo (D3,
    2026-05-27):
    
    https://postgr.es/m/CAAAe_zDephfiDA_A3FN0hCymJRogEr=Rt3QoCTf4qMYDLk+xNA@mail.gmail.com
      [3] Jian's review, round 2 (2026-05-28):
    
    https://postgr.es/m/CACJufxGX17thWuEOq1tM5xbRHz2HXm1asooZC3GV25MYGmYqLQ@mail.gmail.com
      [4] my round-2 reply -- RPRContext deferred to Tatsuo (D2, 2026-05-29):
    
    https://postgr.es/m/CAAAe_zAH0MvP0TBmW3PTLeHjEpiyBz0473zJRM9pwLpseefMNw@mail.gmail.com
      [5] single-row frame conformance question (D1, 2026-05-29):
    
    https://postgr.es/m/CAAAe_zCbSU=dd-4qTL2QaBQwQ-cf51N_851a9Y5rOoz0wj0aXw@mail.gmail.com
      [6] Tatsuo's reply -- reject single-row frames (D1 resolved, 2026-05-30):
    
    https://postgr.es/m/20260530.114737.1416684464524168377.ishii@postgresql.org
      [7] Jian's review, round 3 (2026-05-30):
    
    https://postgr.es/m/CACJufxEsaU8GQ4yeXTWhAO8VjbrZTh5CpvUqz=4a3T0Cwz44pA@mail.gmail.com
    
    Attached: the v47 feature series (v47-0001..0009) rebased onto current
    master, plus the incremental patch series carried on top of it.
    
    Base:
    
      9a41b34a287  2026-05-26  doc:  add comma to UPDATE docs, for consistency
    
    Unchanged -- rebase only (already posted; only rebased, no content change).
    Titles for reference:
    
      nocfbot-0001  Add DEFINE non-volatile baseline to rpr_integration B9
      nocfbot-0002  Unify RPR DEFINE walkers and reject volatile callees
      nocfbot-0003  Cover RPR empty-match path with EXPLAIN tests; fix stale
    XXX comments
      nocfbot-0004  Reclassify DEFINE qualifier check and reword diagnostic to
    "expression"
      nocfbot-0005  Sync stale comments on DEFINE/PATTERN handling
      nocfbot-0006  Add trailing commas to RPR enum definitions
      nocfbot-0007  Remove optional outer parentheses from ereport() calls in
    RPR files
      nocfbot-0008  Add high-water mark tracking to NFA visited bitmap reset
      nocfbot-0009  Document DEFINE subquery rejection as intentional
    over-rejection
      nocfbot-0010  Remove duplicate #include in nodeWindowAgg.c
      nocfbot-0011  Normalize SQL/RPR standard references
      nocfbot-0012  Add rpr_integration B7 cases for RPR in recursive query
      nocfbot-0013  Reject row pattern recognition in recursive queries
      nocfbot-0014  Enhance README.rpr per Tatsuo Ishii's review
      nocfbot-0015  Round out README.rpr WindowAggState field coverage
      nocfbot-0016  Add raw_expression_tree_walker coverage for RPR raw nodes
    
      (nocfbot-0016 was sent earlier as 0015; renumbered here so the review
      series runs contiguously from 0017.)
    
    New incremental patches -- nocfbot-0017 onward. These apply Jian He's
    review (rounds 1 [1] and 2 [3]) and settle D1. nocfbot-0017..0024 and
    0026 are comment / doc / test plus a signature change and a couple of
    Assert additions -- no behavior change. nocfbot-0025 is the one
    user-visible change: it rejects the single-row frame per D1 [6].
    
      nocfbot-0017  Enhance README.rpr
          Chapter VIII absorption intro + a worked PATTERN (A+) trace;
          "Depth-First Search" spelled out at first use; the stale
          nfa_advance(initialAdvance=...) reference replaced.
    
      nocfbot-0018  Clarify execRPR.c comments and tighten an Assert
          Document the NFA invariants (compareDepth slot arithmetic, the
          asymmetric visited-marking scheme, the greedy/non-nullable BEGIN
          label, the ALT depth break, the state->next reset boundary),
          reframe ExecRPRFinalizeAllContexts as the partition-end policy
          holder, and add a defensive Assert in nfa_advance_var.
    
      nocfbot-0019  nfa_add_state_unique: bool return -> void
          The return value was unused.
    
      nocfbot-0020  Reluctant bounded mid-band test (rpr_nfa)
          A{3,5}? B, which drives the VAR-level count in nfa_advance_var
          through 3..5.
    
      nocfbot-0021  Define RPR absorption terminology in README.rpr
          Terminology block for the "match_start dep." column (none /
          direct / boundary check), the "boundary chk" typo fix, the
          count-dominance definition, and the match_start_dependent rename.
    
      nocfbot-0022  Document the get_reduced_frame_status cascade invariant
          The branches form an order-dependent early-return cascade; the
          running invariant is spelled out so the order reads as intentional.
    
      nocfbot-0023  Explain the completed-head-context branch in
                    update_reduced_frame
          Why a head context can already be complete under SKIP TO NEXT ROW.
    
      nocfbot-0024  Tighten the frame-boundary check from >= to ==
          The > case is unreachable by the loop invariant; the defense moves
          into an Assert(currentPos <= ctxFrameEnd), and the comment changes
          from "exceeded" to "reached".
    
      nocfbot-0025  Reject single-row window frame in row pattern recognition
          Per D1 [6], the frame end must be UNBOUNDED FOLLOWING or a positive
          offset FOLLOWING. CURRENT ROW is rejected in transformRPR() at parse
          time; a zero offset -- which need not be a constant -- in
          calculate_frame_offsets() at run time. The two single-row rpr_base
          tests become error cases, with a bind-parameter error test added.
    
      nocfbot-0026  Remove the redundant zero check on the RPR frame ending
                    offset
          With the single-row frame now rejected, a limited frame always
          carries a real offset, so the "endOffsetValue != 0" guard -- which
          compared a Datum directly to zero -- is dropped, leaving a plain
          DatumGetInt64(). This is the offset half of Jian's ExecRPRProcessRow
          cleanup; the structural half (dropping the hasLimitedFrame
          parameter) I've left as-is -- it is loop-invariant and computed once
          outside the per-row loop, so moving it into ExecRPRProcessRow would
          just repeat the work each row.
    
    Coverage -- my decision summaries [2], [4], with the patch each became:
    
      Round 1 [2]
        Short-circuit optimization       Separate series   -> separate series
        Absorption README narrative      Accept            -> nocfbot-0017
        AST-level "absorption" rename    Pending Tatsuo    -> D3
        DFS expansion                    Accept            -> nocfbot-0017
        initialAdvance README mismatch   Accept            -> nocfbot-0017
        Defensive Assert in advance_var  Accept            -> nocfbot-0018
        Finalize unnecessary?            Keep              -> nocfbot-0018
        Greedy comment label             Accept            -> nocfbot-0018
        state->next reset                Decline           -> nocfbot-0018
        count >= 3 test coverage         Accept            -> nocfbot-0020
        visited marking purpose          Accept            -> nocfbot-0018
        compareDepth comment             Accept            -> nocfbot-0018
        Unused bool return               Accept            -> nocfbot-0019
        ALT depth invariant Assert       Decline           -> nocfbot-0018
    
      Round 2 [4]
        RPRContext consolidation         Tatsuo's call     -> D2
        get_reduced_frame_status order   Keep              -> nocfbot-0022
        README terminology + typo        Accept            -> nocfbot-0021
        ExecRPRProcessRow refactor       Datum fix only    -> nocfbot-0026
        single-row frame (6.10.2, D1)    Reject (Tatsuo)   -> nocfbot-0025
        currentPos >= -> ==              Accept            -> nocfbot-0024
        states == NULL branch coverage   Already reached   -> nocfbot-0023
    
    Remaining work and decisions
    
      Decisions (need your input):
        D2  RPRContext consolidation -- Tatsuo.
        D3  AST "absorption" rename -- Tatsuo.
    
      Held pending a decision:
        - the RPRContext consolidation itself -- done if D2 says go.
        - the AST "absorption" rename itself -- done if D3 says go.
    
      (D1 settled: the parse-time frame-end check, the offset-0 test as an
       error case, and the Datum-comparison fix are now in nocfbot-0025/0026.)
    
      From Jian's round-3 review [7] (into the next revision):
        - four comment fixes: the line-number test anchors, the "at the END"
          and uppercase-END wording, and the nfa_advance_var count premise.
        - the make/clone rename of the NFA helpers (one of three attached
          refactors); the other two -- dropping nfaStateSize and the elem
          parameter -- I'd keep, both on hot-path grounds.
    
      From our in-house review (separate follow-ups):
        - a quantifier-normalization correctness fix (nested unbounded
          quantifiers such as (A{2,})* are currently mis-normalized)
        - a per-tuple memory-context fix in DEFINE evaluation (still verifying)
        - smaller correctness/conformance fixes (an overflow guard, a few
          missing parse-time checks, EXPLAIN output details)
        - documentation gaps (comments, README, SGML)
        - added regression tests (round-trip deparse, edge-case offsets)
    
      Before the v48 fold (from Jian's off-list comments):
        - INT_MAX -> PG_INT32_MAX (the unbounded-quantifier sentinel; ~24 sites)
        - foreach + lfirst() -> foreach_node (~33 sites)
        - foreach_current_index, dropping the redundant break (3 sites)
    
      Work, no decision needed:
        - short-circuit (lazy eval) -- stop evaluating a DEFINE predicate
          once its outcome is fixed. A separate series.
          It turns on a standard-interpretation point -- whether skipping is
          sound when a dropped subexpression has side effects.
    
    Thanks again to Jian for the careful reading.
    
    Henson