Thread
-
Re: Row pattern recognition
jian he <jian.universality@gmail.com> — 2026-05-28T13:23:59Z
Hi. In src/include/nodes/execnodes.h, we're adding quite a few fields to WindowAggState that are only used for RPR queries. Should we consolidate these fields behind a single pointer (named RPRContext) to keep the WindowAggState size smaller for non-RPR queries? In function get_reduced_frame_status, I made the following changes: `````` bool conditionA; bool conditionB; bool conditionC; bool conditionD; if (!winstate->rpr_match_valid) return RF_NOT_DETERMINED; conditionA = (pos == start && winstate->rpr_match_matched && length == 0); conditionB = (pos < start || pos >= start + length); conditionC = !winstate->rpr_match_matched; conditionD = (pos == start); if ((conditionA && conditionB) || (conditionA && conditionC) || (conditionA && conditionD)) { if (conditionA) elog(INFO, "conditionA is true"); if (conditionB) elog(INFO, "conditionB is true"); if (conditionC) elog(INFO, "conditionC is true"); if (conditionD) elog(INFO, "conditionD is true"); } if ((conditionB && conditionC) || (conditionB && conditionD) || (conditionC && conditionD)) { elog(INFO, "more than 2 branch is true, should not be reached"); if (!(conditionC && conditionD)) elog(INFO, "not (conditionCD is true)"); if ((conditionB && conditionD)) elog(INFO, "conditionB and conditionD is true)"); } `````` I re-ran the regression tests and noticed it's entirely possible for multiple conditions to be true at once. Because of this, all the IF blocks in get_reduced_frame_status (after the initial one) are order-dependent. If I move one IF statement above another, get_reduced_frame_status may return a different result. Is this the expected behavior? It seems like a potential logic flaw. ------------------------------------------------------------------------------------ VIII-3. Absorption Conditions Planner-time prerequisites (all must hold for absorption to be enabled): (a) SKIP PAST LAST ROW. SKIP TO NEXT ROW creates overlapping contexts that cannot be safely absorbed. (b) Unbounded frame (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING). Limited frames apply differently to each context, breaking the monotonicity principle. (c) No match_start-dependent navigation in DEFINE. Mechanism: each context has a different matchStartRow, so FIRST resolves to a different row for each context at the same currentpos. An earlier context's DEFINE result no longer subsumes a later one's, making count-dominance comparison invalid. Rather than comparing matchStartRow at runtime (which would complicate the absorb path), any match_start dependency disables absorption entirely. Navigation content match_start dep. absorption ------------------------------------------------------------ No navigation none safe PREV/NEXT only none safe LAST (no offset) none safe LAST (with offset) boundary check unsafe FIRST (any) direct unsafe Compound (inner FIRST) direct unsafe Compound (inner LAST, no off.) none safe Compound (inner LAST, w/off.) boundary chk unsafe ------------------------------------------------------------------------------------ "boundary chk" should be "boundary check". column "match_start dep.", I don't understand the meaning of "direct" and "boundary check". "count-dominance" seems like an invented word, but I could not find the explanation. "match_start-dependent" occurred many times, it would also deserve an explanation. We can also rename it as match_start_dependent. ---------------------------------------------------- Attached is a minor refactoring of ExecRPRProcessRow, no need for boolean hasLimitedFrame. ``if (hasLimitedFrame && winstate->endOffsetValue != 0)`` seems wrong to me, endOffsetValue is a Datum, and you directly compare it with 0. see also calculate_frame_offsets. ---------------------------------------------------- ExecRPRProcessRow, nfa_evaluate_row ``` if (!window_gettupleslot(winobj, pos, slot)) return false; /* No row exists */ ``` indicates that it's not possible for (currentPos > ctxFrameEnd). therefore ````` if (currentPos >= ctxFrameEnd) { /* Frame boundary exceeded: force mismatch */ nfa_match(winstate, ctx, NULL); continue; } ````` should change to ```if (currentPos == ctxFrameEnd)```. With that, the comment wording "exceeded" seems wrong too. The comment 'Frame boundary exceeded' also needs updating since it's no longer exceeding the boundary. ---------------------------------------------------- Since I couldn't generate a coverage report, I am using elog(INFO) to test reachability. Rerun the regress tests, you will find out that the ELSE IF branch below is not reached by current regress tests. `````` else if (targetCtx->states == NULL) { /* Context already completed - skip to result registration */ goto register_result; elog(INFO, "XXXX reached"); } `````` -- jian https://www.enterprisedb.com/