v28-0006-Row-pattern-recognition-patch-docs.patch

application/octet-stream

Filename: v28-0006-Row-pattern-recognition-patch-docs.patch
Type: application/octet-stream
Part: 5
Message: Re: Row pattern recognition

Patch

Same data as JSON: GET /api/v1/attachments/:id/patch the parsed metadata as JSON — format, series position, per-file stats; never the diff bytes. API reference →
Format: format-patch
Series: patch v28-0006
Subject: Row pattern recognition patch (docs).
File+
doc/src/sgml/advanced.sgml 81 0
doc/src/sgml/func.sgml 53 0
doc/src/sgml/ref/select.sgml 46 3
From a7266a6ca33c8fc2ed5bea53ba53d7c6a6e1d57e Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Sat, 11 Jan 2025 14:21:12 +0900
Subject: [PATCH v28 6/9] Row pattern recognition patch (docs).

---
 doc/src/sgml/advanced.sgml   | 81 ++++++++++++++++++++++++++++++++++++
 doc/src/sgml/func.sgml       | 53 +++++++++++++++++++++++
 doc/src/sgml/ref/select.sgml | 49 ++++++++++++++++++++--
 3 files changed, 180 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 755c9f14850..9da48e0adad 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -537,6 +537,87 @@ WHERE pos &lt; 3;
     <literal>rank</literal> less than 3.
    </para>
 
+   <para>
+    Row pattern common syntax can be used to perform row pattern recognition
+    in a query. The row pattern common syntax includes two sub
+    clauses: <literal>DEFINE</literal>
+    and <literal>PATTERN</literal>. <literal>DEFINE</literal> defines
+    definition variables along with an expression. The expression must be a
+    logical expression, which means it must
+    return <literal>TRUE</literal>, <literal>FALSE</literal>
+    or <literal>NULL</literal>. The expression may comprise column references
+    and functions. Window functions, aggregate functions and subqueries are
+    not allowed. An example of <literal>DEFINE</literal> is as follows.
+
+<programlisting>
+DEFINE
+ LOWPRICE AS price &lt;= 100,
+ UP AS price &gt; PREV(price),
+ DOWN AS price &lt; PREV(price)
+</programlisting>
+
+    Note that <function>PREV</function> returns the price column in the
+    previous row if it's called in a context of row pattern recognition. Thus
+    in the second line the definition variable "UP" is <literal>TRUE</literal>
+    when the price column in the current row is greater than the price column
+    in the previous row. Likewise, "DOWN" is <literal>TRUE</literal> when the
+    price column in the current row is lower than the price column in the
+    previous row.
+   </para>
+   <para>
+    Once <literal>DEFINE</literal> exists, <literal>PATTERN</literal> can be
+    used. <literal>PATTERN</literal> defines a sequence of rows that satisfies
+    conditions defined in the <literal>DEFINE</literal> clause.  For example
+    following <literal>PATTERN</literal> defines a sequence of rows starting
+    with the a row satisfying "LOWPRICE", then one or more rows satisfying
+    "UP" and finally one or more rows satisfying "DOWN". Note that "+" means
+    one or more matches. Also you can use "*", which means zero or more
+    matches. If a sequence of rows which satisfies the PATTERN is found, in
+    the starting row all columns or functions are shown in the target
+    list. Note that aggregations only look into the matched rows, rather than
+    the whole frame. On the second or subsequent rows all window functions are
+    shown as NULL. Aggregates are NULL or 0 depending on its aggregation
+    definition. A count() aggregate shows 0. For rows that do not match on the
+    PATTERN, columns are shown AS NULL too. Example of
+    a <literal>SELECT</literal> using the <literal>DEFINE</literal>
+    and <literal>PATTERN</literal> clause is as follows.
+
+<programlisting>
+SELECT company, tdate, price,
+ first_value(price) OVER w,
+ max(price) OVER w,
+ count(price) OVER w
+FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ INITIAL
+ PATTERN (LOWPRICE UP+ DOWN+)
+ DEFINE
+  LOWPRICE AS price &lt;= 100,
+  UP AS price &gt; PREV(price),
+  DOWN AS price &lt; PREV(price)
+);
+</programlisting>
+<screen>
+ company  |   tdate    | price | first_value | max | count 
+----------+------------+-------+-------------+-----+-------
+ company1 | 2023-07-01 |   100 |         100 | 200 |     4
+ company1 | 2023-07-02 |   200 |             |     |     0
+ company1 | 2023-07-03 |   150 |             |     |     0
+ company1 | 2023-07-04 |   140 |             |     |     0
+ company1 | 2023-07-05 |   150 |             |     |     0
+ company1 | 2023-07-06 |    90 |          90 | 130 |     4
+ company1 | 2023-07-07 |   110 |             |     |     0
+ company1 | 2023-07-08 |   130 |             |     |     0
+ company1 | 2023-07-09 |   120 |             |     |     0
+ company1 | 2023-07-10 |   130 |             |     |     0
+(10 rows)
+</screen>
+   </para>
+
    <para>
     When a query involves multiple window functions, it is possible to write
     out each one with a separate <literal>OVER</literal> clause, but this is
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 47370e581ae..ea7683ff00d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23377,6 +23377,59 @@ SELECT count(*) FROM sometable;
    Other frame specifications can be used to obtain other effects.
   </para>
 
+  <para>
+   Row pattern recognition navigation functions are listed in
+   <xref linkend="functions-rpr-navigation-table"/>.  These functions
+   can be used to describe DEFINE clause of Row pattern recognition.
+  </para>
+
+   <table id="functions-rpr-navigation-table">
+    <title>Row Pattern Navigation Functions</title>
+    <tgroup cols="1">
+     <thead>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para></entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>prev</primary>
+        </indexterm>
+        <function>prev</function> ( <parameter>value</parameter> <type>anyelement</type> )
+        <returnvalue>anyelement</returnvalue>
+       </para>
+       <para>
+        Returns the column value at the previous row;
+        returns NULL if there is no previous row in the window frame.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>next</primary>
+        </indexterm>
+        <function>next</function> ( <parameter>value</parameter> <type>anyelement</type> )
+        <returnvalue>anyelement</returnvalue>
+       </para>
+       <para>
+        Returns the column value at the next row;
+        returns NULL if there is no next row in the window frame.
+       </para></entry>
+      </row>
+
+     </tbody>
+    </tgroup>
+   </table>
+
   <note>
    <para>
     The SQL standard defines a <literal>RESPECT NULLS</literal> or
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index d7089eac0be..5833dc70b43 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -927,6 +927,7 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
 [ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
 [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ <replaceable class="parameter">frame_clause</replaceable> ]
+[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
 </synopsis>
    </para>
 
@@ -969,8 +970,8 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
     The <replaceable class="parameter">frame_clause</replaceable> can be one of
 
 <synopsis>
-{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
-{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
+{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ] [row_pattern_common_syntax]
+{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [ <replaceable>frame_exclusion</replaceable> ] [row_pattern_common_syntax]
 </synopsis>
 
     where <replaceable>frame_start</replaceable>
@@ -1077,9 +1078,51 @@ EXCLUDE NO OTHERS
     a given peer group will be in the frame or excluded from it.
    </para>
 
+   <para>
+    The
+    optional <replaceable class="parameter">row_pattern_common_syntax</replaceable>
+    defines the <firstterm>row pattern recognition condition</firstterm> for
+    this
+    window. <replaceable class="parameter">row_pattern_common_syntax</replaceable>
+    includes following subclauses.
+
+<synopsis>
+[ AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW ]
+PATTERN <replaceable class="parameter">pattern_variable_name</replaceable>[+] [, ...]
+DEFINE <replaceable class="parameter">definition_varible_name</replaceable> AS <replaceable class="parameter">expression</replaceable> [, ...]
+</synopsis>
+    <literal>AFTER MATCH SKIP PAST LAST ROW</literal> or <literal>AFTER MATCH
+    SKIP TO NEXT ROW</literal> controls how to proceed to next row position
+    after a match found. With <literal>AFTER MATCH SKIP PAST LAST
+    ROW</literal> (the default) next row position is next to the last row of
+    previous match. On the other hand, with <literal>AFTER MATCH SKIP TO NEXT
+    ROW</literal> next row position is always next to the last row of previous
+    match. <literal>DEFINE</literal> defines definition variables along with a
+    boolean expression. <literal>PATTERN</literal> defines a sequence of rows
+    that satisfies certain conditions using variables defined
+    in <literal>DEFINE</literal> clause. If the variable is not defined in
+    the <literal>DEFINE</literal> clause, it is implicitly assumed following
+    is defined in the <literal>DEFINE</literal> clause.
+
+<synopsis>
+<literal>variable_name</literal> AS TRUE
+</synopsis>
+
+    Note that the maximu number of variables defined
+    in <literal>DEFINE</literal> clause is 26.
+   </para>
+
+   <para>
+    The SQL standard defines more subclauses: <literal>MEASURES</literal>
+    and <literal>SUBSET</literal>. They are not currently supported
+    in <productname>PostgreSQL</productname>. Also in the standard there are
+    more variations in <literal>AFTER MATCH</literal> clause.
+   </para>
+
    <para>
     The purpose of a <literal>WINDOW</literal> clause is to specify the
-    behavior of <firstterm>window functions</firstterm> appearing in the query's
+    behavior of <firstterm>window functions</firstterm> appearing in the
+    query's
     <link linkend="sql-select-list"><command>SELECT</command> list</link> or
     <link linkend="sql-orderby"><literal>ORDER BY</literal></link> clause.
     These functions
-- 
2.25.1