Re: Row pattern recognition

Tatsuo Ishii <ishii@sraoss.co.jp>

From: Tatsuo Ishii <ishii@sraoss.co.jp>
To: jchampion@timescale.com
Cc: er@xs4all.nl, vik@postgresfriends.org, pgsql-hackers@postgresql.org
Date: 2023-09-12T06:18:43Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. Add temporal FOREIGN KEY contraints

  2. Remove obsolete executor cleanup code

Attachments

>> What about leaving this (reevaluation) for now? Because:
>>
>> 1) we don't have CLASSIFIER
>> 2) we don't allow to give CLASSIFIER to PREV as its arggument
>>
>> so I think we don't need to worry about this for now.
> 
> Sure. I'm all for deferring features to make it easier to iterate; I
> just want to make sure the architecture doesn't hit a dead end. Or at
> least, not without being aware of it.

Ok, let's defer this issue. Currently the patch already exceeds 3k
lines. I am afraid too big patch cannot be reviewed by anyone, which
means it will never be committed.

> Also: is CLASSIFIER the only way to run into this issue?

Good question. I would like to know.

>> What if we don't follow the standard, instead we follow POSIX EREs?  I
>> think this is better for users unless RPR's REs has significant merit
>> for users.
> 
> Piggybacking off of what Vik wrote upthread, I think we would not be
> doing ourselves any favors by introducing a non-compliant
> implementation that performs worse than a traditional NFA. Those would
> be some awful bug reports.

What I am not sure about is, you and Vik mentioned that the
traditional NFA is superior that POSIX NFA in terms of performance.
But how "lexicographic ordering" is related to performance?

>> I am not sure if we need to worry about this because of the reason I
>> mentioned above.
> 
> Even if we adopted POSIX NFA semantics, we'd still have to implement
> our own parser for the PATTERN part of the query. I don't think
> there's a good way for us to reuse the parser in src/backend/regex.

Ok.

>> > Does that seem like a workable approach? (Worst-case, my code is just
>> > horrible, and we throw it in the trash.)
>>
>> Yes, it seems workable. I think for the first cut of RPR needs at
>> least the +quantifier with reasonable performance. The current naive
>> implementation seems to have issue because of exhaustive search.
> 
> +1

BTW, attched is the v6 patch. The differences from v5 include:

- Now aggregates can be used with RPR. Below is an example from the
  regression test cases, which is added by v6 patch.

- Fix assersion error pointed out by Erik.

SELECT company, tdate, price,
 first_value(price) OVER w,
 last_value(price) OVER w,
 max(price) OVER w,
 min(price) OVER w,
 sum(price) OVER w,
 avg(price) OVER w,
 count(price) OVER w
FROM stock
WINDOW w AS (
PARTITION BY company
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
AFTER MATCH SKIP PAST LAST ROW
INITIAL
PATTERN (START UP+ DOWN+)
DEFINE
START AS TRUE,
UP AS price > PREV(price),
DOWN AS price < PREV(price)
);
 company  |   tdate    | price | first_value | last_value | max  | min | sum  |          avg          | count 
----------+------------+-------+-------------+------------+------+-----+------+-----------------------+-------
 company1 | 07-01-2023 |   100 |         100 |        140 |  200 | 100 |  590 |  147.5000000000000000 |     4
 company1 | 07-02-2023 |   200 |             |            |      |     |      |                       |      
 company1 | 07-03-2023 |   150 |             |            |      |     |      |                       |      
 company1 | 07-04-2023 |   140 |             |            |      |     |      |                       |      
 company1 | 07-05-2023 |   150 |             |            |      |     |      |                       |      
 company1 | 07-06-2023 |    90 |          90 |        120 |  130 |  90 |  450 |  112.5000000000000000 |     4
 company1 | 07-07-2023 |   110 |             |            |      |     |      |                       |      
 company1 | 07-08-2023 |   130 |             |            |      |     |      |                       |      
 company1 | 07-09-2023 |   120 |             |            |      |     |      |                       |      
 company1 | 07-10-2023 |   130 |             |            |      |     |      |                       |      
 company2 | 07-01-2023 |    50 |          50 |       1400 | 2000 |  50 | 4950 | 1237.5000000000000000 |     4
 company2 | 07-02-2023 |  2000 |             |            |      |     |      |                       |      
 company2 | 07-03-2023 |  1500 |             |            |      |     |      |                       |      
 company2 | 07-04-2023 |  1400 |             |            |      |     |      |                       |      
 company2 | 07-05-2023 |  1500 |             |            |      |     |      |                       |      
 company2 | 07-06-2023 |    60 |          60 |       1200 | 1300 |  60 | 3660 |  915.0000000000000000 |     4
 company2 | 07-07-2023 |  1100 |             |            |      |     |      |                       |      
 company2 | 07-08-2023 |  1300 |             |            |      |     |      |                       |      
 company2 | 07-09-2023 |  1200 |             |            |      |     |      |                       |      
 company2 | 07-10-2023 |  1300 |             |            |      |     |      |                       |      
(20 rows)

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp