Re: Row pattern recognition
Chao Li <li.evan.chao@gmail.com>
From: Chao Li <li.evan.chao@gmail.com>
To: Tatsuo Ishii <ishii@postgresql.org>
Cc: vik@postgresfriends.org,
david.g.johnston@gmail.com,
jacob.champion@enterprisedb.com,
er@xs4all.nl,
peter@eisentraut.org,
pgsql-hackers@postgresql.org
Date: 2025-11-27T06:56:02Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Add temporal FOREIGN KEY contraints
- 89f908a6d0ac 18.0 cited
-
Remove obsolete executor cleanup code
- d060e921ea5a 17.0 cited
> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii@postgresql.org> wrote:
>
> Hi Chao,
>
> Any comment on this?
>
>>> 13 - 0005 - nodeWindowAgg.c
>>> ```
>>> static
>>> int
>>> do_pattern_match(char *pattern, char *encoded_str, int len)
>>> {
>>> static regex_t *regcache = NULL;
>>> static regex_t preg;
>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>>> ```
>>>
>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I think usually per query data are stored in state structures.
>>
>> Yeah, such a usage maybe rare. But I hesitate to store the data
>> (regex_t) in WindowAggState because:
>>
>> (1) it bloats WindowAggState which we want to avoid if possible
>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that even if I let this pass, other reviewers might have the same concern.
Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only allocate memory when a query needs to.
>>
>>> 14 - 0005 - nodeWindowAgg.c
>>> ```
>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>>> +
>>> + if (regcache != NULL)
>>> + pg_regfree(regcache); /* free previous re */
>>> +
>>> + /* we need to convert to char to pg_wchar */
>>> + plen = strlen(pattern);
>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>>> + /* compile re */
>>> + sts = pg_regcomp(&preg, /* compiled re */
>>> + data, /* target pattern */
>>> + data_len, /* length of pattern */
>>> + cflags, /* compile option */
>>> + C_COLLATION_OID /* collation */
>>> + );
>>> + pfree(data);
>>> +
>>> + MemoryContextSwitchTo(oldContext);
>>> ```
>>>
>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the comment of pg_regcomp:
>>> ```
>>> /*
>>> * pg_regcomp - compile regular expression
>>> *
>>> * Note: on failure, no resources remain allocated, so pg_regfree()
>>> * need not be applied to re.
>>> */
>>> int
>>> pg_regcomp(regex_t *re,
>>> const chr *string,
>>> size_t len,
>>> int flags,
>>> Oid collation)
>>> ```
>>>
>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory leak.
>>
>> I admit current patch leaves the memory unfreed even after a query
>> finishes. What about adding something like:
>>
>> static void do_pattern_match_end(void)
>> {
>> if (regcache != NULL)
>> pg_regfree(regcache);
>> }
>>
>> and let ExecEndWindowAgg() call it?
>
I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext here.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/