Thread
-
Re: Row pattern recognition
Tatsuo Ishii <ishii@postgresql.org> — 2025-12-01T12:21:38Z
>> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii@postgresql.org> wrote: >> >> Hi Chao, >> >> Any comment on this? >> >>>> 13 - 0005 - nodeWindowAgg.c >>>> ``` >>>> static >>>> int >>>> do_pattern_match(char *pattern, char *encoded_str, int len) >>>> { >>>> static regex_t *regcache = NULL; >>>> static regex_t preg; >>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */ >>>> ``` >>>> >>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I think usually per query data are stored in state structures. >>> >>> Yeah, such a usage maybe rare. But I hesitate to store the data >>> (regex_t) in WindowAggState because: >>> >>> (1) it bloats WindowAggState which we want to avoid if possible >>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure) > > With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that even if I let this pass, other reviewers might have the same concern. > > Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only allocate memory when a query needs to. > >>> >>>> 14 - 0005 - nodeWindowAgg.c >>>> ``` >>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext); >>>> + >>>> + if (regcache != NULL) >>>> + pg_regfree(regcache); /* free previous re */ >>>> + >>>> + /* we need to convert to char to pg_wchar */ >>>> + plen = strlen(pattern); >>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar)); >>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen); >>>> + /* compile re */ >>>> + sts = pg_regcomp(&preg, /* compiled re */ >>>> + data, /* target pattern */ >>>> + data_len, /* length of pattern */ >>>> + cflags, /* compile option */ >>>> + C_COLLATION_OID /* collation */ >>>> + ); >>>> + pfree(data); >>>> + >>>> + MemoryContextSwitchTo(oldContext); >>>> ``` >>>> >>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the comment of pg_regcomp: >>>> ``` >>>> /* >>>> * pg_regcomp - compile regular expression >>>> * >>>> * Note: on failure, no resources remain allocated, so pg_regfree() >>>> * need not be applied to re. >>>> */ >>>> int >>>> pg_regcomp(regex_t *re, >>>> const chr *string, >>>> size_t len, >>>> int flags, >>>> Oid collation) >>>> ``` >>>> >>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory leak. >>> >>> I admit current patch leaves the memory unfreed even after a query >>> finishes. What about adding something like: >>> >>> static void do_pattern_match_end(void) >>> { >>> if (regcache != NULL) >>> pg_regfree(regcache); >>> } >>> >>> and let ExecEndWindowAgg() call it? >> > > I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext here. I decided to add new fields to WindowAggState: /* regular expression compiled result cache. Used for RPR. */ char *patbuf; /* pattern to be compiled */ regex_t preg; /* compiled re pattern */ Then allocate the memory for them using winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory; Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp