Re: Row pattern recognition

Tatsuo Ishii <ishii@postgresql.org>

From: Tatsuo Ishii <ishii@postgresql.org>
To: li.evan.chao@gmail.com
Cc: vik@postgresfriends.org, david.g.johnston@gmail.com, jacob.champion@enterprisedb.com, er@xs4all.nl, peter@eisentraut.org, pgsql-hackers@postgresql.org
Date: 2025-11-24T04:57:25Z
Lists: pgsql-hackers
Hi Chao,

Thank you for the review!

>> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
>> 
>> 
>> I’d stop here, and continue 0005 tomorrow.
>> 
> 
> I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a valuable learning process for me.
> 
> 12 - 0005 - nodeWindowAgg.c
> ```
> +	if (string_set_get_size(str_set) == 0)
> +	{
> +		/* no match found in the first row */
> +		register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
> +		destroyStringInfo(encoded_str);
> +		return;
> +	}
> ```
> 
> encoded_str should also be destroyed if not entering the “if” clause.

Subsequent string_set_discard() will free encode_str since encoded_str
is part of str_set. So no need to call destroyStringInfo(encoded_str)
in not entering "if" clause.

	string_set_discard(str_set);

So I think this is Ok.

> 13 - 0005 - nodeWindowAgg.c
> ```
> static
> int
> do_pattern_match(char *pattern, char *encoded_str, int len)
> {
> static regex_t *regcache = NULL;
> static regex_t preg;
> static char patbuf[1024]; /* most recent 'pattern' is cached here */
> ```
> 
> Using static variable in executor seems something I have never seen, which may persistent data across queries. I think usually per query data are stored in state structures.

Yeah, such a usage maybe rare. But I hesitate to store the data
(regex_t) in WindowAggState because:

(1) it bloats WindowAggState which we want to avoid if possible
(2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)

> 14 - 0005 - nodeWindowAgg.c
> ```
> +		MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
> +
> +		if (regcache != NULL)
> +			pg_regfree(regcache);	/* free previous re */
> +
> +		/* we need to convert to char to pg_wchar */
> +		plen = strlen(pattern);
> +		data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
> +		data_len = pg_mb2wchar_with_len(pattern, data, plen);
> +		/* compile re */
> +		sts = pg_regcomp(&preg, /* compiled re */
> +						 data,	/* target pattern */
> +						 data_len,	/* length of pattern */
> +						 cflags,	/* compile option */
> +						 C_COLLATION_OID	/* collation */
> +			);
> +		pfree(data);
> +
> +		MemoryContextSwitchTo(oldContext);
> ```
> 
> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the comment of pg_regcomp:
> ```
> /*
> * pg_regcomp - compile regular expression
> *
> * Note: on failure, no resources remain allocated, so pg_regfree()
> * need not be applied to re.
> */
> int
> pg_regcomp(regex_t *re,
> const chr *string,
> size_t len,
> int flags,
> Oid collation)
> ```
> 
> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory leak.

I admit current patch leaves the memory unfreed even after a query
finishes. What about adding something like:

static void do_pattern_match_end(void)
{
	if (regcache != NULL)
		pg_regfree(regcache);
}

and let ExecEndWindowAgg() call it?

> Okay, I’d stop here and continue to review 0006 next week.
> 
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
> 
> 
> 
>