Thread

  1. Re: Row pattern recognition

    Tatsuo Ishii <ishii@postgresql.org> — 2025-11-24T04:57:25Z

    Hi Chao,
    
    Thank you for the review!
    
    >> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
    >> 
    >> 
    >> I’d stop here, and continue 0005 tomorrow.
    >> 
    > 
    > I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a valuable learning process for me.
    > 
    > 12 - 0005 - nodeWindowAgg.c
    > ```
    > +	if (string_set_get_size(str_set) == 0)
    > +	{
    > +		/* no match found in the first row */
    > +		register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
    > +		destroyStringInfo(encoded_str);
    > +		return;
    > +	}
    > ```
    > 
    > encoded_str should also be destroyed if not entering the “if” clause.
    
    Subsequent string_set_discard() will free encode_str since encoded_str
    is part of str_set. So no need to call destroyStringInfo(encoded_str)
    in not entering "if" clause.
    
    	string_set_discard(str_set);
    
    So I think this is Ok.
    
    > 13 - 0005 - nodeWindowAgg.c
    > ```
    > static
    > int
    > do_pattern_match(char *pattern, char *encoded_str, int len)
    > {
    > static regex_t *regcache = NULL;
    > static regex_t preg;
    > static char patbuf[1024]; /* most recent 'pattern' is cached here */
    > ```
    > 
    > Using static variable in executor seems something I have never seen, which may persistent data across queries. I think usually per query data are stored in state structures.
    
    Yeah, such a usage maybe rare. But I hesitate to store the data
    (regex_t) in WindowAggState because:
    
    (1) it bloats WindowAggState which we want to avoid if possible
    (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
    
    > 14 - 0005 - nodeWindowAgg.c
    > ```
    > +		MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
    > +
    > +		if (regcache != NULL)
    > +			pg_regfree(regcache);	/* free previous re */
    > +
    > +		/* we need to convert to char to pg_wchar */
    > +		plen = strlen(pattern);
    > +		data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
    > +		data_len = pg_mb2wchar_with_len(pattern, data, plen);
    > +		/* compile re */
    > +		sts = pg_regcomp(&preg, /* compiled re */
    > +						 data,	/* target pattern */
    > +						 data_len,	/* length of pattern */
    > +						 cflags,	/* compile option */
    > +						 C_COLLATION_OID	/* collation */
    > +			);
    > +		pfree(data);
    > +
    > +		MemoryContextSwitchTo(oldContext);
    > ```
    > 
    > Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the comment of pg_regcomp:
    > ```
    > /*
    > * pg_regcomp - compile regular expression
    > *
    > * Note: on failure, no resources remain allocated, so pg_regfree()
    > * need not be applied to re.
    > */
    > int
    > pg_regcomp(regex_t *re,
    > const chr *string,
    > size_t len,
    > int flags,
    > Oid collation)
    > ```
    > 
    > “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory leak.
    
    I admit current patch leaves the memory unfreed even after a query
    finishes. What about adding something like:
    
    static void do_pattern_match_end(void)
    {
    	if (regcache != NULL)
    		pg_regfree(regcache);
    }
    
    and let ExecEndWindowAgg() call it?
    
    > Okay, I’d stop here and continue to review 0006 next week.
    > 
    > Best regards,
    > --
    > Chao Li (Evan)
    > HighGo Software Co., Ltd.
    > https://www.highgo.com/
    > 
    > 
    > 
    >