Re: Asynchronous MergeAppend

Matheus Alcantara <matheusssilv97@gmail.com>

From: "Matheus Alcantara" <matheusssilv97@gmail.com>
To: "Alexander Pyhalov" <a.pyhalov@postgrespro.ru>, "Matheus Alcantara" <matheusssilv97@gmail.com>
Cc: "Alena Rybakina" <a.rybakina@postgrespro.ru>, "Pgsql Hackers" <pgsql-hackers@postgresql.org>
Date: 2025-12-19T13:45:29Z
Lists: pgsql-hackers

Attachments

On Thu Dec 18, 2025 at 6:56 AM -03, Alexander Pyhalov wrote:
>> +	noccurred = WaitEventSetWait(node->ms_eventset, -1 /* no timeout */ , 
>> occurred_event,
>> +								 nevents, WAIT_EVENT_APPEND_READY);
>> 
>> Should we use the same WAIT_EVENT_APPEND_READY or create a new wait
>> event for merge append?
>
> I'm not sure that new wait event is needed - for observability I think 
> it's not critical
> to distinguish Append and MergeAppend when they waited for foreign 
> scans.  But also it's perhaps
> doesn't do any harm to record specific wait event.
>
Ok, I think that we can keep this way for now and let's see if a new
wait event is really needed.

>> I've created Appender and AppenderState types that are used by
>> Append/MergeAppend and AppendState/MergeAppendState respectively (I
>> should have think in a better name for these base type, ideas are
>> welcome). The execAppend.c was created to have the functions that can 
>> be
>> reused by Append and MergeAppend execution. I've tried to remove
>> duplicated code blocks that was almost the same and that didn't require
>> much refactoring.
>
> Overall I like new Appender node. Splitting code in this way really 
> helps to avoid code duplication.
> However, some similar code is still needed, because logic of getting new 
> tuples is different.
>
Indeed. 

> Some minor issues I've noticed.
> 1) ExecReScanAppender()  sets node->needrequest to NULL. 
> ExecReScanAppend() calls bms_free(node->as.needrequest) immediately 
> after this. The same is true for ExecReScanMergeAppend(). We should move 
> it to ExecReScanAppender().
>
Fixed

> 2) In src/backend/executor/execAppend.c:
> planstates are named  as mergeplans in ExecEndAppender(), perhaps, 
> appendplans or subplans are better names.
>
Fixed

> ExecInitAppender() could use palloc_array() to allocate appendplanstates 
> - as ExecInitMergeAppend().
>
Fixed

--
Matheus Alcantara
EDB: http://www.enterprisedb.com