Thread
-
Re: Asynchronous MergeAppend
Alexander Pyhalov <a.pyhalov@postgrespro.ru> — 2025-11-20T14:22:32Z
Matheus Alcantara писал(а) 2025-11-20 00:51: > On Tue Nov 18, 2025 at 4:14 AM -03, Alexander Pyhalov wrote: >> Updated the first patch. >> > Thanks for the new version. Some new comments. > > v7-0002-MergeAppend-should-support-Async-Foreign-Scan-subpla.patch: > > 1. Should be "nasyncplans" instead of "nplans"? ExecInitAppend use > "nasyncplans" to allocate the as_asyncresults array. > > + mergestate->ms_asyncresults = (TupleTableSlot **) > + palloc0(nplans * sizeof(TupleTableSlot *)); > + > No. There's a difference between how Append and MergeAppend handle async results. When Append looks for the next result, it can return any of them. So, async results are not ordered in Append. We have maximum nasyncplans of async results and return the first available result when we asked for one . For example, in ExecAppendAsyncRequest(): 1004 /* 1005 * If there are any asynchronously-generated results that have not yet 1006 * been returned, we have nothing to do; just return one of them. 1007 */ 1008 if (node->as_nasyncresults > 0) 1009 { 1010 --node->as_nasyncresults; 1011 *result = node->as_asyncresults[node->as_nasyncresults]; 1012 return true; 1013 } ExecAppendAsyncGetNext() looks (via ExecAppendAsyncRequest()) on any result. However, when we are asked for result in MergeAppend, we should return result of the specific subplan. To achieve this we should know, which subplan given results correspond to. So, we enumerate async results in the same way as requests (or ms_valid_asyncplans). Look at ExecAppendAsyncGetNext()/ExecAppendAsyncRequest(). > 2. I think that this comment should be updated. IIUC ms_valid_subplans > may not be all subplans because classify_matching_subplans() may move > async ones to ms_valid_asyncplans. Is that right? > > /* > * If we've yet to determine the valid subplans then do so now. If > * run-time pruning is disabled then the valid subplans will always be > * set to all subplans. > */ > Yes, you are correct, and similar comment in nodeAppend.c lacks the last sentence. Removed it. > 3. This code comment should also mention the > Assert(!bms_is_member(...));? > > + /* The result should be a TupleTableSlot or NULL. */ > + Assert(slot == NULL || IsA(slot, TupleTableSlot)); > + Assert(!bms_is_member(areq->request_index, > node->ms_has_asyncresults)); > > 4. It's worth include bms_num_members(node->ms_needrequest) <= 0 check > on ExecMergeAppendAsyncRequest() as an early return? IIUC it would > avoid > the bms_is_member(), bms_next_member() and bms_is_empty(needrequest) > calls. We can't exclude the first bms_is_member(), as node->ms_needrequest can be empty (we've already got result), so do not need to do request to get it, just return previously fetched result. Not sure about check above the following lines: 650 i = -1; 651 while ((i = bms_next_member(node->ms_needrequest, i)) >= 0) 652 { 653 if (!bms_is_member(i, node->ms_has_asyncresults)) 654 needrequest = bms_add_member(needrequest, i); 655 } 656 I think, it shouldn't be much cheaper as bms_next_member() will execute a couple instructions to find out that the number of words in bitmapset is zero, but will do nothing expensive. > > ExecMergeAppendAsyncRequest(MergeAppendState *node, int mplan) > Bitmapset *needrequest; > int i; > > + if (bms_num_members(node->ms_needrequest) <= 0) > + return false; > + > No, as I've mentioned, we can't exclude bms_is_member(mplan, node->ms_has_asyncresults) check. We could have received result while waiting for data for another subplan. Let's assume, we have 2 async subplans (0 and 1). For example, we've decided to get data from subplan 1. We 've already send requests to both async subplans (in ExecMergeAppendAsyncBegin() or later). Now we do ExecMergeAppendAsyncRequest(), but there's no subplans, which need request. So we enter the waiting loop. Let's assume we got event for another subplan (0). Via ExecAsyncNotify(), ExecAsyncForeignScanNotify() we go to postgresForeignAsyncNotify(), fetch data and mark request as complete. Via ExecAsyncMergeAppendResponse() we save results for subplan 0 in node->ms_asyncresults[0]. When we finally got result for subplan 1, we do the same, but now exit the loop. When MergeAppend finally decides that it needs results from subplan 0, we already have them, but ms_needrequest is empty, so ExecMergeAppendAsyncRequest() just returns this pre-fetched tuple. -- Best regards, Alexander Pyhalov, Postgres Professional