Re: Batching in executor
cca5507 <2624345507@qq.com>
From: cca5507 <2624345507@qq.com>
To: Amit Langote <amitlangote09@gmail.com>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>, Tomas Vondra <tomas@vondra.me>
Date: 2025-12-22T11:45:49Z
Lists: pgsql-hackers
Hi, Some comments for v4: 0001 ==== 1) table_scan_getnextbatch() "Assert(dir == ForwardScanDirection);" -> "Assert(ScanDirectionIsForward(dir));" 2) heapgettup_pagemode_batch() "TupleDesc tupdesc = key ? RelationGetDescr(rel) : NULL;" -> "TupleDesc tupdesc = RelationGetDescr(rel);" I think the latter is enough. 3) heapgettup_pagemode_batch() ``` /* Are there more visible tuples left on this page? */ lineindex = scan->rs_cindex + dir; linesleft = (lineindex <= (uint32) scan->rs_ntuples) ? (scan->rs_ntuples - lineindex) : 0; if (linesleft > 0) break; /* continue on this page */ ``` The "scan->rs_ntuples" is already an uint32. 4) heapgettup_pagemode_batch() ``` Assert(lineindex <= (uint32) scan->rs_ntuples); ``` The "scan->rs_ntuples" is already an uint32. And I think this should be "Assert(lineindex < scan->rs_ntuples);", the related assert in heapgettup_pagemode() is also wrong. 5) heapgettup_pagemode_batch() If the scan key filters out all tuples on a page, we may return 0 before reaching the end of scan, right? 6) heap_begin_batch() ``` hb = palloc(sizeof(HeapBatch)); hb->tupdata = palloc(sizeof(HeapTupleData) * maxitems); ``` Can we just use one palloc() for cache-friendly? 0002 ==== 1) heap_materialize_batch_all() ``` slot->base.tts_flags &= ~(TTS_FLAG_EMPTY | TTS_FLAG_SHOULDFREE); slot->base.tts_tid = tuple->t_self; slot->base.tts_tableOid = tuple->t_tableOid; slot->base.tts_flags &= ~(TTS_FLAG_SHOULDFREE | TTS_FLAG_EMPTY); ``` Redundant of "slot->base.tts_flags &="? 2) TupleBatchCreate() ``` inslots = palloc(sizeof(TupleTableSlot *) * capacity); outslots = palloc(sizeof(TupleTableSlot *) * capacity); for (int i = 0; i < capacity; i++) inslots[i] = MakeSingleTupleTableSlot(scandesc, &TTSOpsHeapTuple); b = (TupleBatch *) palloc(sizeof(TupleBatch)); ``` Can we just use one palloc() for cache-friendly? 3) TupleBatchCreate() ``` b->outslots = outslots; b->activeslots = NULL; b->outslots = outslots; ``` Redundant of "b->outslots = outslots;"? 4) TupleBatchReset() ``` if (b == NULL) return; ``` This can never happen, convert to a assert or just delete it? 5) SeqNextBatch() "Assert(direction == ForwardScanDirection);" -> "Assert(ScanDirectionIsForward(direction));" -- Regards, ChangAo Chen