Re: Batching in executor

cca5507 <2624345507@qq.com>

From: cca5507 <2624345507@qq.com>
To: Amit Langote <amitlangote09@gmail.com>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>, Tomas Vondra <tomas@vondra.me>
Date: 2025-12-22T11:45:49Z
Lists: pgsql-hackers
Hi,

Some comments for v4:

0001
====

1) table_scan_getnextbatch()
"Assert(dir == ForwardScanDirection);" -> "Assert(ScanDirectionIsForward(dir));"

2) heapgettup_pagemode_batch()
"TupleDesc	tupdesc = key ? RelationGetDescr(rel) : NULL;" -> "TupleDesc	tupdesc = RelationGetDescr(rel);"
I think the latter is enough.

3) heapgettup_pagemode_batch()
```
			/* Are there more visible tuples left on this page? */
			lineindex = scan->rs_cindex + dir;
			linesleft = (lineindex <= (uint32) scan->rs_ntuples) ?
				(scan->rs_ntuples - lineindex) : 0;
			if (linesleft > 0)
				break;	/* continue on this page */
```
The "scan->rs_ntuples" is already an uint32.

4) heapgettup_pagemode_batch()
```
		Assert(lineindex <= (uint32) scan->rs_ntuples);
```
The "scan->rs_ntuples" is already an uint32. And I think this should be "Assert(lineindex < scan->rs_ntuples);", the related
assert in heapgettup_pagemode() is also wrong.

5) heapgettup_pagemode_batch()
If the scan key filters out all tuples on a page, we may return 0 before reaching the end of scan, right?

6) heap_begin_batch()
```
	hb = palloc(sizeof(HeapBatch));
	hb->tupdata = palloc(sizeof(HeapTupleData) * maxitems);
```
Can we just use one palloc() for cache-friendly?

0002
====

1) heap_materialize_batch_all()
```
		slot->base.tts_flags &= ~(TTS_FLAG_EMPTY | TTS_FLAG_SHOULDFREE);
		slot->base.tts_tid = tuple->t_self;
		slot->base.tts_tableOid = tuple->t_tableOid;
		slot->base.tts_flags &= ~(TTS_FLAG_SHOULDFREE | TTS_FLAG_EMPTY);
```
Redundant of "slot->base.tts_flags &="?

2) TupleBatchCreate()
```
	inslots = palloc(sizeof(TupleTableSlot *) * capacity);
	outslots = palloc(sizeof(TupleTableSlot *) * capacity);
	for (int i = 0; i < capacity; i++)
		inslots[i] = MakeSingleTupleTableSlot(scandesc, &TTSOpsHeapTuple);

	b = (TupleBatch *) palloc(sizeof(TupleBatch));
```
Can we just use one palloc() for cache-friendly?

3) TupleBatchCreate()
```
	b->outslots = outslots;
	b->activeslots = NULL;
	b->outslots = outslots;
```
Redundant of "b->outslots = outslots;"?

4) TupleBatchReset()
```
	if (b == NULL)
		return;
```
This can never happen, convert to a assert or just delete it?

5) SeqNextBatch()
"Assert(direction == ForwardScanDirection);" -> "Assert(ScanDirectionIsForward(direction));"

--
Regards,
ChangAo Chen