Re: Returning nbtree posting list TIDs in DESC order during backwards scans

Mircea Cadariu <cadariu.mircea@gmail.com>

From: Mircea Cadariu <cadariu.mircea@gmail.com>
To: Peter Geoghegan <pg@bowt.ie>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-07-20T04:12:43Z
Lists: pgsql-hackers
Hi,

On 18/07/2025 05:37, Peter Geoghegan wrote:

> I think that this patch isn't too far off being committable.

Agreed. I just tried v3, and got 23 buffer hits, same as in the original 
demonstrative example.


> +                    /* Remember all prior TIDs (must be at least one) */
> +                    for (int i = nitems - 2; i >= 0; i--)

This loop has to start from the end, in order to return TIDs in DESC order?


> I'm not really worried about _bt_killitems; more so about the routines
> called by _bt_readpage, which must make sure that the bit is unset
> every time a TID is saved in so->currPos.items[].

I did a search through the code for "so->" and had a look at the results 
for functions I'd expect to see changes in, at the minimum:

   *  btbeginscan
  * btrescan
  * btendscan
  * btrestrpos
  * _bt_steppage
  * _bt_readfirstpage

I could find all of the above being touched in v3.


> Modern CPUs are likely to skip over
> non-itemDead entries very quickly.

Okay, yeah. A sequential iteration through an array will be fast, and we 
expect the branch predictor to do its job properly with the "if 
(!kitem->itemDead)".


> I'll need
> to think about issues around adding the new kitem->itemDead bitfield.

It's probably not a concern here, but got reminded of this ABI break: 
https://www.postgresql.org/message-id/flat/CABOikdNmVBC1LL6pY26dyxAS2f%2BgLZvTsNt%3D2XbcyG7WxXVBBQ%40mail.gmail.com


Kind regards,

Mircea Cadariu