Re: Should we update the random_page_cost default value?

From: Tomas Vondra <tomas@vondra.me>
To: Michael Banck <mbanck@gmx.net>
Cc: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-10-06T09:12:00Z
Lists: pgsql-hackers

On 10/6/25 11:02, Michael Banck wrote:
> Hi,
> 
> On Mon, Oct 06, 2025 at 02:59:16AM +0200, Tomas Vondra wrote:
>> I started looking at how we calculated the 4.0 default back in 2000.
>> Unfortunately, there's not a lot of info, as Tom pointed out in 2024 [2].
>> But he outlined how the experiment worked:
>>
>> - generate large table (much bigger than RAM)
>> - measure runtime of seq scan
>> - measure runtime of full-table index scan
>> - calculate how much more expensive a random page access is
> 
> Ok, but I also read somewhere (I think it might have been Bruce in a
> recent (last few years) discussion of random_page_cost) that on top of
> that, we assumed 90% (or was it 95%?) of the queries were cached in
> shared_buffers (probably preferably the indexes), so that the fact that
> random access is massively slower than sequential access (surely not 4x
> by 2000) is offset by that. I only quickly read your mail, but I didn't
> see any discussion of caching at first glance, or do you think it does
> not matter much?
> 

I think you're referring to this:

https://www.postgresql.org/message-id/1156772.1730397196%40sss.pgh.pa.us

As Tom points out, that's not really how we calculated the 4.0 default.
We should probably remove that claim from the docs.
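
For context, the docs entry for random_page_cost says the default "can be
thought of as modeling random access as 40 times slower than sequential,
while expecting 90% of random reads to be cached". Below is a minimal
Python sketch of that arithmetic, next to the timing-based ratio from the
experiment outlined in the quoted mail. The function names and the timing
numbers are made up for illustration, the per-page formula is only one
plausible reading of "how much more expensive a random page access is",
and treating cached reads as free (cached_cost=0.0) is an assumption.
None of this is taken from PostgreSQL itself.

```python
def ratio_from_timings(seq_seconds, index_seconds, seq_pages, index_page_fetches):
    """Experiment-style estimate: per-page cost of the full-table index
    scan divided by per-page cost of the sequential scan.
    Assumes both runs are I/O bound and ignores CPU costs."""
    if min(seq_seconds, index_seconds, seq_pages, index_page_fetches) <= 0:
        raise ValueError("all inputs must be positive")
    seq_per_page = seq_seconds / seq_pages
    random_per_page = index_seconds / index_page_fetches
    return random_per_page / seq_per_page


def blended_random_page_cost(raw_ratio, cached_fraction, cached_cost=0.0):
    """Docs-style model: a fraction of random reads hits cache (costing
    cached_cost), the rest pays the raw random/sequential ratio."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return cached_fraction * cached_cost + (1.0 - cached_fraction) * raw_ratio


# Hypothetical timings: 100 s seq scan and 4000 s index scan, each
# touching one million pages. Not measured anywhere.
raw = ratio_from_timings(100.0, 4000.0, 1_000_000, 1_000_000)
print(f"raw random/sequential ratio: {raw:.1f}")

# The docs' model: 40x raw ratio with 90% of random reads cached.
print(f"blended cost: {blended_random_page_cost(40.0, 0.90):.1f}")
```

The two functions answer different questions: the first derives a ratio
from measured runtimes on a table much bigger than RAM, while the second
starts from an assumed raw ratio and an assumed cache hit rate, which is
the reasoning Tom says was not behind the 4.0 default.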


regards

-- 
Tomas Vondra