Re: cache estimates, cache access cost

From: Greg Smith <greg@2ndquadrant.com>
To: Cédric Villemain <cedric.villemain.debian@gmail.com>
Cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2011-05-16T03:52:56Z
Lists: pgsql-hackers

Cédric Villemain wrote:
> http://git.postgresql.org/gitweb?p=users/c2main/postgres.git;a=shortlog;h=refs/heads/analyze_cache
>   

This rebases easily so that Cédric's changes move to the end; I just 
pushed a version with that change to 
https://github.com/greg2ndQuadrant/postgres/tree/analyze_cache if anyone 
wants a cleaner one to browse.  I've attached a patch too if that's more 
your thing.

I'd recommend not getting too stuck on the particular hook Cédric has 
added here to compute the cache estimate, which uses mmap and mincore to 
figure it out.  It's possible to compute similar numbers, albeit less 
accurate ones, using an approach similar to how pg_buffercache inspects 
things.  I even once wrote a background writer extension that collected 
this sort of data while it was running the LRU scan anyway.  
Discussions of this idea tend to focus on how the "what's in the cache?" 
data is collected, which as far as I'm concerned is the least important 
part.  There are multiple options, some of which work better than 
others, and there's no reason the collection method can't be swapped 
out later.  The more important question is how to store the collected 
data and then use it for optimizing queries.
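
For anyone who hasn't seen the mmap/mincore technique before, here is a 
minimal sketch of the general idea, written in Python with ctypes so it 
can be tried outside the server.  This is not Cédric's hook: the 
function name count_resident_pages is made up for illustration, and it 
assumes a 64-bit Linux-like system where off_t fits in a C long.  Note 
that some newer kernels only report real residency for files the caller 
owns or can write to.

```python
import ctypes
import mmap
import os

# dlopen(NULL): resolves libc symbols already loaded into the process.
_libc = ctypes.CDLL(None, use_errno=True)

# Assumption: off_t matches C long (true on 64-bit Linux).
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_long]
_libc.mincore.restype = ctypes.c_int
_libc.mincore.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                          ctypes.POINTER(ctypes.c_ubyte)]
_libc.munmap.restype = ctypes.c_int
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

_MAP_FAILED = ctypes.c_void_p(-1).value


def count_resident_pages(path):
    """Hypothetical helper: return (resident_pages, total_pages) for a file."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0
    page = mmap.PAGESIZE
    total = (size + page - 1) // page

    fd = os.open(path, os.O_RDONLY)
    try:
        # Mapping the file does not read it in; it only reserves addresses.
        addr = _libc.mmap(None, size, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
    finally:
        os.close(fd)
    if addr is None or addr == _MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")

    try:
        # mincore fills one byte per page; the low bit means "resident".
        vec = (ctypes.c_ubyte * total)()
        if _libc.mincore(addr, size, vec) != 0:
            raise OSError(ctypes.get_errno(), "mincore failed")
        resident = sum(b & 1 for b in vec)
    finally:
        _libc.munmap(addr, size)
    return resident, total
```

Calling count_resident_pages on a relation's data file gives a rough 
"fraction in OS cache" figure (resident / total).  It says nothing about 
shared_buffers, which is where a pg_buffercache-style inspection would 
come in instead.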

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us