Thread

  1. Re: [HACKERS] Fwd: Joins and links

    Hannu Krosing <hannu@trust.ee> — 1999-07-09T13:55:02Z

    Leon wrote:
    > 
    > Hannu Krosing wrote:
    > 
    > > The difference will quickly degrade as more rows are fetched in one
    > > query and cache misses and disk head movement start rattling your
    > > disks. The analogy being a man who needs 10 different items from a
    > > supermarket and takes 10 full round trips from home to buy them.
    > 
    > Frankly, I didn't even consider fetching database from disk. This
    > slows queries immensely and I wonder if there exist someone who
    > doesn't keep their entire DB in RAM.
    
    Well, I personally dont even know, how I could keep my entire PostgreSQL 
    DB in RAM :)
    
    It would be interesting to know what percentage of people do use 
    PostgreSQL for databases that are small enough to fit in RAM - 
    surely not the ones who need splitting of tables >2GB.
    
    And I think that setting up PostgreSQL for maximum RAM usage would 
    make a nice topic for "Optimizing PostgreSQL". 
    
    When my backends are mostly idle they usually use about 3-4MB of memory, 
    (hardly enough for any database :).
    
    It is quite possible that some bigger tables end up in a disk-cache, 
    but you can expect to find all your data in that cache only if you do 
    many queries on the same tables in a row, and the machine is otherways 
    idle.
    
    > I think the gain will raise with the number of tables, because
    > the more tables - the more index lookups are saved.
    
    My point is that sometimes even sequential scan is faster than index
    lookup,
    and not only due to overhead of using the index, but due to better disk 
    performance of sequential reads vs. random reads
    
    For in-memory databases this of course does not count.
    
    Still I'm quite sure that the main effort in PostgreSQL development has
    so 
    far gone to optimising queries where most of the data is fetched from
    the 
    disk.
    
    > > Fixing the optimizer would get a performance gain on a far wider
    > > range of tasks, and is still needed for links.
    > 
    > But general fixing of optimizer is total rewritement of it, whereas
    > link fix is almost a fast hack.
    
    I'm not too sure about it. It certainly can be done without a _total_ 
    rewrite, but getting all the new node types and access methods into the 
    parser/planner/executor may not be trivial. 
    
    One idea would be a cross-table OID index for anything in memory.
    Then, assuming that everything is in-memory, using oids as links would
    be 
    only trivially, if at all, slower (2-10 memory accesses and comparisons) 
    than "straight" link lookup, that could also be chasing linear chains of 
    forward-id-links on frequently updated DBs. On infrequently updated DBs 
    you could just use triggers and/or cron jobs to keep your reports
    updated,
    I quess that this is what most commercial OLAP systems do.
    
    Actually I lived my first halfyear of using PostgreSQL under a delusion 
    that lookup by OID would be somewhat special (fast). 
    Probably due to my misunderstanding of the (ever diminishing) O in
    ORDBMS :) 
    There have been some attempts to get the object-orientedness better 
    supported by PGSQL, (possibly even some contrib funtions), but nobody
    seems 
    to have needed it bad enough to
    1) implement it
     and 
    2) shout long enough to get it included in standart distibution. 
    Most (all?) of the developers seem to be die-hard RDBMS guys and thanks 
    to that we have now a solid and reasonably fast Relational DBMS with 
    some OO rudiments
    
    So I quess that unless you do (at least part of) links, no-one else will
    ;(
    
    > Unfortunately I already have a project to work on. There is too
    > little of me for two projects.
    
    Darn! I almost hoped we would get one more PostgreSQL hacker as I'm sure 
    that after familiarising oneself with the code enougth to implement
    links 
    one would be quite capable of helping with most of PostgreSQL
    development
    <grin>
    
    ----------------------
    Hannu