Thread

  1. Re: WIP: SP-GiST, Space-Partitioned GiST

    Oleg Bartunov <oleg@sai.msu.su> — 2011-09-24T21:29:09Z

    On Thu, 22 Sep 2011, Heikki Linnakangas wrote:
    
    > On 06.09.2011 20:34, Oleg Bartunov wrote:
    >> Here is the latest spgist patch, which has all planned features as well as
    >> all overhead, introduced by concurrency and recovery, so performance
    >> measurement should be realistic now.
    >
    > I'm ignoring the text suffix-tree part of this for now, because of the issue 
    > with non-C locales that Alexander pointer out.
    >
    > Regarding the quadtree, have you compared the performance of that with 
    > Alexander's improved split algorithm? I ran some tests using the test harness 
    > I still had lying around from the fast GiST index build tests:
    >
    >        testname         |      time       | accesses | indexsize
    > -------------------------+-----------------+----------+-----------
    > points unordered auto   | 00:03:58.188866 |   378779 | 522 MB
    > points ordered auto     | 00:07:14.362355 |   177534 | 670 MB
    > points unordered auto   | 00:02:59.130176 |    46561 | 532 MB
    > points ordered auto     | 00:04:00.50756  |    45066 | 662 MB
    > points unordered spgist | 00:03:05.569259 |    78871 | 394 MB
    > points ordered spgist   | 00:01:46.06855  |   422104 | 417 MB
    > (8 rows)
    >
    > These tests were with a table with 7500000 random points. In the 
    > ordered-tests, the table is sorted by x,y coordinates. 'time' is the time 
    > used to build the index on it, and 'accesses' is the total number of index 
    > blocks hit by a series of 10000 bounding box queries, measured from 
    > pg_statio_user_indexes.idx_blks_hit + idx_blks_read.
    >
    > The first two tests in the list are with a GiST index on unpatched 
    > PostgreSQL. The next six tests are with Alexander's double-sorting split 
    > patch. The last two tests are with an SP-GiST index.
    >
    > It looks like the query performance with GiST using the double-sorting split 
    > is better than SP-GiST, although the SP-GiST index is somewhat smaller. The 
    > ordered case seems pathologically bad, is that some sort of a worst-case 
    > scenario for quadtrees?
    
    random points are not good use-case, on real data sp-gist is much faster
    than GiST. I attached the latest version of patch, which descrease wal-traffic
    and introduce kd-tree example.
    
    real-life example from geonames database can be downloaded from
    http://mira.sai.msu.su/~megera/geo-all.dump.gz
    
    Below is a log of test run on my notebook:
    
    zcat /home/megera/app/pgsql/knn/geo-all.dump.gz | psql test
    create index spg_kd_idx on geo using spgist(point kd_point_ops);
    CREATE INDEX
    Time: 97180.047 ms
    test=# select pg_total_relation_size('spg_quad_idx');
      pg_total_relation_size 
    ------------------------
                   363126784
    
    test=# create index spg_quad_idx on geo using spgist(point);
    LOG:  checkpoints are occurring too frequently (22 seconds apart)
    HINT:  Consider increasing the configuration parameter "checkpoint_segments".
    LOG:  checkpoints are occurring too frequently (19 seconds apart)
    HINT:  Consider increasing the configuration parameter "checkpoint_segments".
    LOG:  checkpoints are occurring too frequently (20 seconds apart)
    HINT:  Consider increasing the configuration parameter "checkpoint_segments".
    CREATE INDEX
    Time: 68565.279 ms
    test=# select pg_total_relation_size('spg_quad_idx');
      pg_total_relation_size 
    ------------------------
                   363126784
    
    test=# create index gist_idx on geo using gist(point);
    CREATE INDEX
    Time: 354446.965 ms
    test=# select pg_total_relation_size('gist_idx');
      pg_total_relation_size 
    ------------------------
                   542793728
    
    
     	Regards,
     		Oleg
    _____________________________________________________________
    Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
    Sternberg Astronomical Institute, Moscow University, Russia
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(495)939-16-83, +007(495)939-23-83