Thread

  1. Re: CUDA Sorting

    Oleg Bartunov <oleg@sai.msu.su> — 2012-02-12T12:13:33Z

    I'm wondering if CUDA will win in geometry operations, for example,
    testing point <@ complex_polygon
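
    For concreteness, the kind of per-point test in question can be sketched
    as an even-odd ray-casting check. This is a generic textbook method, not
    PostgreSQL's own implementation of <@; the names Point2D and
    point_in_polygon are made up for illustration. Each point is tested
    independently of the others, which is the property that would make a
    one-thread-per-point GPU version plausible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 2-D point type, used only for this sketch. */
typedef struct { double x, y; } Point2D;

/*
 * Even-odd ray casting: shoot a horizontal ray from p towards +x and
 * count how many polygon edges it crosses. An odd count means "inside".
 * Works for concave polygons; points exactly on an edge are not treated
 * specially here.
 */
bool point_in_polygon(Point2D p, const Point2D *poly, size_t n)
{
    bool inside = false;

    if (n < 3)
        return false;               /* not a polygon */

    for (size_t i = 0, j = n - 1; i < n; j = i++) {
        /* Does edge (j -> i) straddle the horizontal line y = p.y? */
        bool straddles = (poly[i].y > p.y) != (poly[j].y > p.y);

        if (straddles) {
            /* x coordinate where the edge meets y = p.y */
            double x_at_y = poly[j].x +
                (p.y - poly[j].y) * (poly[i].x - poly[j].x) /
                (poly[i].y - poly[j].y);

            if (p.x < x_at_y)
                inside = !inside;   /* crossing to the right of p */
        }
    }
    return inside;
}
```

    The cost per point is linear in the number of polygon vertices, so a
    complex polygon tested against many points is a lot of independent
    arithmetic with very little data to copy per result.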
    
    Oleg
    On Sun, 12 Feb 2012, Gaetano Mendola wrote:
    
    > On 19/09/2011 16:36, Greg Smith wrote:
    >> On 09/19/2011 10:12 AM, Greg Stark wrote:
    >>> With the GPU I'm curious to see how well
    >>> it handles multiple processes contending for resources, it might be a
    >>> flashy feature that gets lots of attention but might not really be
    >>> very useful in practice. But it would be very interesting to see.
    >> 
    >> The main problem here is that the sort of hardware commonly used for
    >> production database servers doesn't have any serious enough GPU to
    >> support CUDA/OpenCL available. The very clear trend now is that all
    >> systems other than gaming ones ship with motherboard graphics chipsets
    >> more than powerful enough for any task but that. I just checked the 5
    >> most popular configurations of server I see my customers deploy
    >> PostgreSQL onto (a mix of Dell and HP units), and you don't get a
    >> serious GPU from any of them.
    >> 
    >> Intel's next generation Ivy Bridge chipset, expected for the spring of
    >> 2012, is going to add support for OpenCL to the built-in motherboard
    >> GPU. We may eventually see that trickle into the server hardware side of
    >> things too.
    >
    >
    > The trend is to have servers capable of running CUDA, with the GPU provided
    > by external hardware (a PCI Express interface with PCI Express switches);
    > look, for example, at the PowerEdge C410x PCIe Expansion Chassis from Dell.
    >
    > I did some experiments timing the sort done with CUDA and the sort done with 
    > pg_qsort:
    >                       CUDA       pg_qsort
    > 33 million integers:  ~ 900 ms   ~ 6000 ms
    > 1 million integers:   ~  21 ms   ~  162 ms
    > 100k integers:        ~   2 ms   ~   13 ms
    >
    > The CUDA times already include the copy operations (host->device, device->host).
    >
    > The GPU I was using was a C2050, and the CPU doing the pg_qsort was an
    > Intel(R) Xeon(R) CPU X5650  @ 2.67GHz
    >
    > Copy operations and kernel runs (the sort for instance) can run in parallel, 
    > so while you are sorting a batch of data, you can copy the next batch in 
    > parallel.
    >
    > As you can see the boost is not negligible.
    >
    > The next Nvidia hardware (the Kepler family) is PCI Express 3 ready, so
    > expect the "bottleneck" of the device->host->device copies to have less
    > impact in the near future.
    >
    > I strongly believe there is room to give modern database engines
    > a way to offload sorts to the GPU.
    >
    >> I've never seen a PostgreSQL server capable of running CUDA, and I
    >> don't expect that to change.
    >
    > That sounds like:
    >
    > "I think there is a world market for maybe five computers."
    > - IBM Chairman Thomas Watson, 1943
    >
    > Regards
    > Gaetano Mendola
    >
    >
    >
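
    The speedups implied by the timings quoted above can be worked out
    directly. The sketch below only restates the quoted figures and divides
    them; the struct and function names are invented for illustration. The
    ratios come out at roughly 6.7x, 7.7x and 6.5x, copies included.

```c
#include <stddef.h>

/* One row of the quoted timing table (all times in milliseconds). */
typedef struct {
    const char *label;
    double cuda_ms;       /* CUDA sort, including both copies */
    double pg_qsort_ms;   /* pg_qsort on the Xeon X5650 */
} SortTiming;

/* The three measurements exactly as quoted in the message above. */
static const SortTiming quoted_timings[] = {
    { "33 million integers", 900.0, 6000.0 },
    { "1 million integers",   21.0,  162.0 },
    { "100k integers",         2.0,   13.0 },
};

/* Speedup of the CUDA sort over pg_qsort for one row. */
double speedup(const SortTiming *t)
{
    return t->pg_qsort_ms / t->cuda_ms;
}
```

    Since the quoted figures are approximate ("~"), the ratios are only
    indicative, but they are fairly stable across three orders of magnitude
    of input size.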
    
     	Regards,
     		Oleg
    _____________________________________________________________
    Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
    Sternberg Astronomical Institute, Moscow University, Russia
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(495)939-16-83, +007(495)939-23-83
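
    On the point in the quoted message about copies and kernel runs
    overlapping: a back-of-the-envelope model shows why that matters. This is
    a simplified sketch under stated assumptions, not a measurement: it
    assumes double buffering, equal-sized batches, one copy in flight at a
    time, and it ignores the device->host copy of the results. The function
    names and the example numbers are hypothetical.

```c
/* Larger of two durations, in milliseconds. */
static double max_ms(double a, double b)
{
    return a > b ? a : b;
}

/* No overlap: every batch is copied, then sorted, one after another. */
double serial_ms(int batches, double copy_ms, double sort_ms)
{
    return batches * (copy_ms + sort_ms);
}

/*
 * Overlapped: while batch k is being sorted, batch k+1 is being copied.
 * The first copy and the last sort cannot be hidden; every step in
 * between costs whichever of the two stages is slower.
 */
double pipelined_ms(int batches, double copy_ms, double sort_ms)
{
    if (batches <= 0)
        return 0.0;
    return copy_ms + (batches - 1) * max_ms(copy_ms, sort_ms) + sort_ms;
}
```

    With made-up numbers of 4 batches, 10 ms per copy and 30 ms per sort, the
    serial schedule takes 160 ms and the overlapped one 130 ms; the copies
    after the first are hidden behind the sorts. When the copy is the slower
    stage, the sort time is what gets hidden instead.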