Fast Database and Data Streaming Operations using Graphics Processors

  • Naga Govindaraju | University of North Carolina, Chapel Hill

We present novel techniques that use the high computational power of graphics processing units (GPUs) to significantly accelerate many traditional general-purpose algorithms that normally run on CPUs. Because graphics processors are designed primarily for fast display of geometric primitives, we express many essential database and data mining algorithms in terms of basic graphics operations. Our algorithms use efficient data representations and exploit the inherent parallelism of the single instruction, multiple data (SIMD) units and the vector processing capabilities of GPUs to efficiently evaluate Boolean combinations of predicates, aggregates, and join queries.
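As a rough illustration of evaluating a Boolean combination of predicates followed by an aggregate, the sketch below applies each predicate to a whole column at once and combines the resulting masks element by element. It runs on the CPU in plain Python and only mirrors the data-parallel pattern; it is not the graphics-based implementation described in the talk. The column data and the helper names (eval_predicate, mask_and, mask_or) are hypothetical.

```python
import operator

def eval_predicate(column, op, constant):
    """Apply one comparison to every element of a column, yielding a Boolean mask."""
    return [op(value, constant) for value in column]

def mask_and(a, b):
    """Element-wise AND of two masks of equal length."""
    return [x and y for x, y in zip(a, b)]

def mask_or(a, b):
    """Element-wise OR of two masks of equal length."""
    return [x or y for x, y in zip(a, b)]

# Hypothetical columns of a five-row table.
ages = [23, 35, 41, 19, 52]
salaries = [40, 85, 60, 30, 120]

# Selection: (age > 30) AND (salary >= 80)
selected = mask_and(
    eval_predicate(ages, operator.gt, 30),
    eval_predicate(salaries, operator.ge, 80),
)

# Aggregates over the selected rows.
match_count = sum(selected)
salary_sum = sum(s for s, keep in zip(salaries, selected) if keep)
```

Each element is processed independently of the others, which is the property that lets this kind of query map onto SIMD hardware.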

Graphics processors are optimized for processing data streams. We present deterministic algorithms to efficiently estimate quantiles and frequencies in large data streams. We use the high computational power and memory bandwidth of a GPU to perform sorting, which serves as the main computational component in constructing epsilon-approximate quantile and frequency summaries. We have applied our algorithm to data streams consisting of more than 100 million elements on a 3.4 GHz PC with an NVIDIA 6800 Ultra GPU and achieved a 2-4 times performance improvement over optimized CPU-based algorithms.
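To make the role of sorting concrete, here is a minimal sketch of an epsilon-approximate quantile summary for a single window of data. It assumes the window fits in memory and uses Python's built-in sort as a stand-in for a GPU sort; the function names are hypothetical, and merging summaries across windows of a stream is omitted. The idea is to sort the window, keep roughly one element per epsilon * n ranks, and answer a quantile query with the retained element whose rank is closest to the target.

```python
import math

def build_summary(window, eps):
    """Sort a non-empty window and keep (value, rank) samples spaced at most eps*n ranks apart."""
    ordered = sorted(window)              # stand-in for the GPU sort
    n = len(ordered)
    step = max(1, int(eps * n))           # sample spacing in ranks
    positions = list(range(0, n, step))
    if positions[-1] != n - 1:
        positions.append(n - 1)           # always retain the maximum
    summary = [(ordered[i], i + 1) for i in positions]  # ranks are 1-based
    return summary, n

def query_quantile(summary, n, phi):
    """Return the retained value whose rank is closest to ceil(phi * n)."""
    target = max(1, math.ceil(phi * n))
    value, _rank = min(summary, key=lambda vr: abs(vr[1] - target))
    return value

# Usage: an approximate median of 1000 distinct values with eps = 0.01.
data = [(i * 37) % 1000 + 1 for i in range(1000)]   # a permutation of 1..1000
summary, n = build_summary(data, eps=0.01)
approx_median = query_quantile(summary, n, 0.5)
```

Because adjacent samples are at most eps * n ranks apart, the rank of the returned value differs from the target rank by at most eps * n, while the summary stores only about 1/eps elements instead of the whole window.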

Our recent research focuses on using GPUs to sort very large databases, composed of hundreds of gigabytes of data, on low-end commodity PCs. Experimental studies on the SortBenchmark indicate that external sorting is highly memory-intensive. Because GPUs have their own dedicated memory interface, we present an efficient hybrid sorting algorithm that performs the computation on the GPU and the CPU in parallel. Experimental results on a low-end PC with an NVIDIA 7800 GTX graphics co-processor indicate higher performance than optimized CPU-based algorithms on a high-end PC with 3.6 GHz Dual Xeon processors.
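The general shape of a sort that splits work between two processors can be sketched as follows. This is only an illustration under stated assumptions, not the hybrid algorithm from the talk: the data is cut into runs, the runs are sorted concurrently by a small thread pool standing in for the GPU and CPU, and the sorted runs are then merged in one pass, as in an external merge sort. The names hybrid_sort and sort_run and the run size are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def sort_run(run):
    """Sort one run; in a real system this step would be dispatched to a device."""
    return sorted(run)

def hybrid_sort(data, run_size=4, workers=2):
    """Sort runs concurrently, then k-way merge the sorted runs."""
    runs = [data[i:i + run_size] for i in range(0, len(data), run_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sort_run, runs))
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_runs))

values = [9, 3, 7, 1, 8, 2, 6, 4, 5, 0]
result = hybrid_sort(values)
```

In an external sort the runs would be read from and written back to disk, so overlapping the sorting of one run with the I/O and merging of others is where a second processor helps.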

Speaker Details

Naga Govindaraju is currently a Research Assistant Professor in the Department of Computer Science at the University of North Carolina, Chapel Hill. He received a B.Tech. in Computer Science from the Indian Institute of Technology, Bombay, in 2001, and Master's and Ph.D. degrees in Computer Science from the University of North Carolina at Chapel Hill in 2003 and 2004, respectively. Dr. Govindaraju’s research focuses on the effective use of commodity graphics processors to solve a range of computational problems in real time. These include collision queries, database and data mining operations, shadow rendering and transparency algorithms on complex data sets, and new sorting algorithms and linear system solvers on GPUs. In 2005, he received the IEEE VR PRESENCE best paper award in virtual reality. He has published regularly in major graphics and database conferences such as ACM SIGGRAPH and ACM SIGMOD. Dr. Govindaraju has presented tutorials and courses on his research at ACM SIGGRAPH and Eurographics. He serves as a regular referee for many prestigious computer science conferences and journals.