Scale-Independent Relational Query Processing with PIQL

  • Michael Armbrust | UC Berkeley

The rapid growth of data volumes has led many developers to abandon traditional relational databases in favor of distributed key/value stores and map/reduce programs. While these alternatives often make scaling nearly trivial for developers, they lack many of the benefits of high-level declarative languages, such as optimization and data independence. Instead, we propose extending the relational model with scale independence, a new type of data independence that ensures consistent performance for all queries in an application, regardless of data size. Our implementation, PIQL (the Performance Insightful Query Language), provides a scale-independent relational system on top of existing distributed key/value stores using several techniques. First, PIQL uses bounded worst-case cost, rather than minimum average cost, as the objective function for query optimization. Second, the system automatically selects and maintains the required indexes and materialized views. In this talk I will present PIQL’s extensions to standard SQL, the techniques used by the optimizer to ensure bounded resource requirements in the worst case, and performance results demonstrating linear scalability with consistent response time as an application’s data grows by an order of magnitude.
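To make the idea of optimizing for bounded worst-case cost more concrete, here is a minimal Python sketch. It is not PIQL's optimizer or API: the `Step` and `Plan` classes, the `worst_case_ops` cost model, and the example plans are all hypothetical and chosen only for illustration. The sketch assumes each plan step runs once per row produced by the previous step and returns at most a fixed number of rows, or has no static bound at all.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Step:
    """One step of a hypothetical query plan (illustrative, not PIQL's model)."""
    name: str
    max_rows: Optional[int]  # rows returned per invocation; None = no static bound


@dataclass(frozen=True)
class Plan:
    name: str
    steps: Sequence[Step]


def worst_case_ops(plan: Plan) -> Optional[int]:
    """Upper bound on rows fetched from storage, or None if the plan is unbounded.

    Assumed cost model: each step runs once per row produced by the previous
    step, so the bound is the sum of the cumulative fan-outs.
    """
    total = 0
    fanout = 1
    for step in plan.steps:
        if step.max_rows is None:
            return None  # e.g. a full scan whose size grows with the data
        fanout *= step.max_rows
        total += fanout
    return total


def choose_plan(plans: Sequence[Plan]) -> Plan:
    """Pick the plan with the smallest worst-case bound; reject unbounded plans."""
    bounded = [(worst_case_ops(p), p) for p in plans if worst_case_ops(p) is not None]
    if not bounded:
        raise ValueError("no plan has a bounded worst-case cost")
    return min(bounded, key=lambda pair: pair[0])[1]


# Hypothetical candidate plans for "the 10 most recent posts of a user's friends".
candidates = [
    Plan("full_scan", [Step("scan_all_posts", None)]),
    Plan("index_then_lookup", [Step("index_top_10", 10), Step("fetch_author", 1)]),
    Plan("wide_index", [Step("index_top_50", 50), Step("fetch_author", 1)]),
]
best = choose_plan(candidates)
```

In this sketch the full scan is rejected outright because its cost grows with the data, even if it might be cheap on a small database today. A conventional optimizer that minimizes expected cost could still pick it.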

Speaker Details

Michael Armbrust is a PhD candidate at UC Berkeley, advised by Michael Franklin, David Patterson and Armando Fox. His interests broadly include distributed systems, large-scale structured storage and query optimization. Specifically, as a member of the RAD and AMP labs, he has focused on building systems that allow developers to rapidly build scalable interactive applications. Before attending Berkeley he received his BS in Computer Science and Mathematics from Purdue University.

  • Jeff Running

Series: Microsoft Research Talks