Abstract

Workload information has proved to be a crucial component for database-administration tasks, as well as for the analysis of query logs to understand user behavior and system usage. These tasks require the ability to summarize large SQL workloads. In this paper, we identify primitives that enable many important workload-summarization tasks. These primitives also appear to be useful in a variety of practical scenarios beyond workload summarization. Today’s SQL cannot express these primitives conveniently. We discuss possible extensions to SQL and the relational engine to support such summarization primitives efficiently.