Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations
- Jia Yu | Arizona State University
Database admins construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. In this talk, I will propose HERMIT, a succinct secondary indexing mechanism for modern RDBMSs. HERMIT judiciously leverages the rich soft functional dependencies hidden among columns to prune out redundant structures for indexed key access. Instead of building a complete index that stores every single entry in the key columns, HERMIT navigates any incoming key access queries to an existing index built on the correlated columns. This is achieved through the Tiered Regression Search Tree (TRS-Tree), a succinct, Machine Learning (ML)-enhanced data structure that performs fast curve fitting to adaptively and dynamically capture both column correlations and outliers. Our extensive experimental study in two different RDBMSs have confirmed that HERMIT can significantly reduce space consumption with limited performance overhead in terms of query response time and index maintenance time, especially when supporting complex range queries. This is joint work with researchers at IBM Almaden, Yingjun Wu, Yuanyuan Tian, Richard Sidle, and Ronald Barber.
[Slides]
-
-
Phil Bernstein
Distinguished Scientist
-
-
Watch Next
-
-
-
-
-
-
-
-
-
GenAI for Supply Chain Management: Present and Future
- Georg Glantschnig,
- Beibin Li,
- Konstantina Mellou
-
Using Optimization and LLMs to Enhance Cloud Supply Chain Operations
- Beibin Li,
- Konstantina Mellou,
- Ishai Menache