Data platforms and analytics

Helping people and organizations unlock the power of data.

Our research covers a broad range of topics in the management and analysis of data. These include: infrastructure for large-scale cloud data systems; reducing the total cost of ownership of systems, including through auto-tuning of data platforms; query optimization and processing; approximate querying of large and complex data sets; applying statistical and machine learning techniques to improve database system components; stream processing; adding database capabilities to actor frameworks; self-service data cleaning and transformation at scale; search over structured data; metadata management; and information extraction.


Focus areas


Data platforms and architectures

Building infrastructure for next-generation cloud systems, from innovations in hardware, databases, and distributed data management to the scalable data architectures they enable. Techniques such as optimized and approximate query processing over data that may be streaming, unstructured, or encrypted help keep data and insights available securely and on demand.

Data wrangling and enrichment

Infusing data platforms with state-of-the-art machine learning capabilities, including program synthesis for extracting, cleaning, and transforming data; automatic mining of data insights such as correlations, trends, and anomalies; and the ability to train and deploy advanced machine learning models and neural network architectures at scale.

Social data and impact

Harnessing data from the social web to make a positive impact on people and society. This includes developing urban computing systems for monitoring air pollution and traffic congestion, healthcare systems for maintaining population health and wellness, and forensic investigation tools for detecting and disrupting cybercrime.