Tea: A High-level Language and Runtime System for Automating Statistical Analysis

Current statistical tools place the burden of valid, reproducible statistical analysis on the user. Users need deep statistical knowledge not only to identify their research questions, hypotheses, and domain assumptions but also to select valid statistical tests for those hypotheses. As quantitative data become more widely available across disciplines, data analysis is becoming a common task for people who may lack statistical expertise. Tea, a high-level declarative language for automating statistical test selection and execution, abstracts the details of analyses from users, empowering them to perform valid analyses by expressing their goals and domain knowledge. In this talk, I will discuss the design and implementation of Tea, lessons learned through the process, and other ongoing work in this vein.
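To give a feel for what "automating statistical test selection" can mean, here is a minimal Python sketch, not Tea's actual syntax, API, or selection procedure. It assumes a hypothetical rule-based selector for a two-group comparison; the names `Variable`, `Assumptions`, and `select_test` are invented for illustration, and the rules cover only a handful of textbook cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical variable declaration (not Tea's API)."""
    name: str
    dtype: str             # "nominal", "ordinal", "interval", or "ratio"
    n_categories: int = 0  # only meaningful for nominal/ordinal variables


@dataclass(frozen=True)
class Assumptions:
    """Hypothetical domain knowledge the analyst declares up front."""
    normal: bool = False             # outcome is normally distributed per group
    equal_variance: bool = False     # groups have equal variance
    independent_groups: bool = True  # False for paired/repeated measures


def select_test(group: Variable, outcome: Variable, a: Assumptions) -> str:
    """Return the name of a test whose preconditions match the declarations.

    Illustrative rules only: handles a two-level nominal grouping variable
    and an ordinal or continuous outcome.
    """
    if group.dtype != "nominal" or group.n_categories != 2:
        raise ValueError("this sketch only handles two-group comparisons")

    continuous = outcome.dtype in ("interval", "ratio")

    # Parametric tests require a continuous, normally distributed outcome.
    if continuous and a.normal:
        if not a.independent_groups:
            return "paired t-test"
        return "Student's t-test" if a.equal_variance else "Welch's t-test"

    # Otherwise fall back to rank-based, non-parametric alternatives.
    if continuous or outcome.dtype == "ordinal":
        if a.independent_groups:
            return "Mann-Whitney U test"
        return "Wilcoxon signed-rank test"

    raise ValueError("no rule in this sketch covers a nominal outcome")


if __name__ == "__main__":
    condition = Variable("condition", "nominal", n_categories=2)
    score = Variable("score", "ratio")
    print(select_test(condition, score, Assumptions(normal=True)))
    print(select_test(condition, score, Assumptions()))
```

The point of the sketch is the division of labor: the analyst states what the variables are and what they believe about the data, and the tool maps those declarations to a test whose preconditions hold, rather than the analyst picking a test by name.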

Speaker Details

Eunice Jun is a PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on developing new tools and methods for conducting valid and reproducible statistical analyses. She hopes to make conducting valid data analyses easy (and fun) for end-users. She incorporates methods and techniques from human-computer interaction, programming languages, and data science.

Date:
Speakers: Eunice Jun
Affiliation: University of Washington

Series: Microsoft Research Talks