Design and Evaluation of Effective, Interactive, and Interpretable Machine Learning

Machine learning is ubiquitous in domains such as criminal justice, credit and lending, and medicine. Traditionally, these models are evaluated based on their predictive performance on held-out data sets. However, to convince non-expert users that these models are trustworthy and reliable in these critical domains, we need to go beyond the traditional setup, in which models are treated as black boxes that are impossible to interact with.

In this talk, I present my research on designing and evaluating machine-learning-based systems that are interpretable for humans and facilitate human interaction. I start by introducing an interactive system that uses machine learning techniques such as topic models and active learning to help non-expert users label document collections and make sense of them. I demonstrate that effective user interaction with these systems leads to a better and faster understanding of the documents. Then, I discuss the necessity of empirically evaluating interpretability with humans in the loop. I introduce a framework for isolating and measuring the effect of different model properties on users' performance and behavior. Finally, I walk through a set of large-scale human-subject studies that we ran to examine the effect of model interpretability on users' ability to complete a specific task.

Speaker Details

Forough Poursabzi-Sangdeh is a PhD candidate in the Computer Science department at the University of Colorado Boulder, advised by Jordan Boyd-Graber. She earned her bachelor's degree in computer engineering from the University of Tehran in 2012 and her master's degree in computer science from the University of Colorado Boulder in 2015. Her research interests lie at the intersection of machine learning, natural language processing, and social sciences.

Date:
Speakers:
Forough Poursabzi-Sangdeh
Affiliation:
University of Colorado Boulder