I am a principal researcher at Microsoft Research in Redmond. My research centers on the interplay between theory and systems for data platforms and data science. I pursue fast, economical, scalable, and practical solutions with theoretical guarantees. My recent work falls under the emerging theme of MLSys (Machine Learning and Systems), such as:
One recurring trade-off is between computational cost and the accuracy of the result. Useful analytic tools for reasoning about this trade-off include approximate query processing, studied in the database community, and multi-armed bandits, studied in the ML community. To my surprise, the available theoretical results rarely match the data science primitives performed in the real world. I do both theoretical and empirical work to bridge this gap. Examples can be found here.
My past research on modeling and computing techniques for mining unstructured data, such as text and graphs, won the SIGKDD Data Science/Data Mining PhD Dissertation Award in 2015. A selection of my work in these areas can be found here.
I completed my PhD in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC). I had the pleasure of being advised by Professor Jiawei Han during my PhD, as a member of the Database and Information System (DAIS) Lab. I received my Bachelor’s degree in CS from Tsinghua University.