I am a principal researcher in the Data Management, Exploration and Mining (DMX) group at the Microsoft Research Redmond Lab. My research centers on the interplay between theory and systems for data platforms and data science. I pursue fast, economical, scalable and practical solutions with theoretical guarantees. My recent work falls under the emerging theme of MLSys (Machine Learning and Systems).
One recurring concern is the trade-off between computational cost and a target level of accuracy. One finding is that useful analytical tools for reasoning about this trade-off come from approximate query processing, studied in the database community, and multi-armed bandits, studied in the ML community. But there is often a mismatch between the available theoretical results and the problem we want to tackle. Along the way, I solve interesting theoretical questions that have encouraging practical consequences. Examples can be found here.
My past research on modeling and computing techniques for mining unstructured data such as text and graphs won me the SIGKDD Data Science/Data Mining PhD Dissertation Award in 2015. A selection of my work in these areas can be found here.
I completed my PhD in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC). I had the pleasure of being advised by Professor Jiawei Han during my PhD as a member of the Database and Information System (DAIS) Lab. I received my Bachelor’s degree in CS from Tsinghua University.