I am a researcher in the Data Management, Exploration and Mining (DMX) group at Microsoft Research Redmond Lab. My research centers on the interplay between theory and systems for data platforms. I pursue fast, cheap, scalable and practical solutions with theoretical guarantees. My recent work falls under the emerging theme of MLSys (Machine Learning and Systems), such as:
One recurring concern is the trade-off between computational cost and a target accuracy. One finding is that useful analytical tools for reasoning about this trade-off come from approximate query processing, studied in the database community, and multi-armed bandits, studied in the ML community. But there is often a mismatch between the available theoretical results and the problem we want to tackle. Along the way, I solve interesting theoretical questions that have encouraging practical consequences. For example, the following data mining tasks can be solved at sublinear cost (i.e., the cost grows more slowly than the data size, or is independent of it) with approximation guarantees:
As another example, while studying AutoML, we discovered an algorithm-independent way to compute a confidence interval for a model’s accuracy using only a subsample of the training data. It differs from the generalization bounds typically studied in learning theory, and it is useful in practice for achieving orders-of-magnitude speedups in model selection.
My past research on modeling and computing techniques for mining unstructured data, such as text and graphs, won the SIGKDD Data Science/Data Mining PhD Dissertation Award in 2015. A selection of my work in these areas can be found here.
I serve as a PC member for data mining, database, machine learning and NLP conferences.
I completed my PhD in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC). I had the pleasure of being advised by Professor Jiawei Han during my PhD, as a member of the Database and Information System (DAIS) Lab. I received my Bachelor’s degree in CS from Tsinghua University.