Yeye He

Senior Researcher


I am a senior researcher in the Data Management, Exploration and Mining (DMX) group at Microsoft Research. I received my PhD from the University of Wisconsin-Madison, under the supervision of Prof. Jeffrey Naughton.

Recently I have been working on Self-service Data Preparation, where we develop intelligent technologies to automate a variety of labor-intensive data-preparation tasks, such as transforming data by example (TDE), automatically joining tables (Auto-Join, Sema-Join), automatically detecting data errors in tables (Auto-Detect, Uni-Detect), automatically recognizing data types (Auto-Type), splitting data into tables without examples (Auto-Split), producing mapping relationships (Auto-Map), and automatically matching records across tables (Auto-EM). Some of these technologies have shipped in Microsoft products such as Power Query (available in Excel under the “Data” tab) and Azure Machine Learning Data Prep.

Previously I worked on Synonym-Mining (e.g., Entity-Synonym, Attribute-Synonym, and Acronym) using search engine query logs; this work powers applications such as Bing Snapp and Bing Knowledge Widget.