
Shi Han

Sr. Principal Research Manager


I am a Sr. Principal Research Manager in the DKI (Data, Knowledge, Intelligence) area at Microsoft Research Asia. I have worked in the same research group since I joined Microsoft in April 2006. I now lead the Data Analytics Research team, whose research directions span machine learning, multi-dimensional data mining, explainable AI, causal inference, graph models, and their applications in tabular data intelligence, survey forms analytics, and software engineering. Key technologies have been or are being shipped to Office (Excel, Forms, Word), Power BI & Dynamics, Windows, and Bing Search. I would summarize my research over the past 15 years as using data-driven techniques to enable industry-leading features of Microsoft products. Before joining Microsoft Research, I received my M.E. and B.E. from Zhejiang University in 2006 and 2003, respectively.

Research Interests

  • Machine learning, multi-dimensional data mining, explainable AI
  • Applications in tabular data intelligence, survey forms analytics, software engineering

Selected Projects

EAReco (2006-2008)

EAReco was the project name for our research and tech-transfer of HMM-based handwriting recognition for East Asian languages (i.e., Simplified Chinese, Traditional Chinese, Korean, and Japanese). It was my first project after joining Microsoft Research in Beijing. We advanced the state of the art in recognition accuracy, especially for cursive writing, to an industry-leading level at that time. The project also involved research and engineering to achieve fast performance and a small model size. Our EAReco engine and models shipped with Windows 7.

StackMine (2009-2014)

As a continued collaboration with Windows after the EAReco project, we decided to further improve Windows quality and user experience at the core and fundamental levels. I started the StackMine project to help scale up the analysis for identifying Windows performance issues. Incorporating machine learning, data mining, large-scale computing, and system domain knowledge, StackMine was a technology suite and scalable system for automatically mining and recommending performance bottlenecks from large-scale (i.e., millions of) execution traces. After its tech-transfer to Windows, StackMine identified 19 high-impact performance bugs for Windows 8.

Auto Insights (2014-present)

I have been leading the research and tech-transfer of Auto Insights since Nov 2014. Auto Insights is a research framework for automatic mining and recommendation of various insights from multi-dimensional data. It also involves research and engineering to allow near real-time experiences of insight mining based on commodity database systems, or even in cloud environments. As an enabling technique towards smart analytics, Auto Insights has been helping Microsoft demonstrate industry-leading vision and technical strengths in the Business Intelligence market, via a series of releases with Power BI and reviews with Gartner.

Tabular Data Intelligence (2017-present)

I have been leading a team of researchers working on spreadsheet intelligence to enable on-click intelligent experiences in Excel in Microsoft Office 365. Our vision is to solve the grand challenges behind such on-click intelligence for spreadsheets, including table range detection, table structure analysis, table metadata understanding, and table format recommendation. Using our techniques for both spreadsheet intelligence and auto insights, we collaborated with the Excel team and shipped Ideas in Excel on March 1, 2019. Our TableSense technology and SDK have been powering intelligent features of multiple key products in Microsoft Office 365.

Impact on Microsoft Products

Talks, Lectures, and Events