Dr. Zhongyuan Wang is a Researcher at Microsoft Research Asia (MSRA). He received his PhD (advised by Haixun Wang and Ji-Rong Wen), master’s degree (advised by Xiaofeng Meng), and bachelor’s degree in computer science from Renmin University of China (RUC). While at the university, he won the Wu Yuzhang Scholarship (the top-level scholarship at Renmin University), the Kwang-Hua Scholarship, and the ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide). After graduating from RUC, he joined MSRA as a Research Software Development Engineer. He has published several papers in leading international conferences such as VLDB and ICDE. He is also the translator of the book “Windows Phone 7 Programming for Android and iOS Developers”, published in 2012. His research interests include knowledge bases, web data mining, online advertising, machine learning, and natural language processing.
Currently, Zhongyuan Wang leads the Probase project. He focuses on acquiring web tables, attributes, and knowledge facts from more than 7 billion web documents on the MS Cloud platform; addressing entity disambiguation and attribute synonyms in Probase; understanding web documents by reasoning over uncertain data; and building cool applications (such as short text understanding, ads matching, and query recommendation) on top of the knowledge base.
Probase: a web knowledgebase that knows our mental world
Enterprise Dictionary: an enterprise knowledgebase that knows “What-Is” and “Who-Works-On-What”
My personal blog: 仲子说
- I lead two important projects at MSRA: Probase and Enterprise Dictionary, which were reviewed by Bill Gates, Harry Shum, Peter Lee, etc. The demo of Enterprise Dictionary was a candidate for MGX 2014.
- I have published 10+ papers in top international conferences
- I received the Best Paper Award at ICDE 2015
- I hold 4 US patents and 1 Chinese patent
- I’m the co-author/translator of 2 books: “Windows Phone 7 Programming for Android and iOS Developers”, and “Web Data Management: Concepts and Techniques”
- I won the Wu Yuzhang Scholarship (the top-level scholarship at Renmin University) and the ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide)
- Short Text Understanding / Conceptualization
The goal of this project is to provide better text understanding.
A large variety of applications need to handle short texts such as search queries, ads keywords, tweets, and image captions. Understanding short texts is a big challenge for machines. Unlike long texts and documents, which can be analyzed with “bag of words” based statistical approaches, short texts do not contain enough information or statistical signals to make such analysis meaningful. Furthermore, short texts are usually not well-formed sentences. For example, queries submitted to search engines usually do not follow grammar rules. Consequently, approaches based on sentence structure analysis do not work well either. Human beings are good at deriving meaning from noisy, ambiguous, and sparse input. We understand short texts because the knowledge in our minds enriches the input to produce meaning. Thus, for machines to understand short texts, we need to supply them with such knowledge so that the gap between insufficient input and understanding can be bridged.
We have been continuously improving our conceptualization mechanism, which is at the core of our short text understanding services. We leverage the co-occurrence network to enhance sense disambiguation. We also generate the mappings between auxiliary words and concept clusters. These can help sense disambiguation using context auxiliary words.
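As a rough illustration of the idea above, here is a minimal sketch of context-based sense disambiguation over is-a knowledge. It is not the Probase implementation: the `IS_A` and `CO_OCCUR` tables, their scores, and the function names are all hypothetical toy values chosen only to show how context terms can shift an ambiguous term toward one concept.

```python
# Hypothetical is-a knowledge: term -> {concept: prior score}. Toy values only.
IS_A = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "ipad": {"device": 0.9, "product": 0.1},
    "pie": {"dessert": 0.8, "food": 0.2},
}

# Hypothetical concept co-occurrence weights (unordered pairs). Toy values only.
CO_OCCUR = {
    frozenset(("company", "device")): 0.9,
    frozenset(("company", "product")): 0.8,
    frozenset(("fruit", "dessert")): 0.7,
    frozenset(("fruit", "food")): 0.8,
}


def relatedness(c1, c2):
    """Co-occurrence weight of two concepts, with a small floor for unseen pairs."""
    if c1 == c2:
        return 1.0
    return CO_OCCUR.get(frozenset((c1, c2)), 0.01)


def conceptualize(terms):
    """Return, for each known term, a normalized score per candidate concept."""
    result = {}
    for term in terms:
        priors = IS_A.get(term)
        if not priors:
            continue  # unknown terms contribute nothing in this sketch
        scores = {}
        for concept, prior in priors.items():
            support = 1.0
            for other in terms:
                if other == term or other not in IS_A:
                    continue
                # Expected relatedness between this concept and the other term's concepts.
                support *= sum(
                    p * relatedness(concept, oc) for oc, p in IS_A[other].items()
                )
            scores[concept] = prior * support
        total = sum(scores.values())
        result[term] = {c: s / total for c, s in scores.items()}
    return result


def top_concepts(text):
    """Map each known term in a short text to its highest-scoring concept."""
    scored = conceptualize(text.lower().split())
    return {term: max(s, key=s.get) for term, s in scored.items()}


if __name__ == "__main__":
    print(top_concepts("apple ipad"))
    print(top_concepts("apple pie"))
```

With these toy tables, the context word decides the reading of "apple": next to "ipad" the company sense scores highest, and next to "pie" the fruit sense does. A real system would draw both tables from a large knowledge base rather than hand-written dictionaries.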
- Knowledgebase, Graph
- Database, Data Mining
- Machine Learning
- Web Search and Mining
- Natural Language Processing
- Program Committee, SIGKDD 2016
- Program Committee, IJCAI 2016
- Program Committee, CIKM 2014
- Program Committee, WAIM 2013
- Program Committee, CIKM 2012
- Program Committee, WAIM 2011
Tech Transfers to Products:
–Added semantic features based on semantic similarity between queries and ads keywords
–Shipped to Bing ads system, Oct. 2012
- Query Recommendation on MSN US
–Using article titles of each channel to train a classifier based on conceptualization techniques
–Compared with the previous QAS-based approach, our model increased CTR by 36.8% and 80.0% in the US Movie and US Music channels, respectively
- Related Topics for Bing Image Search
–Using is-a data to improve related topics in Bing image search
–Constructing and weighting an entity linkage graph to improve the related topics
–Shipped to Bing Image Search in June 2013, with ~200% gains in total query share
- Microsoft Power Query for Excel
–Microsoft Power Query is an Excel add-in that enhances the self-service Business Intelligence experience in Excel by simplifying data discovery and access. Power Query enables users to easily discover, combine, and refine data for better analysis in Excel. Power Query includes a public search feature that is currently intended for use in the United States only.
–Download link: http://office.microsoft.com/en-us/excel/download-microsoft-power-query-for-excel-FX104018616.aspx
- Best Paper Award, Short Text Understanding Through Lexical-Semantic Analysis, in the 31st International Conference on Data Engineering (ICDE), 2015
- 2009 Wu Yuzhang Scholarship (Top-level Scholarship of Renmin University of China. Top 10/22000)
- 2008/2009 Kwang-Hua Scholarship (Twice)
- 2008 HP Distinguished Chinese Student Scholarship
- 2007 Excellent Graduate Student Award of Renmin University
- 2007 ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide)
- 2006 China Computer World Scholarship
- 2005~2006 The Outstanding Students Scholarship
- 2005 First Prize in the Beijing Contest District of the China Undergraduate Mathematical Contest in Modeling (CUMCM2005)
- 2005~2006 First-Class Scholarship
- 2003~2004 Fan Zhi’an Scholarship
- 2003~2004 Excellent League Member of RUC