Summary of Research Works
Wei-Ying Ma developed several technologies during his Ph.D. research at the University of California at Santa Barbara, including one of the first content-based image retrieval systems on the Web (called Netra), the widely used Gabor texture features for image retrieval, and one of the first practical image segmentation solutions for processing a large number and variety of natural scene images (which enables image retrieval systems to provide region-based search capabilities). He was also one of the first researchers to identify the similarity-measure problem in content-based image retrieval, and he developed a machine learning approach to learn the similarity measure for image retrieval. In recent years, he has been leading a team at Microsoft Research Asia to develop a system that analyzes large-scale multimedia data for automatic annotation.
Starting in 2003, Wei-Ying expanded his research into general Web search and applied many ideas from image analysis and segmentation to Web page analysis and information extraction. In particular, he developed the first technique to analyze Web pages using visual cues and to use that information to model the Web and extract structured data from Web pages. With these Web-analysis techniques, Wei-Ying has led his team to develop a next-generation search engine that goes beyond traditional page-level relevance ranking. By extracting and integrating information about real-world entities such as people, places, and things (e.g., products) from billions of public Web pages, his system shifts Web search to the level of entities and objects, enabling search queries, relevance ranking, and browsing and navigation of search results at that level. The resulting entity-level search engine, the first on the Web that provides automatic summaries of entities and allows users to navigate and explore their relationships, can be found at http://entitycube.research.microsoft.com (a Chinese version of the search engine, called Renlifang, is also available at http://renlifang.msra.cn). He and his team also built the Microsoft academic search engine based on entity-level search technologies, which is available at http://academic.research.microsoft.com/. It provides many innovative ways to retrieve, rank, and explore scientific papers, conferences, journals, and authors based on their importance and relationships.
Wei-Ying and his team also initiated an effort at Microsoft to develop a web-scale data mining infrastructure for search. Unlike traditional Internet services, search involves myriad offline computations that analyze data at a very large scale, and an infrastructure for “scale” experiments is often required to evaluate the effectiveness of newly invented algorithms in a semi-real environment. Such an infrastructure is also critical for supporting massive web mining, knowledge discovery, and asynchronous metadata exchange in a search engine pipeline, so that the cycle of idea formulation, experimentation, and deployment can be iterated quickly.
Wei-Ying is an inventor or co-inventor of over 80 patents in the area of web search and multimedia information retrieval.
The following are some of the systems that Wei-Ying and his team have developed at Microsoft Research and released to the public.
EntityCube is a research prototype for exploring object-level search technologies, which automatically summarizes the Web for entities with a modest web presence. Key technologies include web-scale entity extraction, name disambiguation, entity ranking, and relationship extraction and exploration.
Renlifang is the Chinese version of EntityCube (and the name EntityCube is the English translation of Renlifang) which currently has millions of daily page-views during peak times. It has received wide press coverage and publicity in China.
Using similar technologies, Wei-Ying and his team created the Microsoft academic search service to facilitate the exchange of ideas and communication among academic communities. A user can retrieve information on academic papers, scientists, conferences, and journals, with results that are more accurate, relevant, and efficient than those of document-level ranking. Features of this search service include the ability to find top scientists, conferences, and journals in a specific field, locate top research papers, and identify rising stars or hot topics in a specific field.
TravelGuide is a vertical search engine for the travel domain that uses deep web crawling and forum site structure analysis technologies developed by Wei-Ying and his team. The engine aggregates travel-related information from across the web and presents relevant knowledge to users, helping them learn more about travel destinations, such as popular places, themes, and short trips.
Wei-Ying Ma has been invited to give keynote speeches at the following academic conferences and industrial forums on web search, multimedia computing, and cloud computing.
- Empowering People with Knowledge: the Next Frontier for Web Search
The 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Hyderabad, India, 2010.
- Rethinking Multimedia Search in the new “Cloud + Clients” Era
The Workshop on Large-scale Multimedia Mining and Retrieval at ACM Multimedia Conference 2009
- Cloud Computing and the Future of Internet Services
The 10th International Mobile Data Management (MDM) Conference 2009
- Building Web-scale Data Mining Infrastructure for Search
The 10th Asia-Pacific Web Conference, APWeb 2008
- The Challenges and Opportunities of Mining Billions of Web Images for Search and Online Applications
The Multimedia Retrieval Workshop at SIGIR 2007
- The Challenges and Opportunities of Mining Billions of Web Images for Search and Advertising
The 9th International Conference on Visual Information Systems, VISUAL2007
- Building Infrastructure to Support Web-scale Data Mining for Search
DBWeb in Kyoto, Japan, 2006
- Object-level Vertical Search
Workshop on Web Information Retrieval and Integration at ICDE 2006
- From Relevance to Intelligence: Toward Next Generation Web Search
Multimedia Information Retrieval (MIR) Workshop at ACM Multimedia Conference 2005
- Adaptive Content Delivery on Mobile Internet across Multiple Form Factors
International Multimedia Modeling (MMM) Conference 2004
- Towards Next Generation Web Search
International Conference on Web Information Systems (WISE) 2004