Microsoft Research Blog


Microsoft Web N-gram Services: Expanded Availability, New API

July 19, 2010 | Posted by Microsoft Research Blog

During the Association for Computing Machinery’s 33rd annual SIGIR Conference, held July 19-23, 2010, in Geneva, Microsoft Research is announcing enhancements to the Microsoft Web N-gram Services, which are available free via a cloud-based platform. Microsoft Research created the services to help drive discovery and innovation by enabling scientists to conduct research on real-world web data. The services support research areas with the potential to change lives, including natural language processing and new web search capabilities that help people take advantage of the vast amounts of information available on the Internet.

Introduced late last year in partnership with Bing, the Microsoft Web N-gram Services public beta is now being extended beyond professors at accredited universities to include all researchers worldwide, provided they use the service for non-commercial purposes. The service now also includes a predictive API in support of query-language models. By opening the service to more researchers and making these enhancements, Microsoft Web N-gram Services will expand not only its audience but also its access to high-quality feedback.
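
For readers new to the idea, here is a minimal sketch of what an n-gram language model does when it "predicts" the next word of a query. This is not the Microsoft Web N-gram Services API, whose interface is not described in this post. It is a toy bigram model built in memory from four made-up queries, and every name in it (`train_bigrams`, `conditional_probability`, `predict_next`, `QUERIES`) is hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Made-up sample queries; a real service would be backed by web-scale data.
QUERIES = [
    "new york weather",
    "new york hotels",
    "new york weather today",
    "new car prices",
]


def train_bigrams(sentences):
    """Count how often each word follows another in a small corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follow_counts[prev][nxt] += 1
    return follow_counts


def conditional_probability(follow_counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    counts = follow_counts.get(prev, Counter())
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0


def predict_next(follow_counts, prev, k=3):
    """Return up to k (word, probability) pairs, most probable first."""
    counts = follow_counts.get(prev, Counter())
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(k)]


model = train_bigrams(QUERIES)

if __name__ == "__main__":
    print(conditional_probability(model, "new", "york"))
    print(predict_next(model, "york", k=2))
```

In this toy corpus, "york" follows "new" in three of the four queries that contain "new", so the estimated probability is 0.75. A production model would use longer n-grams and smoothing for unseen word sequences, both of which this sketch omits.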

In the video below, Kuansan Wang, principal researcher at Microsoft Research Redmond, offers a more detailed explanation of Microsoft Web N-gram Services. Wang works with a team focused on developing technologies that provide a better understanding of human languages.

Professional gatherings such as the Web N-gram workshop during SIGIR 2010 serve as another important channel for using real-world expertise to enhance ongoing development of Microsoft Web N-gram Services. Research papers, selected by an international program committee, will be presented during the workshop and will be followed by discussions about the use of web-based data services for research. Workshops and other gatherings have been critical to the development of Microsoft Web N-gram Services from the beginning. After the expansion of beta availability announced during the International World Wide Web Conference in April 2010, for example, many researchers took advantage of the opportunity to work with the services. One such researcher, Li Ding of Rensselaer Polytechnic Institute, has his work on multiword tag clouds featured in this demo.

In addition to presentations, the workshop will include a panel discussion on issues related to query representation, including a rigorous definition of the task, modeling for the task, challenges and opportunities, implications for industrial research, and future research directions.

If you are attending SIGIR 2010, I cordially invite you to attend the workshop at 9 a.m. on July 23 and to take advantage of this opportunity to share your perspectives and connect with other researchers in the field. To stay updated and to learn about opportunities to participate in ongoing development, please visit the Microsoft Web N-gram Services home page.

Evelyne Viegas, senior research program manager, Microsoft External Research, a division of Microsoft Research