Web Language Model API Overview

Welcome to the Microsoft Web Language Model API, a REST-based cloud service that provides tools for natural language processing. With this API, your application can query language models trained on web-scale corpora collected by Bing in the EN-US market.

These smoothed backoff N-gram language models, supporting Markov order up to 5, are trained on the following corpora:

  • Web page body text
  • Web page title text
  • Web page anchor text
  • Web search query text
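
As a rough illustration of what a backoff N-gram lookup does, the sketch below scores a word against a tiny hand-made table. It is not the service's implementation: the tables, the values in them, the `UNKNOWN_LOGPROB` floor, and the function name `conditional_log10` are all invented for this example. The lookup tries the longest available context first and, when that N-gram is missing, adds the context's backoff weight and retries with a shorter context.

```python
# Toy backoff tables: every value here is invented for illustration.
# LOGPROB maps an n-gram (tuple of words) to log10 P(last word | earlier words).
LOGPROB = {
    ("the",): -1.0,
    ("cat",): -2.0,
    ("sat",): -2.5,
    ("the", "cat"): -0.5,
}
# BACKOFF maps a context (tuple of words) to its log10 backoff weight.
BACKOFF = {
    ("the",): -0.3,
    ("cat",): -0.2,
}
UNKNOWN_LOGPROB = -5.0  # assumed floor for words missing from the toy tables


def conditional_log10(word, context, order=5):
    """Return log10 P(word | context) using at most order - 1 context words."""
    context = tuple(context)[-(order - 1):] if order > 1 else ()
    penalty = 0.0
    while True:
        ngram = context + (word,)
        if ngram in LOGPROB:
            return penalty + LOGPROB[ngram]
        if not context:
            return penalty + UNKNOWN_LOGPROB
        # Back off: add the context's backoff weight, then drop its oldest word.
        penalty += BACKOFF.get(context, 0.0)
        context = context[1:]


print(conditional_log10("cat", ["the"]))         # bigram found directly
print(conditional_log10("sat", ["the", "cat"]))  # backs off to the unigram
```

The `order` argument mirrors the Markov order mentioned above: a lower order simply truncates the context before the lookup starts.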

The Web LM REST API supports four lookup operations:

  1. Joint (log10) probability of a sequence of words.
  2. Conditional (log10) probability of one word given a sequence of preceding words.
  3. List of words (completions) most likely to follow a given sequence of words.
  4. Word breaking of strings that contain no spaces.
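
To make the relationship between these operations concrete, here is a minimal local sketch over a toy model. It does not call the service, and every table, probability, and function name in it is an assumption made for illustration. For brevity it conditions on a single preceding word, whereas the service supports longer contexts. Joint probability is computed with the chain rule as a sum of conditional log10 probabilities, completions are ranked by conditional probability, and word breaking picks the best-scoring split with dynamic programming.

```python
import math

# Toy model: all probabilities are invented for illustration.
# BIGRAM[prev][word] = P(word | prev); "<s>" marks the start of the sequence.
BIGRAM = {
    "<s>": {"hello": 0.5, "world": 0.1},
    "hello": {"world": 0.4, "there": 0.2, "kitty": 0.1},
}
UNIGRAM = {"hello": 0.01, "world": 0.01, "hell": 0.001, "o": 0.0001}
FLOOR = 1e-8  # assumed probability for anything missing from the toy tables


def conditional_log10(word, prev):
    """log10 P(word | prev) from the toy bigram table."""
    return math.log10(BIGRAM.get(prev, {}).get(word, FLOOR))


def joint_log10(words):
    """Chain rule: the joint log10 probability is the sum of conditionals."""
    total, prev = 0.0, "<s>"
    for word in words:
        total += conditional_log10(word, prev)
        prev = word
    return total


def next_words(prev, max_candidates=2):
    """Completions: words ranked by conditional probability after prev."""
    ranked = sorted(BIGRAM.get(prev, {}).items(), key=lambda kv: -kv[1])
    return [word for word, _ in ranked[:max_candidates]]


def break_into_words(text):
    """Word breaking: best-scoring split of text under the unigram table."""
    # best[i] holds (score, words) for the best segmentation of text[:i].
    best = [(0.0, [])] + [(-math.inf, None)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if piece in UNIGRAM and best[start][1] is not None:
                score = best[start][0] + math.log10(UNIGRAM[piece])
                if score > best[end][0]:
                    best[end] = (score, best[start][1] + [piece])
    return best[len(text)][1]  # None when no segmentation exists


print(joint_log10(["hello", "world"]))
print(next_words("hello"))
print(break_into_words("helloworld"))
```

Working in log10 space turns products of small probabilities into sums, which avoids floating-point underflow on long sequences.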

Getting Started

  1. Subscribe to the service.
  2. Download the SDK.
  3. Run the SDK sample code.
  4. Consult the API Reference for further details, including code snippets in a variety of languages.
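
Once you have a subscription key, a lookup is an HTTP request. The sketch below only builds such a request with the Python standard library and does not send it. The base URL, the operation name, the `model` and `order` query parameters, the header name, and the payload shape are all assumptions made for illustration; take the real values from the API Reference.

```python
import json
import urllib.parse
import urllib.request

# Every name below is an assumption for illustration; confirm the real
# values in the API Reference and on your subscription page.
BASE_URL = "https://example.invalid/text/weblm/v1.0"  # hypothetical endpoint
KEY_HEADER = "Ocp-Apim-Subscription-Key"              # assumed header name


def build_request(operation, payload, subscription_key, model="body", order=5):
    """Build (but do not send) a POST request for one lookup operation."""
    query = urllib.parse.urlencode({"model": model, "order": order})
    return urllib.request.Request(
        url=f"{BASE_URL}/{operation}?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", KEY_HEADER: subscription_key},
        method="POST",
    )


request = build_request(
    "calculateJointProbability",       # hypothetical operation name
    {"queries": ["this is a test"]},   # hypothetical payload shape
    subscription_key="YOUR-KEY-HERE",
)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen` after replacing the placeholder endpoint and key with real ones.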

Underlying Technology

The following paper provides details on the development of these language models, and should be cited in research publications that utilize this service:

Click here for a current list of papers citing this work.

