Knowledge Mining API

Established: February 1, 2016

To better meet the requirements of different scenarios, we built a set of knowledge mining APIs using technologies developed for Satori Pipeline, Bing QnA, and our Enterprise Dictionary project. We are pleased to announce the first internal release of these knowledge mining APIs for knowledge users within MSRA. We welcome feedback from anyone using them and would like to customize the solutions for you if they do not satisfy your knowledge needs. External users who would like to try these APIs should contact us directly.

The APIs in this round of internal release include:

  1. Document Converter
    1. Input: document in a supported format (Word, PowerPoint, OneNote, Web Page, PDF, etc.)
    2. Output: XML file of input document
    3. Detail: Office File Extractor Web API
  2. Acronym Mining (entity discovery)
    1. Input: document in target format
    2. Output: acronym-expansion pairs
    3. Detail: Acronym Web API
  3. Entity Definition Mining (entity discovery and enrichment)
    1. Input: document in target format, target entity (optional)
    2. Output: definition of entities
    3. Detail: Definition Web API
  4. Entity Conceptualization (entity semantic representation and relation mining)
    1. Input: target entity or concept
    2. Output: top concepts for this entity, or top entities for this concept
    3. Detail: Concept Web API
  5. Document Tagging
    1. Input: document in target format
    2. Output: semantic tags (phrases)
    3. Detail: Tagging Web API
  6. Table Mining (entity discovery and relation mining; NER supports only the people-project relation in this version)
    1. Input: document in target format containing tables (explicit, visual, or logical tables)
    2. Output: a list of objects that reflect the relations between people and projects
    3. Detail: Work On Web API
  7. QA Pair Mining
    1. Input: document in target format
    2. Output: explicit QA pairs, implicit QA pairs
    3. Detail: QA Extractor Web API

Figure: Data flow among these APIs.

Note: to use our APIs, callers must set webRequest.UseDefaultCredentials = true in their code so that the request is sent with the caller's default credentials.
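
The note above can be illustrated with a minimal C# sketch. It assumes the APIs are called over HTTP through .NET's HttpWebRequest. The endpoint URL and the KnowledgeMiningClient class are hypothetical placeholders, not part of the released APIs, and the real addresses should be taken from the relevant Web API detail page.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class, for illustration only.
public static class KnowledgeMiningClient
{
    // Builds a request that is sent with the signed-in user's default credentials.
    public static HttpWebRequest CreateRequest(string endpointUrl)
    {
        var webRequest = (HttpWebRequest)WebRequest.Create(endpointUrl);
        // The setting required by the note above.
        webRequest.UseDefaultCredentials = true;
        return webRequest;
    }

    // Sends the request and returns the response body as text.
    public static string ReadResponse(HttpWebRequest webRequest)
    {
        using (var response = (HttpWebResponse)webRequest.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumed placeholder URL; no request is sent until ReadResponse is called.
        var request = CreateRequest("http://example.invalid/knowledge/acronym");
        Console.WriteLine(request.UseDefaultCredentials);
    }
}
```

Main only builds the request and prints the flag, so the sketch can be run without network access. Calling ReadResponse(request) against a real endpoint would perform the authenticated call.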

Contacts


Zhongyuan Wang

Dawei Zhang

Lei Ji

Jun Yan

Wei-Ying Ma

Group

  Data Mining and Enterprise Intelligence Group, MSRA

People

Lei Ji

Senior Researcher