NCI-PID-PubMed Genomics Knowledge Base Completion Dataset
This dataset includes a database of regulation relationships among genes and corresponding textual mentions of pairs of genes in PubMed article abstracts. It was derived from the NCI PID Pathway Interaction Database, and the textual…
Intelligent Editing
The Intelligent Editing Project seeks to apply neural networks and other modern machine learning techniques to provide editorial assistance. We look beyond traditional grammatical error checking to focus on helping writers by providing them with…
Chatbots and Conversation As A Platform (CAAP)
At the Microsoft Build 2016 event, Microsoft CEO Satya Nadella said that chatbots, as the next big thing, will have “as profound an impact as previous shifts we’ve had.” The past paradigm shifts include the graphical user interface, the web browser and…
Knowledge Mining API
To better satisfy the requirements of different scenarios, we built a set of knowledge mining APIs using technologies we developed for the Satori Pipeline, Bing QnA and our Enterprise Dictionary project. We are pleased to announce…
Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems
This is a public release of the dataset corresponding to the paper “Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems” that appeared in EMNLP 2016. This set contains only the implicitly supervised examples.…
Digital Me
Digital Me: Toward Digitalizing Everybody in the World. Introduction: Artificial Intelligence (AI) applications such as chatbots and software assistants are attracting increasing attention from both academia and industry. Most existing work aims to assist…
Enterprise Dictionary
1. Project Introduction “Everyday we are faced with a sea of acronyms, ever changing group structures, and fast-tracked projects.” Currently, collation and curation of corporate knowledge is a painstaking manual process. We seek to move…
Conceptualization
The Conceptualization model aims to map entities mentioned in text to semantic concept categories, each with a probability that may depend on the context surrounding the entity. As an example, “Microsoft” could be automatically mapped to…