Discover below an index of datasets, SDKs, APIs and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
iBox – Internet in a Box
The iBox (Internet in a Box) project is an ongoing effort at Microsoft Research India on enabling data-informed network simulation. Currently, we use data in the form of per-packet traces gathered at senders and receivers…
Debiasing Item-to-Item Recommendations With Small Annotated Datasets Release
Implementation of “Debiasing Item-to-Item Recommendations With Small Annotated Datasets” (RecSys ’20)
rankerEval
rankerEval – a Python library for evaluating rankings. A fast numpy-based implementation of ranking metrics for information retrieval and recommendation.
Archai – Reproducible Rapid Research for Network Architecture Search
Archai is a platform for Neural Architecture Search (NAS) that allows you to generate efficient deep networks for your applications. Archai aspires to accelerate NAS research by enabling easy mix and match between different techniques…
TaxiNLI
Taxonomic Re-annotation of NLI Examples in MultiNLI Dataset | Also on the Microsoft Download Center: https://www.microsoft.com/en-us/download/details.aspx?id=102127
CodeXGLUE
CodeXGLUE is a benchmark dataset and open challenge for code intelligence. It includes a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark…
MPNet
MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-training method for language understanding tasks. It solves the problems of MLM (masked…