UniLM – Unified Language Model Pre-training
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.
Discover an index of datasets, SDKs, APIs and open-source tools developed by Microsoft researchers and shared with the global academic community below. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
The Confidential Consortium Framework (CCF) is an open-source framework for building a new category of secure, highly available, and performant applications that focus on multi-party compute and data. While not limited just to blockchain applications,…
Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction".
Petridish is a neural architecture search (NAS) algorithm that iteratively adds shortcut connections to existing network layers. The added shortcut connections effectively perform gradient boosting on the augmented layers. The proposed algorithm is motivated by…
Sign language recognition is a challenging and often underestimated problem comprising multi-modal articulators (handshape, orientation, movement, upper body and face) that integrate asynchronously on multiple streams. Learning powerful statistical models in such a scenario requires…
TSVD is an easy-to-use tool that efficiently detects thread-safety violations (e.g., data races) in .NET applications. It instruments application binaries to significantly increase the chance of finding such violations with existing tests.
A type of Bayesian Neural Network which has a sparsity-inducing prior distribution in order to help interpret the learned weights. Particularly useful in the domain of healthcare but scales to many other domains, as it…
Microsoft Icecaps is a new open-source NLP toolkit featuring pre-trained models and an emphasis on conversational scenarios.
FishStore is a new ingestion and storage layer for flexible- and fixed-schema datasets. It allows you to dynamically register complex predicates over the data, to define interesting subsets of the data. Such predicates are called…