Below, discover an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
COMPASS
This repository contains the PyTorch implementation of the COMPASS model proposed in our paper, "COMPASS: Contrastive Multimodal Pretraining for Autonomous Systems." COMPASS aims to build general-purpose representations for autonomous systems from multimodal observations. Given…
Hippocorpus
To examine the cognitive processes of remembering and imagining and their traces in language, we introduce Hippocorpus, a dataset of 6,854 English diary-like short stories about recalled and imagined events. Using a crowdsourcing framework, we…
Carbon Insight
A platform to display the carbon neutralization information for researchers, decision-makers, and other participants in the community.
Poultry Barn Mapping
A repository for training models from high-resolution aerial imagery and a dataset of predicted poultry barns across the United States.
Solar Farms Mapping
The Solar Farms Mapping release is an artificial intelligence dataset of solar energy locations in India, built with a spatially explicit machine learning model that maps utility-scale solar projects across the country using freely available satellite imagery.
FLUTE
FLUTE (Federated Learning Utilities for Testing and Experimentation) is a platform for conducting high-performance federated learning simulations.
COVID19-CT segmentation
This project provides code to train semantic segmentation models for pulmonary lesion segmentation from computed tomography (CT) scans. Furthermore, we present a multitask model for joint segmentation of different classes of pulmonary lesions present in…
FastSeq
FastSeq provides efficient implementations of popular sequence models (e.g., BART, ProphetNet) for text generation, summarization, translation, and similar tasks. It automatically optimizes inference speed on top of popular NLP toolkits (e.g., FairSeq and HuggingFace Transformers) without accuracy loss.