Discover an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community below. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
Checked C Specification
This is a detailed specification for the Checked C extension that explains the design in depth.
Checked C clang compiler
The Checked C extension to C is being implemented in a fork of the clang compiler. You can download the latest version of the Checked C compiler for Windows from this GitHub link.…
DPU Utilities
This contains a set of utilities used across projects of the DPU team.
Longitudinal, daily, per-county activity periods of aggregated Twitter users
This data set consists of timelines of geo-located Twitter activity per U.S. county over ~4 years, from Jan 1, 2011 through April 30, 2014. Each county-timeline represents the amount of Twitter activity seen coming from…
Common Runtime for Applications (CRA)
Common Runtime for Applications (CRA) is a software layer (library) that makes it easy to create and deploy distributed dataflow-style applications on top of resource managers such as Kubernetes and YARN, or in stand-alone cluster execution.
Space Partition Tree and Graph (SPTAG)
SPTAG (Space Partition Tree And Graph) is a library for large-scale approximate nearest neighbor search over vectors. It assumes that the samples are represented as vectors and that the vectors can be compared by…
The SubseasonalRodeo Dataset
A benchmark dataset for training and evaluating subseasonal forecasting systems—systems predicting temperature or precipitation 2-6 weeks in advance—in the western contiguous United States.
Charticulator
Charticulator is an interactive authoring tool that enables the creation of bespoke and reusable chart layouts. Most existing chart construction interfaces require authors to choose from predefined chart layouts, thereby precluding the construction of novel…
Pseudo-Task MAML
This is PointSQL, the source code for "Natural Language to Structured Query Generation via Meta-Learning" and "Pointing Out SQL Queries From Text" from Microsoft Research. We present the setup for the WikiSQL experiments.