Discover below an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
Chart Reader
Chart Reader is a web-based accessibility engine that renders accessible visualizations so that screen reader users can read and better understand the visualizations and their underlying data.
Analyzing PII Leakage
This repository contains the official code for our IEEE S&P 2023 paper using GPT-2 language models and Flair Named Entity Recognition (NER) models. It allows fine-tuning (i) undefended, (ii) differentially private and (iii) scrubbed language models…
Syntheseus
Syntheseus is a package for retrosynthetic planning. It contains implementations of common search algorithms and a simple API to wrap custom reaction models and write custom algorithms. It is meant to allow for simple benchmarking…
Temporal Vision-Language Processing (BioViL-T)
BioViL-T is a vision-language model trained on sequences of biomedical image and text data at scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes. The…
SimpleRacerResearchPlatform
A simple racer research platform that showcases various imitation learning models with a web racing game.
Revizor: a fuzzer to search for microarchitectural leaks in CPUs
This is Revizor, a microarchitectural fuzzer. Instead of finding bugs in programs, Revizor searches for microarchitectural vulnerabilities in CPUs. What is a microarchitectural vulnerability? In the context of Revizor, it is a violation of our expectations…
Semantic Kernel
Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages. The SK extensible programming model combines natural language semantic functions, traditional code native functions, and embeddings-based…
Biomedical Visual-Language Processing (BioViL)
BioViL is a machine learning model trained on biomedical vision and language datasets at scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes.