An index of datasets, SDKs, APIs and other open source code created by Microsoft researchers and shared with the broader academic community. We also maintain a collection highlighting some of the tools you’ll find here.
Chart Reader
Chart Reader is a web-based accessibility engine that renders accessible visualizations so that screen reader users can read and better understand the visualizations and their underlying data.
Analyzing PII Leakage
This repository contains the official code for our IEEE S&P 2023 paper using GPT-2 language models and Flair Named Entity Recognition (NER) models. It allows fine-tuning (i) undefended, (ii) differentially-private and (iii) scrubbed language models…
syntheseus
Syntheseus is a package for retrosynthetic planning. It contains implementations of common search algorithms and a simple API to wrap custom reaction models and write custom algorithms. It is meant to allow for simple benchmarking…
Temporal Vision-Language Processing (BioViL-T)
BioViL-T is a vision-language model trained on sequences of biomedical image and text data at scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes. The…
SimpleRacerResearchPlatform
A simple racer research platform that showcases various imitation learning models with a web racing game.
Revizor: a fuzzer to search for microarchitectural leaks in CPUs
This is Revizor, a microarchitectural fuzzer. Instead of finding bugs in programs, Revizor searches for microarchitectural vulnerabilities in CPUs. What is a microarchitectural vulnerability? In the context of Revizor, it is a violation of our expectations…