Research Tools: code, datasets, & models

Tool

INTREPID

INTREPID (INTeractive learning via REPresentatIon Discovery) is a library of interactive learning algorithms that learn a representation (or latent state) from observational data in order to complete their tasks.

GitHub

Tool

Chart Reader

Chart Reader is a web-based accessibility engine that renders accessible visualizations, so that screen reader users can read and better understand the visualizations and their underlying data.

GitHub

Tool

Analyzing PII Leakage

This repository contains the official code for our IEEE S&P 2023 paper using GPT-2 language models and Flair Named Entity Recognition (NER) models. It allows fine-tuning (i) undefended, (ii) differentially-private and (iii) scrubbed language models…

GitHub

Tool

Syntheseus

Syntheseus is a package for retrosynthetic planning. It contains implementations of common search algorithms and a simple API to wrap custom reaction models and write custom algorithms. It is meant to allow for simple benchmarking…

GitHub

Tool

Guidance

Guidance enables you to control modern language models more effectively and efficiently than traditional prompting or chaining. Guidance programs allow you to interleave generation, prompting, and logical control into a single continuous flow matching how…

GitHub

Tool

Temporal Vision-Language Processing (BioViL-T)

BioViL-T is a Vision-Language model trained on sequences of biomedical image and text data at scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes. The…

Access Publication

Tool

SimpleRacerResearchPlatform

A simple racer research platform that showcases various imitation learning models with a web racing game.

GitHub

Tool

Revizor: a fuzzer to search for microarchitectural leaks in CPUs

This is Revizor, a microarchitectural fuzzer. Instead of finding bugs in programs, Revizor searches for microarchitectural vulnerabilities in CPUs. What is a microarchitectural vulnerability? In the context of Revizor, it is a violation of our expectations…

GitHub Publication

Tool

Semantic Kernel

Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages. The SK extensible programming model combines natural language semantic functions, traditional code native functions, and embeddings-based…

GitHub

Tool

Biomedical Visual-Language Processing (BioViL)

BioViL is a machine learning model trained on biomedical vision and language datasets at scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes.

Access Publication