Research Tools: code, datasets, & models

Tool

TamGen

TamGen is a transformer-based chemical language model for developing target-specific drug compounds. Research shows that TamGen can also optimize existing molecules by designing target-aware molecule fragments, potentially enabling the discovery of novel compounds that build…

Access

Tool

TileLang

TileIR (tile-ir) is a concise domain-specific IR designed to streamline the development of high-performance GPU/CPU kernels (e.g., GEMM, Dequant GEMM, FlashAttention, LinearAttention). By employing a Pythonic syntax with an underlying compiler infrastructure on top of…

GitHub

Tool

MMLU-CF

Paper: “MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark”

GitHub

Tool

RAD-DINO model

RAD-DINO is a vision transformer model trained to encode chest X-rays using the self-supervised learning method DINOv2. RAD-DINO is described in detail in RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision (F.…

Access Project

Tool

MAIRA-2 model

MAIRA-2 is a multimodal transformer designed for the generation of grounded or non-grounded radiology reports from chest X-rays. It is described in more detail in MAIRA-2: Grounded Radiology Report Generation (S. Bannur, K. Bouzid et al.,…

Access Project

Tool

RadFact: An LLM-based Evaluation Metric for AI-generated Radiology Reporting

RadFact is a framework for the evaluation of model-generated radiology reports given a ground-truth report, with or without grounding. Leveraging the logical inference capabilities of large language models, RadFact is not a single number but a suite of…

GitHub Project

Tool

Cheap Permutations

This repository replicates the experiments of the paper “Cheap Permutation Testing”.

GitHub

Tool

KBLaM: Knowledge Base augmented Language Model

KBLaM is a new method for augmenting LLMs with external knowledge. Unlike Retrieval-Augmented Generation, KBLaM eliminates external retrieval modules, and unlike in-context learning, its computational overhead scales linearly with knowledge base (KB) size rather than quadratically.
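
The linear-versus-quadratic contrast can be illustrated with a rough count of attention score entries. This is a minimal sketch under a simplified cost model that is an assumption here, not taken from the KBLaM repository: in-context learning is modeled as full self-attention over KB tokens plus prompt tokens, while the linear case is modeled as only the prompt tokens issuing queries over the KB entries and the prompt. The function names and sizes are hypothetical.

```python
def full_attention_pairs(kb_tokens: int, prompt_tokens: int) -> int:
    """Query-key pairs when KB text sits in the context and every token attends to every token."""
    total = kb_tokens + prompt_tokens
    return total * total


def prompt_only_query_pairs(kb_entries: int, prompt_tokens: int) -> int:
    """Query-key pairs when only prompt tokens issue queries (assumed simplified model)."""
    return prompt_tokens * (kb_entries + prompt_tokens)


PROMPT_TOKENS = 100  # hypothetical prompt length, held fixed

for kb_size in (1_000, 2_000, 4_000):
    quadratic = full_attention_pairs(kb_size, PROMPT_TOKENS)
    linear = prompt_only_query_pairs(kb_size, PROMPT_TOKENS)
    print(f"KB size {kb_size:>5}: full attention {quadratic:>12,} pairs, "
          f"prompt-only queries {linear:>9,} pairs")
```

With the prompt length fixed, each doubling of the KB adds a proportional number of pairs in the second model, while the first grows with the square of the total context length.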

GitHub

Tool

Magentic-One

Magentic-One is a generalist multi-agent system created to address intricate web and file-based tasks. By utilizing an intelligent Orchestrator alongside specialized agents, it facilitates the automation of complex, multi-step activities across various environments.

Access Video

Tool

PIKE-RAG

The PIKE-RAG (sPecIalized KnowledgE and Rationale Augmented Generation) framework consists mainly of several basic modules, including document parsing, knowledge extraction, knowledge storage, knowledge retrieval, knowledge organization, knowledge-centric reasoning, and task decomposition and coordination while building coherent…

GitHub