Research Tools: code, datasets, & models

Tool

RepoClassBench

RepoClassBench (RCB) is a repository-level code-generation benchmark. Retrieve-RepoTools-Reflect (RRR) is a framework for code generation that uses large language models (LLMs) with static-analysis tools in an agent setup.

GitHub Publication

Tool

Trace

Trace is a new AutoDiff-like tool for training AI systems end-to-end with general feedback (like numerical rewards or losses, natural language text, compiler errors, etc.). Trace generalizes the back-propagation algorithm by capturing and propagating an…

GitHub Publication

Tool

LongRoPE

LongRoPE is a novel method that extends the context window of pre-trained LLMs to an impressive 2048k tokens by non-uniformly rescaling RoPE positional embeddings. LongRoPE has been integrated into Microsoft Phi-3.

GitHub Publication
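
As a rough illustration of what non-uniform rescaling of RoPE means, the sketch below divides the rotation angle of each dimension pair by its own factor rather than by one shared factor. The function names, the `rescale` parameter, and the factor values are hypothetical and chosen only for illustration; they are not taken from the LongRoPE repository, and how LongRoPE itself chooses its factors is not shown here.

```python
import math


def rope_angles(position, head_dim, base=10000.0, rescale=None):
    """Rotation angle for each dimension pair at a given token position.

    `rescale` holds one divisor per dimension pair (hypothetical values);
    all ones reproduces plain RoPE.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    half = head_dim // 2
    if rescale is None:
        rescale = [1.0] * half
    if len(rescale) != half:
        raise ValueError("need one rescale factor per dimension pair")
    angles = []
    for i in range(half):
        # Standard RoPE frequency for pair i.
        inv_freq = base ** (-2.0 * i / head_dim)
        # Non-uniform step: each pair is slowed down by its own factor.
        angles.append(position * inv_freq / rescale[i])
    return angles


def apply_rope(vec, angles):
    """Rotate consecutive (x, y) pairs of `vec` by the given angles."""
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out


# Made-up factors: leave the first pair untouched, stretch later pairs more.
factors = [1.0, 2.0, 4.0, 8.0]
plain = rope_angles(4096, head_dim=8)
scaled = rope_angles(4096, head_dim=8, rescale=factors)
rotated = apply_rope([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], scaled)
```

With a single shared factor every pair would be interpolated equally; per-pair factors let some dimensions keep their original resolution while others are stretched to cover longer positions.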

Tool

DNATagging

An implementation of data encoding and decoding using DNA Tags and paper tickets. The api directory contains implementations for REST API endpoints to enable a DNA Tagging application. The test directory contains configurations and tests…

GitHub

Tool

UniPrompt

UniPrompt provides a unified interface to prompt optimization. We have distilled common functions from different algorithms and provide a plug-n-play API to create new algorithms. It also provides an easy way to benchmark different prompt…

GitHub

Tool

SeqSNN

A public framework for time-series forecasting with spiking neural networks (SNNs).

GitHub Publication Publication

Tool

Intelligence Toolkit

The Intelligence Toolkit is a suite of interactive workflows for creating AI intelligence reports from real-world data sources. The toolkit is designed to help users identify patterns, answers, relationships, and risks within complex datasets, with…

GitHub

Tool

MetaOpt: Towards efficient heuristic design with quantifiable and confident performance

MetaOpt is the first general-purpose and scalable tool that enables users to analyze a broad class of practical heuristics through easy-to-use abstractions. For more information, check out MetaOpt’s project webpage and…

GitHub Publication

Tool

VisEval

VisEval: An NL2VIS Benchmark. VisEval is a benchmark designed to evaluate visualization generation methods. This repository provides both the toolkit that supports the benchmarking and the data used for the benchmarks.

GitHub Publication

Tool

DOSA

A dataset of social artifacts from different Indian geographical subcultures. This repo hosts the code to run experiments on the DOSA dataset.

GitHub Publication