Aurora
Aurora is a machine learning model that can predict atmospheric variables, such as temperature. It is a foundation model, which means that it was first generally trained on a lot of data and then can…
Below is an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
vAttention is a memory manager for KV-cache in LLM serving systems. It decouples the allocation of virtual memory and physical memory using the CUDA virtual memory APIs. This approach enables allocating physical memory on demand…
RepoClassBench (RCB) is a repository-level code-generation benchmark. Retrieve-RepoTools-Reflect (RRR) is a framework for code generation that uses large language models (LLMs) with static-analysis tools in an agent setup.
Trace is a new AutoDiff-like tool for training AI systems end-to-end with general feedback (like numerical rewards or losses, natural language text, compiler errors, etc.). Trace generalizes the back-propagation algorithm by capturing and propagating an…
LongRoPE is a novel method that extends the context window of pre-trained LLMs to an impressive 2048k tokens by non-uniformly rescaling RoPE positional embeddings. LongRoPE has been integrated into Microsoft Phi-3.
An implementation of data encoding and decoding using DNA Tags and paper tickets. The api directory contains implementations for REST API endpoints to enable a DNA Tagging application. The test directory contains configurations and tests…
A public framework for time-series forecasting with spiking neural networks (SNNs).
The Intelligence Toolkit is a suite of interactive workflows for creating AI intelligence reports from real-world data sources. The toolkit is designed to help users identify patterns, answers, relationships, and risks within complex datasets, with…