Below is an index of datasets, SDKs, APIs, and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
ParrotServe
“Parrot: Efficient Serving of LLM-based Applications with Semantic Variable.” Parrot is a distributed serving system for LLM-based applications. The Parrot API with Semantic Variable is served by a centralized cluster manager called ServeCore, which manages…
GenAIScript
Scripting environment with convenient tooling for file ingestion, prompt development and structured data extraction.
promptbase
promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models like GPT-4. We currently host scripts demonstrating the Medprompt methodology, including examples of how we…
Phi-1.5
The language model phi-1.5 is a Transformer with 1.3 billion parameters. It was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts. When…
Phi-1
The language model phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training involved a variety of data sources, including subsets of Python code from The Stack v1.2, Q&A content…
InferredBugs
InferredBugs is a metadata-rich dataset of bugs and fixes in the Java and C# programming languages, extracted using Infer (for Java) and InferSharp (for C#). The dataset has been constructed by systematically analyzing open-source repositories, scrutinizing…
NoFunEval
This repository hosts the official code and data artifact for the paper “NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness”. The work is a comprehensive evaluation of code language models on real-world code…