Lost in Conversation
Lost in Conversation is a code repository for benchmarking LLMs on multi-turn task completion and for reproducing the experiments in the accompanying paper: “LLMs Get Lost in Multi-Turn Conversation”.
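The core setting the paper studies is task completion when instructions arrive gradually over several conversation turns rather than in one fully specified prompt. As a minimal illustration (a hypothetical sketch, not the repository's actual API), a multi-turn episode can be modeled as revealing instruction "shards" one turn at a time, querying the assistant after each shard, and scoring only the final answer; the `run_multi_turn_episode`, `toy_assistant`, and `is_correct` names below are invented for this example.

```python
# Hypothetical sketch of a multi-turn benchmarking loop; not the
# repository's real interface. Instruction "shards" are revealed one
# per turn, and only the final reply is scored.

from typing import Callable, Dict, List

def run_multi_turn_episode(
    shards: List[str],
    assistant: Callable[[List[Dict[str, str]]], str],
    is_correct: Callable[[str], bool],
) -> Dict:
    """Reveal shards turn by turn; return the scored final answer."""
    messages: List[Dict[str, str]] = []
    reply = ""
    for shard in shards:
        messages.append({"role": "user", "content": shard})
        reply = assistant(messages)  # the model sees the full history
        messages.append({"role": "assistant", "content": reply})
    return {
        "turns": len(shards),
        "final_answer": reply,
        "success": is_correct(reply),
    }

# Toy stand-in for an LLM: echoes everything the user has said so far.
def toy_assistant(messages: List[Dict[str, str]]) -> str:
    return " ".join(m["content"] for m in messages if m["role"] == "user")

result = run_multi_turn_episode(
    shards=["Write a function", "in Python", "that adds two numbers"],
    assistant=toy_assistant,
    is_correct=lambda answer: "Python" in answer,
)
```

Swapping `toy_assistant` for a real model call (and `is_correct` for a task-specific checker) turns the same loop into a multi-turn evaluation harness, which can then be compared against giving the model all shards concatenated in a single turn.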