Microsoft Research Cambridge

Machine Intelligence

Advanced machine learning research, grounded in trust, efficiency, and capability.

The Machine Intelligence team at MSR Cambridge (UK) is dedicated to foundational machine-learning research, guided by the principles of responsible AI, collaboration, and scientific excellence. Our work is grounded in trust, capability, and efficiency, and we are deeply engaged in collaborations that build on these foundations across systems, the sciences, and human-centred AI.

Our workstreams

Cognition:  We are advancing the reasoning capabilities of generative AI through principled machine learning approaches that combine NLP, formal methods, logic, and statistics. By uniting experts across disciplines, we aim to build AI systems that reason reliably, generalize effectively, and operate safely in high-stakes settings. Our work balances theoretical insight with empirical validation to ensure robustness, interpretability, and alignment with human values. 

Memory:  We are developing models of memory that make factual knowledge in large language models transparent, controllable, and fully traceable. Our approach enables precise, provenance-aware knowledge infusion with dynamic editing and access control, allowing models to distinguish grounded from sourceless outputs. In collaboration with Microsoft Research and Copilot teams, we bridge fundamental research and real-world deployment, advancing both scientific understanding and product impact. 

Efficient AI:  We are developing more efficient AI systems by automating quantization through a compiler-based approach grounded in programming language theory, enabling scalable and energy-efficient on-device inference. Our technology powers real-world applications like Copilot+ PCs (via Phi Silica) and the AI Toolkit in Visual Studio Code. In parallel, we are advancing quantization methods and designing next-generation optimizers based on first-principles analyses of learning dynamics to achieve more principled, efficient model training. 
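The core idea behind quantization can be sketched with a minimal symmetric int8 scheme; this is a standalone illustration of the technique, not the team's compiler-based pipeline:

```python
import numpy as np


def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 codes and the scale."""
    return q.astype(np.float32) * scale


w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)  # close to w, within half a quantization step
```

Storing `q` instead of `w` cuts memory 4x versus float32 and enables integer arithmetic on device; the round-to-nearest error is bounded by half the scale, which is why an automated, whole-program choice of scales matters for accuracy.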

Diffusion:  Diffusion Language Models (DLMs) are a promising alternative to autoregressive models, offering potentially higher generative quality but facing challenges with scalability and efficiency. Building on prior image diffusion research using Fourier domain analysis, we aim to enhance DLM training and inference by applying similar statistical insights to language.
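The forward (noising) half of a Gaussian diffusion process can be sketched as follows; the linear beta schedule and step count are illustrative assumptions, and a DLM would apply such a process to language representations rather than raw vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule over T steps (an assumed choice for illustration).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factor


def forward_noise(x0: np.ndarray, t: int) -> np.ndarray:
    """Sample x_t from q(x_t | x_0): scaled signal plus Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


x0 = rng.standard_normal(8)          # stand-in for a token embedding
x_mid = forward_noise(x0, T // 2)    # partially noised
x_end = forward_noise(x0, T - 1)     # nearly pure noise
```

Training teaches a model to invert this corruption step by step; frequency-domain analyses of the kind used in image diffusion study how different components of the signal are destroyed at different noise levels.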