Project Catapult

Project Catapult is the code name for a Microsoft Research (MSR) enterprise-level initiative that is transforming cloud computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.

Project Brainwave leverages Project Catapult to enable real-time AI

Project Catapult is transforming cloud computing

We are living in an era where information grows exponentially and creates the need for massive computing power to process that information. At the same time, advances in silicon fabrication technology are approaching theoretical limits, and Moore’s Law has run its course. Chip performance improvements no longer keep pace with the needs of cutting-edge, computationally expensive workloads like software-defined networking (SDN) and artificial intelligence (AI). To create a faster, more intelligent cloud that keeps up with growing appetites for computing power, datacenters need to add other processors distinctly suited for critical workloads.

FPGAs offer a unique combination of speed and flexibility

Since the earliest days of cloud computing, we have answered the need for more computing power by innovating with special processors that give CPUs a boost. Project Catapult began in 2010 when a small team, led by Doug Burger and Derek Chiou, anticipated the paradigm shift to post-CPU technologies. We began exploring alternative architectures and specialized hardware such as graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and custom application-specific integrated circuits (ASICs). We soon realized that FPGAs offer a unique combination of speed, programmability, and flexibility ideal for delivering cutting-edge performance and keeping pace with rapid innovation. Though FPGAs have been in use for decades, MSR pioneered their use in cloud computing, proving that FPGAs could deliver efficiency and performance without the cost, complexity, and risk of developing custom ASICs.

FPGA can perform line-rate computation

Project Catapult’s innovative board-level architecture is highly flexible. The FPGA can act as a local compute accelerator, an inline processor, or a remote accelerator for distributed computing. In this design, the FPGA sits between the datacenter’s top-of-rack (ToR) network switches and the server’s network interface card (NIC). As a result, all network traffic is routed through the FPGA, which can perform line-rate computation on even high-bandwidth network flows.
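The "bump in the wire" data path described above can be pictured as a simple pipeline. A toy sketch follows; the real processing happens in FPGA hardware, not software, and all names here are purely illustrative:

```python
# Toy model of the "bump in the wire" data path: every packet
# traveling from the ToR switch to the host NIC passes through
# an FPGA stage first. Names are illustrative only.

def fpga_stage(packet: bytes) -> bytes:
    """Inline transform applied at line rate (standing in for real
    per-packet work such as SDN rules or crypto offload)."""
    return packet.upper()  # placeholder for actual FPGA logic

def tor_to_nic(packets):
    """All traffic is routed through the FPGA stage before the NIC."""
    for pkt in packets:
        yield fpga_stage(pkt)

processed = list(tor_to_nic([b"flow-a", b"flow-b"]))
```

Because the FPGA sits directly on the network path, no packet can bypass the accelerator, which is what makes line-rate computation on every flow possible.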

The first hyperscale supercomputer

Today, nearly every new server in Microsoft datacenters integrates an FPGA into a unique distributed architecture, which creates an interconnected and configurable compute layer that extends the CPU compute layer. Using this acceleration fabric, we can deploy distributed hardware microservices (HWMS) with the flexibility to harness a scalable number of FPGAs—from one to thousands. Conversely, cloud-scale applications can leverage a scalable number of these microservices, with no knowledge of the underlying hardware. By coupling this approach with nearly a million Intel FPGAs deployed in our datacenters, we have built the world’s first hyperscale supercomputer, which can compute machine learning and deep learning algorithms with an unmatched combination of speed, efficiency, and scale.
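The hardware-microservices idea can be sketched in a few lines: an application asks the acceleration fabric for a service and a scale, and the fabric maps the request onto some number of FPGAs without exposing hardware details. The class and method names below are hypothetical; the real allocation logic is internal to Microsoft's datacenters:

```python
# Illustrative sketch of a hardware-microservices (HWMS) pool.
# An application requests a service by name and a scale factor;
# the fabric allocates FPGAs and returns an opaque handle, so the
# caller never sees the underlying hardware.

class AccelerationFabric:
    def __init__(self, total_fpgas: int):
        self.free = total_fpgas  # FPGAs not yet allocated

    def deploy(self, service: str, fpgas: int) -> dict:
        if fpgas > self.free:
            raise RuntimeError("not enough FPGAs free")
        self.free -= fpgas
        # opaque handle: the application sees a service, not hardware
        return {"service": service, "fpgas": fpgas}

fabric = AccelerationFabric(total_fpgas=1000)
ranking = fabric.deploy("dnn-inference", fpgas=8)
```

The key design point is the decoupling: the same application code works whether the service is backed by one FPGA or thousands.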

Leading datacenter transformation by using programmable hardware

Through Project Catapult, Microsoft is leading the industry’s datacenter transformation by using programmable hardware. We were the first to prove the value of FPGAs for cloud computing, first to deploy them at cloud scale, and, with Bing, first to use them to accelerate enterprise-level applications.

Project Brainwave to enable real-time AI

Our leadership in accelerated networking has delivered the world’s fastest cloud network. Today, Project Brainwave is leveraging Project Catapult to enable real-time AI, with blazing fast inferencing performance at a remarkably affordable cost. A growing team of MSR researchers and engineers, in very close partnership with engineering groups such as Bing, Azure Machine Learning, Azure Networking, Azure Cloud Server Infrastructure (CSI), and Azure Storage, continue to push the boundaries of accelerated cloud computing.

Project Catapult’s waves of innovation will continue


  • 2010: MSR demonstrated the first proof of concept to Bing leadership, with a proposal to use FPGAs at scale to accelerate Web search.
  • 2011: MSR researchers and Bing engineers developed the first prototype, identifying and accelerating computationally expensive operations in Bing’s IndexServe engine.
  • 2012: Project Catapult’s scale pilot of 1,632 FPGA-enabled servers was deployed to a datacenter, using an early architecture with a custom secondary network.
  • 2013: Results of the pilot demonstrated a dramatic improvement in search latency, running Bing decision-tree algorithms 40 times faster than CPUs alone, and proved the potential to speed up search even while reducing the number of servers. Bing leadership committed to putting Project Catapult in production.
  • 2014: The Catapult v2 architecture introduced the breakthrough of placing FPGAs as a “bump in the wire” on the network path. Work began on accelerating software-defined networking for Azure. Project Catapult’s seminal paper was published.
  • 2015: FPGA-enabled servers were deployed at scale in Bing and Azure datacenters, and Bing first used FPGAs in production to accelerate search ranking. This enabled a 50 percent increase in throughput, or a 25 percent reduction in latency.
  • 2016: Azure launched Accelerated Networking, using FPGAs to enable the world’s fastest cloud network. FPGAs became a default part of most Azure and Bing server SKUs. MSR began Project Brainwave, focused on accelerating AI and deep learning.
  • 2017: MSR and Bing launched hardware microservices, enabling one web-scale service to leverage multiple FPGA-accelerated applications distributed across a datacenter. Bing deployed the first FPGA-accelerated Deep Neural Network (DNN). MSR demonstrated that FPGAs can enable real-time AI, beating GPUs in ultra-low latency, even without batching inference requests.
  • 2018: Bing and Azure deployed new multi-FPGA appliances into datacenters, shifting the ratio of computing power between CPUs and FPGAs, with multiple Intel Arria 10 FPGAs in each server. MSR, Bing, and Azure Machine Learning partnered to bring Project Brainwave to production for both Microsoft engineering groups and third-party customers. Azure Machine Learning launched the preview of Hardware Accelerated Models, powered by Project Brainwave, delivering ultra-fast DNN performance with ResNet-50, at remarkably low cost—only 21 cents per million images during preview.

This is still the beginning. Project Brainwave is gaining traction across the company, with accelerated models in development for text, speech, vision, and other areas. The company-wide Project Catapult virtual team continues to innovate in deep learning, networking, storage, and other areas.


In the news

FPGAs and the Road to Reprogrammable HPC
Inside HPC | July 3, 2019

Intel FPGAs: Accelerating the Future
Intel Newsroom | May 15, 2018

4 Big Takeaways from Satya Nadella’s Talk at Microsoft Build
Fortune | May 7, 2018

I do so like AML and HAM
ZDNet | May 7, 2018

Microsoft Charts Its Own Path on Artificial Intelligence
WIRED | May 7, 2018

Microsoft launches Project Brainwave, its deep learning acceleration platform
TechCrunch | May 7, 2018

Microsoft is luring A.I. developers to its cloud by offering them faster chips
CNBC | May 7, 2018

Microsoft’s Project Brainwave brings fast-chip smarts to AI at Build conference
CNET | May 7, 2018

Microsoft* Turbocharges AI with Intel FPGAs. You Can, Too.
Intel AI Academy | May 7, 2018

Intel FPGAs Bring Power to Artificial Intelligence Microsoft Azure
Intel News Byte | May 7, 2018

Why Microsoft Has Bet on FPGAs to Infuse Its Cloud With AI, by Mary Branscombe
Data Center Knowledge | April 25, 2018

Intel FPGAs Accelerate Artificial Intelligence for Deep Learning in Microsoft’s Bing Intelligent Search
Intel Newsroom | March 26, 2018

Microsoft’s Brainwave makes Bing’s AI over 10 times faster
Venture Beat | March 26, 2018

Microsoft offloads networking to FPGA-powered NICs
The Register | January 8, 2018

Microsoft’s Project Catapult wins GeekWire’s “Innovation of the Year” award
ONMSFT | October 1, 2017

Chips Off the Old Block: Computers Are Taking Design Cues From Human Brains
NY Times | September 16, 2017

Microsoft FPGA Wins Versus Google TPUs for AI
Forbes | August 28, 2017

Microsoft is building its own AI hardware with Project Brainwave
Fortune | August 23, 2017

Microsoft’s Project Brainwave puts ‘real-time artificial intelligence’ into high-tech chips
GeekWire | August 22, 2017

Microsoft’s Configurable Cloud satisfies datacenters’ need for speed
InfoWorld | October 18, 2016

Why Microsoft Is Putting These Chips at the Center of Its Cloud
Fortune | October 17, 2016

Microsoft Bets its Future on a Reprogrammable Chip
Wired | September 25, 2016

Microsoft ‘Catapults’ geriatric Moore’s Law from certain death
The Register | June 16, 2014

Microsoft to implement ‘Catapult’ programmable processors in its datacenters
ZDNet | June 16, 2014

Microsoft Supercharges Bing Search with Programmable Chips
Wired | June 14, 2014

Microsoft blogs

Clouds, catapults and life after the end of Moore’s Law with Dr. Doug Burger

Some of the world’s leading architects are people that you’ve probably never heard of, and they’ve designed and built some of the world’s most amazing structures that you’ve probably never seen. Or at least you don’t think you have. One of these architects is Dr. Doug Burger, Distinguished Engineer at Microsoft Research NExT. And, if you use a computer, or store anything in the Cloud, you’re a beneficiary of the beautiful architecture that he, and people like him, work on every day.

Microsoft Research Podcast | May 9, 2018

Real-time AI: Microsoft announces preview of Project Brainwave

Every day, thousands of gadgets and widgets whoosh down assembly lines run by the manufacturing solutions provider Jabil, on their way into the hands of customers. Along the way, an automated optical inspection system scans them for any signs of defects, with a bias toward ensuring that all potential anomalies are detected. It then sends those parts off to be checked manually. The speed of operations leaves manual inspectors with just seconds to decide if the product is really defective, or not.

The AI Blog | May 7, 2018

Bing Launches More Intelligent Search Features

In December, we announced new intelligent search features which tap into advances in AI to provide people with more comprehensive answers, faster. Today, we’re excited to announce improvements to our current features, and new scenarios that get you to your answer faster. Since December we’ve received a lot of great feedback on our experiences; based on that, we’ve expanded many of our answers to the UK, improved our quality and coverage of existing answers, and added new scenarios.

Bing Blog | March 26, 2018

Maximize your VM’s Performance with Accelerated Networking – now generally available for both Windows and Linux

We are happy to announce that Accelerated Networking (AN) is generally available (GA) for Windows and the latest distributions of Linux, providing up to 30 Gbps in networking throughput, free of charge! AN provides consistent ultra-low network latency via Azure’s in-house programmable hardware and technologies such as SR-IOV.

Microsoft Azure Blog | January 5, 2018

Bing launches new intelligent search features, powered by AI

Today we announced new Intelligent Search features for Bing, powered by AI, to give you answers faster, give you more comprehensive and complete information, and enable you to interact more naturally with your search engine. Intelligent answers leverage the latest state of the art machine reading comprehension, backed by Project Brainwave running on Intel’s FPGAs, to read and analyze billions of documents to understand the web and help you more quickly and confidently get the answers you need.

Bing Blog | December 13, 2017

Microsoft unveils Project Brainwave for real-time AI

Today at Hot Chips 2017, our cross-Microsoft team unveiled a new deep learning acceleration platform, codenamed Project Brainwave. I’m delighted to share more details in this post, since Project Brainwave achieves a major leap forward in both performance and flexibility for cloud-based serving of deep learning models. We designed the system for real-time AI, which means the system processes requests as fast as it receives them, with ultra-low latency.

Microsoft Research Blog | August 22, 2017

The moonshot that succeeded: How Bing and Azure are using an AI supercomputer in the cloud

When we type in a search query, access our email via the cloud or stream a viral video, chances are we don’t spend any time thinking about the technological plumbing that is behind that instant gratification. Sitaram Lanka and Derek Chiou are two exceptions. They are engineers who spend their days thinking about ever-better and faster ways to get you all that information with the tap of a finger, as you’ve come to expect.

The AI Blog | October 17, 2016

Project Catapult servers available to academic researchers

At this year’s Supercomputing 2015 Conference in Austin, Texas, Microsoft is announcing the availability of Project Catapult clusters to academic researchers through the Texas Advanced Computing Center (TACC) at The University of Texas at Austin. Project Catapult, a Microsoft research venture, offers a groundbreaking way to vastly improve the performance and energy efficiency of datacenter workloads.

Microsoft Research Blog | November 12, 2015

Machine Learning Gets Big Boost from Ultra-Efficient Convolutional Neural Network Accelerator

I’m excited to highlight a breakthrough in high-performance machine learning from Microsoft researchers. Before describing our results, some background may be helpful. The high-level architecture of datacenter servers has been generally stable for many years, based on some combination of CPUs, DRAM, Ethernet, and disks (with solid-state drives a more recent addition). While the capacities and speeds of the components—and the datacenter scale—have grown, the basic server architecture has evolved slowly.

Microsoft Research Blog | February 23, 2015

Catapult: Moving Beyond CPUs in the Cloud

Operating a datacenter at web scale requires managing many conflicting requirements. The ability to deliver computation at a high level and speed is a given, but because of the demands such a facility must meet, a datacenter also needs flexibility. Additionally, it must be efficient in its use of power, keeping costs as low as possible. Addressing often conflicting goals is a challenge, leading datacenter providers to seek constant performance and efficiency improvements and to evaluate the merits of general-purpose versus task-tuned alternatives—particularly in an era in which Moore’s Law is nearing an end, as some suggest.

Microsoft Research Blog | June 16, 2014