{"id":171431,"date":"2015-02-02T08:20:51","date_gmt":"2015-02-02T08:20:51","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-catapult\/"},"modified":"2021-12-06T21:07:49","modified_gmt":"2021-12-07T05:07:49","slug":"project-catapult","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-catapult\/","title":{"rendered":"Project Catapult"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2.jpg\" class=\"attachment-full size-full\" alt=\"Project Catapult cloud computing CPU\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2.jpg 1920w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2-768x288.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2-1024x384.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2015\/02\/ProjectCatapult_AI_Header_05_2018_1920x720_2-1600x600.jpg 1600w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card 
-->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 id=\"project-catapult\" class=\"h2\">Project Catapult<\/h1>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>Project Catapult is the code name for a Microsoft Research (MSR) enterprise-level initiative that is transforming cloud computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.<\/p>\n\n\n\n<h3 id=\"project-brainwave-leverages-project-catapult-to-enable-real-time-ai\" class=\"has-text-align-center\"><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-brainwave\/\">Project Brainwave<\/a> leverages Project Catapult to enable real-time AI<\/h3>\n\n\n\n<h4 id=\"try-the-first-hardware-accelerated-model-opens-in-new-tab-released-may-7-2018\" class=\"has-text-align-center\">Try the <a href=\"https:\/\/aka.ms\/aml-real-time-ai\" target=\"_blank\" rel=\"noopener noreferrer\">first hardware accelerated model<\/a> released May 7, 2018<\/h4>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link\">Try models<\/a><\/div>\n<\/div>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Azure Accelerated Machine Learning with Project Brainwave\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/DJfMobMjCX0?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" 
allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"project-catapult-is-transforming-cloud-computing\">Project Catapult is transforming cloud computing<\/h3>\n\n\n\n<p>We are living in an era where information grows exponentially and creates the need for massive computing power to process that information. At the same time, advances in silicon fabrication technology are approaching theoretical limits, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/en.wikipedia.org\/wiki\/Moore%27s_law\" target=\"_blank\" rel=\"noopener noreferrer\">Moore\u2019s Law<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> has run its course. Chip performance improvements no longer keep pace with the needs of cutting-edge, computationally expensive workloads like software-defined networking (SDN) and artificial intelligence (AI). 
To create a faster, more intelligent cloud that keeps up with growing appetites for computing power, datacenters need to add other processors distinctly suited for critical workloads.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"FPGAs in Microsoft's Intelligent Cloud\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/Oi0XLs9t4-8?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-1024x576.jpg\" alt=\"a close up of a computer keyboard\" class=\"wp-image-484704\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1.jpg 1066w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0006_Space-Chip-1-343x193.jpg 343w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"fpgas-offer-a-unique-combination-of-speed-and-flexibility\">FPGAs offer a unique combination of speed and flexibility<\/h3>\n\n\n\n<p>Since the earliest days of cloud computing, we have answered the need for more computing power by innovating with special processors that give CPUs a boost. Project Catapult began in 2010 when a small team, led by Doug Burger and Derek Chiou, anticipated the paradigm shift to post-CPU technologies. We began exploring alternative architectures and specialized hardware such as graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and custom application-specific integrated circuits (ASICs). We soon realized that FPGAs offer a unique combination of speed, programmability, and flexibility ideal for delivering cutting-edge performance and keeping pace with rapid innovation. Though FPGAs have been in use for decades, Microsoft Research (MSR) pioneered their use in cloud computing. MSR proved that FPGAs could deliver efficiency and performance without the cost, complexity, and risk of developing custom ASICs.<\/p>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"fpga-can-perform-line-rate-computation\">FPGA can perform line-rate computation<\/h3>\n\n\n\n<p>Project Catapult\u2019s innovative board-level architecture is highly flexible. 
The FPGA can act as a local compute accelerator, an inline processor, or a remote accelerator for distributed computing. In this design, the FPGA sits between the datacenter\u2019s top-of-rack (ToR) network switches and the server\u2019s network interface chip (NIC). As a result, all network traffic is routed through the FPGA, which can perform line-rate computation on even high-bandwidth network flows.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-1024x576.jpg\" alt=\"FPGA flare representation\" class=\"wp-image-484701\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0005_FPGA-Flare-3-343x193.jpg 343w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" 
decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-1024x576.jpg\" alt=\"visual representation of Microsoft's unique distributed architecture, which creates an interconnected and configurable compute layer that extends the CPU compute layer\" class=\"wp-image-484716\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0002_Eyeball-Labels-343x193.jpg 343w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"the-first-hyperscale-supercomputer\">The first hyperscale supercomputer<\/h3>\n\n\n\n<p>Today, nearly every new server in Microsoft datacenters integrates an FPGA into a unique distributed architecture, which creates an interconnected and configurable compute layer that extends the CPU compute layer. Using this acceleration fabric, we can deploy distributed hardware microservices (HWMS) with the flexibility to harness a scalable number of FPGAs\u2014from one to thousands. Conversely, cloud-scale applications can leverage a scalable number of these microservices, with no knowledge of the underlying hardware. 
By coupling this approach with nearly a million Intel FPGAs deployed in our datacenters, we have built the world\u2019s first hyperscale supercomputer, which can compute machine learning and deep learning algorithms with an unmatched combination of speed, efficiency, and scale.<\/p>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"leading-datacenter-transformation-by-using-programmable-hardware\">Leading datacenter transformation by using programmable hardware<\/h3>\n\n\n\n<p>Through Project Catapult, Microsoft is leading the industry\u2019s datacenter transformation by using programmable hardware. We were the first to prove the value of FPGAs for cloud computing, first to deploy them at cloud scale, and, with Bing, first to use them to accelerate enterprise-level applications.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-1024x576.jpg\" alt=\"eyeball sphere graphic\" class=\"wp-image-484713\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere.jpg 1066w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0001_Eyeball-Sphere-343x193.jpg 343w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<div class=\"wp-block-columns is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-1024x576.jpg\" alt=\"Bing Ranking throughput increased by 50% in 2015\" class=\"wp-image-484710\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/FPGA_History__0000_2015-Chip-343x193.jpg 343w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-layout-flow wp-block-column-is-layout-flow\">\n<h3 id=\"project-brainwave-to-enable-real-time-ai\">Project Brainwave to enable real-time AI<\/h3>\n\n\n\n<p>Our leadership in accelerated networking has delivered the 
world\u2019s fastest cloud network. Today, Project Brainwave is leveraging Project Catapult to enable real-time AI, with blazing-fast inferencing performance at a remarkably affordable cost. A growing team of MSR researchers and engineers, in very close partnership with engineering groups such as Bing, Azure Machine Learning, Azure Networking, Azure Cloud Server Infrastructure (CSI), and Azure Storage, continues to push the boundaries of accelerated cloud computing.<\/p>\n<\/div>\n<\/div>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2 id=\"milestones\">Milestones<\/h2>\n\n\n\n<h3 id=\"project-catapults-waves-of-innovation-will-continue\">Project Catapult\u2019s waves of innovation will continue.<\/h3>\n\n\n\t<div class=\"wp-block-msr-block-journey journey journey--numeric alignwide\" data-bi-aN=\"block-journey\">\n\t\t<ol class=\"journey__list\">\n\t\t\t\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2010\" class=\"moment__title\">2010<\/h3>\n\n\n\n<p>MSR demonstrated the first proof of concept to Bing leadership, with a proposal to use FPGAs at scale to accelerate Web search.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2011\" class=\"moment__title\">2011<\/h3>\n\n\n\n<p>MSR researchers and 
Bing engineers developed the first prototype, identifying and accelerating computationally expensive operations in Bing\u2019s IndexServe engine.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2012\" class=\"moment__title\">2012<\/h3>\n\n\n\n<p>Project Catapult\u2019s scale pilot of 1,632 FPGA-enabled servers was deployed to a datacenter, using an early architecture with a custom secondary network.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2013\" class=\"moment__title\">2013<\/h3>\n\n\n\n<p>Results of the pilot demonstrated a dramatic improvement in search latency, running Bing decision-tree algorithms 40 times faster than CPUs alone, and proved the potential to speed up search even while reducing the number of servers. 
Bing leadership committed to putting Project Catapult in production.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2014\" class=\"moment__title\">2014<\/h3>\n\n\n\n<p>The Catapult v2 architecture introduced the breakthrough of placing FPGAs as a \u201cbump in the wire\u201d on the network path. Work began on accelerating software-defined networking for Azure. Project Catapult\u2019s seminal paper was published.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2015\" class=\"moment__title\">2015<\/h3>\n\n\n\n<p>FPGA-enabled servers were deployed at scale in Bing and Azure datacenters, and Bing first used FPGAs in production to accelerate search ranking. 
This enabled a 50 percent increase in throughput, or a 25 percent reduction in latency.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2016\" class=\"moment__title\">2016<\/h3>\n\n\n\n<p>Azure launched Accelerated Networking, using FPGAs to enable the world\u2019s fastest cloud network. FPGAs became a default part of most Azure and Bing server SKUs. MSR began Project Brainwave, focused on accelerating AI and deep learning.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2017\" class=\"moment__title\">2017<\/h3>\n\n\n\n<p>MSR and Bing launched hardware microservices, enabling one web-scale service to leverage multiple FPGA-accelerated applications distributed across a datacenter. Bing deployed the first FPGA-accelerated Deep Neural Network (DNN). 
MSR demonstrated that FPGAs can enable real-time AI, beating GPUs in ultra-low latency, even without batching inference requests.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\n\t<li class=\"wp-block-msr-block-moment moment \" data-bi-aN=\"block-moment\">\n\t\t<div class=\"moment__dot moment__dot--start\" role=\"presentation\"><\/div>\n\t\t<div role=\"presentation\"><\/div>\n\t\t<div class=\"moment__details\">\n\t\t\t\t\t\t<div class=\"moment__counter\"><\/div>\n\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t<div class=\"moment__content\">\n\t\t\t\n\n<h3 id=\"2018\" class=\"moment__title\">2018<\/h3>\n\n\n\n<p>Bing and Azure deployed new multi-FPGA appliances into datacenters, shifting the ratio of computing power between CPUs and FPGAs, with multiple Intel Arria 10 FPGAs in each server. MSR, Bing, and Azure Machine Learning partnered to bring Project Brainwave to production for both Microsoft engineering groups and third-party customers. Azure Machine Learning launched the preview of Hardware Accelerated Models, powered by Project Brainwave, delivering ultra-fast DNN performance with ResNet-50, at remarkably low cost\u2014only 21 cents per million images during preview.<\/p>\n\n\t\t<\/div>\n\t\t<div class=\"moment__dot moment__dot--end\" role=\"presentation\"><\/div>\n\t<\/li>\n\t\n\t\t<\/ol>\n\t<\/div>\n\t\n\n\n<p>This is still the beginning. Project Brainwave is gaining traction across the company, with accelerated models in development for text, speech, vision, and other areas. 
The company-wide Project Catapult virtual team continues to innovate in deep learning, networking, storage, and other areas.<\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>Project Catapult is the code name for a Microsoft Research (MSR) enterprise-level initiative that is transforming cloud computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.<\/p>\n","protected":false},"featured_media":486438,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13555,13547],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-171431","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-search-information-retrieval","msr-research-area-systems-and-networking","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[492296,476496,476322,287882,293006,168381,167989,166465],"related-downloads":[],"related-videos":[190987,301223,428922,484395,484401,484350,484689,487865,494453,488903,494492,494501],"related-groups":[],"related-events":[419487],"related-opportunities":[],"related-posts":[509,4181,451146,421017,235461,484668,709066],"related-articles":[],"tab-content":[{"id":0,"name":"Additional news","content":"<a href=\"https:\/\/www.zdnet.com\/article\/at-microsoft-build-ai-azure-machine-learning-and-cosmos-db-go-big\/\" target=\"_blank\" rel=\"noopener\">I do so like AML and HAM<\/a>\r\n<em>ZDNet<\/em> | May 7, 2018\r\n\r\n<a href=\"https:\/\/www.cnet.com\/news\/microsoft-project-brainwave-speeds-ai-with-fpga-chips-on-azure-build-conference\/\" target=\"_blank\" rel=\"noopener\">Microsoft's Project Brainwave brings fast-chip smarts to AI at Build conference<\/a>\r\n<em>CNET<\/em>\u00a0| May 7, 2018\r\n\r\n<a 
href=\"https:\/\/www.datacenterknowledge.com\/microsoft\/why-microsoft-has-bet-fpgas-infuse-its-cloud-ai\">Why Microsoft Has Bet on FPGAs to Infuse Its Cloud With AI, by Mary Branscombe<\/a>\r\n<em>Data Center Knowledge<\/em> | April 25, 2018\r\n\r\n<a href=\"https:\/\/newsroom.intel.com\/editorials\/intel-fpgas-accelerating-artificial-intelligence-deep-learning-bing-intelligent-search\/\">Intel FPGAs Accelerate Artificial Intelligence for Deep Learning in Microsoft\u2019s Bing Intelligent Search<\/a>\r\n<em>Intel Newsroom<\/em> | March 26, 2018\r\n\r\n<a href=\"https:\/\/www.nytimes.com\/2017\/09\/16\/technology\/chips-off-the-old-block-computers-are-taking-design-cues-from-human-brains.html\">Chips Off the Old Block: Computers Are Taking Design Cues From Human Brains<\/a>\r\n<em>NY Times<\/em> | September 16, 2017\r\n\r\n<a href=\"https:\/\/www.forbes.com\/sites\/moorinsights\/2017\/08\/28\/microsoft-fpga-wins-versus-google-tpus-for-ai\/amp\/\">Microsoft FPGA Wins Versus Google TPUs for AI<\/a>\r\n<em>Forbes<\/em> | August 28, 2017\r\n\r\n<a href=\"http:\/\/fortune.com\/2017\/08\/23\/microsoft-project-brainwave-ai\/\">Microsoft is building its own AI Hardware with project Brainwave<\/a>\r\n<em>Fortune<\/em> | August 23, 2017\r\n\r\n<a href=\"https:\/\/www.geekwire.com\/2017\/microsofts-project-brainwave-puts-real-time-artificial-intelligence-high-tech-chips\/\">Microsoft's project Brainwave puts 'real-time artificial intelligence' into high-tech chips<\/a>\r\n<em>GeekWire<\/em> | August 22, 2017\r\n\r\n<a href=\"https:\/\/www.infoworld.com\/article\/3131602\/artificial-intelligence\/microsofts-configurable-cloud-satisfies-data-centers-need-for-speed.html\">Microsoft's Configurable Cloud satisfies datacenters' need for speed<\/a>\r\n<em>InfoWorld<\/em> | October 18, 2016\r\n\r\n<a href=\"http:\/\/www.fortune.com\/2016\/10\/17\/microsoft-fpga-chips-azure\/\">Why Microsoft Is Putting These Chips at the Center of Its Cloud<\/a>\r\n<em>Fortune<\/em> | October 17, 2016\r\n\r\n<a 
href=\"https:\/\/www.wired.com\/2016\/09\/microsoft-bets-future-chip-reprogram-fly\/\">Microsoft Bets its Future on a Reprogrammable Chip<\/a>\r\n<em>Wired<\/em> | Sept. 25, 2016\r\n\r\n<a href=\"http:\/\/www.theregister.co.uk\/2014\/06\/16\/microsoft_catapult_fpgas\/\">Microsoft \u2018Catapults\u2019 geriatric Moore\u2019s Law from certain death<\/a>\r\n<em>The Register<\/em> | June 16, 2014\r\n\r\n<a href=\"http:\/\/www.zdnet.com\/article\/microsoft-to-implement-catapult-programmable-processors-in-its-datacenters\/\">Microsoft to implement \u2018Catapult\u2019 programmable processors in its datacenters<\/a>\r\n<em>ZD Net<\/em> | June 16, 2014\r\n\r\n<a href=\"https:\/\/www.wired.com\/2014\/06\/microsoft-fpga\/\">Microsoft Supercharges Bing Search with Programmable Chips<\/a>\r\n<em>Wired<\/em> | June 14, 2014"}],"slides":[{"attachment_id":486441,"headline":"First hardware accelerated model powered by Project Brainwave","cta":"Try it now","url":"https:\/\/aka.ms\/aml-real-time-ai","cta_style":"","slideshow_type":"feature"}],"related-researchers":[{"type":"user_nicename","display_name":"Logan Adams","user_id":37503,"people_section":"Group 1","alias":"loadams"},{"type":"guest","display_name":"Hari Angepat","user_id":431040,"people_section":"Group 1","alias":""},{"type":"user_nicename","display_name":"Doug Burger","user_id":31582,"people_section":"Group 1","alias":"dburger"},{"type":"guest","display_name":"Derek  Chiou","user_id":375089,"people_section":"Group 1","alias":""},{"type":"user_nicename","display_name":"Daniel Firestone","user_id":35969,"people_section":"Group 1","alias":"fstone"},{"type":"user_nicename","display_name":"Mahdi Ghandi","user_id":37506,"people_section":"Group 1","alias":"maghandi"},{"type":"guest","display_name":"Matt Humphrey","user_id":431043,"people_section":"Group 1","alias":""},{"type":"user_nicename","display_name":"Sitaram Lanka","user_id":37485,"people_section":"Group 1","alias":"slanka"},{"type":"user_nicename","display_name":"Todd 
Massengill","user_id":34236,"people_section":"Group 1","alias":"toddma"},{"type":"user_nicename","display_name":"Andrew Putnam","user_id":31049,"people_section":"Group 1","alias":"anputnam"},{"type":"user_nicename","display_name":"Adam Sapek","user_id":37491,"people_section":"Group 1","alias":"adamsap"},{"type":"user_nicename","display_name":"Alex Wetmore","user_id":37515,"people_section":"Group 1","alias":"awetmore"},{"type":"user_nicename","display_name":"Phillip Yi Xiao","user_id":37509,"people_section":"Group 1","alias":"phxiao"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":24,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171431\/revisions"}],"predecessor-version":[{"id":802354,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/171431\/revisions\/802354"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/486438"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=171431"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=171431"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=171431"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=171431"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/resea
rch\/wp-json\/wp\/v2\/msr-pillar?post=171431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}