{"id":486102,"date":"2018-08-14T09:49:27","date_gmt":"2018-08-14T16:49:27","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=486102"},"modified":"2023-07-10T07:52:57","modified_gmt":"2023-07-10T14:52:57","slug":"project-brainwave","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/project-brainwave\/","title":{"rendered":"Project Brainwave"},"content":{"rendered":"<p>Project Brainwave is a deep learning platform for real-time AI inference in the cloud and on the edge. A soft Neural Processing Unit (NPU), based on a high-performance field-programmable gate array (FPGA), accelerates deep neural network (DNN) inferencing, with applications in computer vision and natural language processing.\u202fProject Brainwave is transforming computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.<\/p>\n<p>For example, this FPGA configuration achieved more than an order of magnitude improvement in latency and throughput on RNNs for Bing, with no batching. 
By delivering real-time AI at ultra-low latency without requiring batching, Project Brainwave reduces software overhead and complexity.<\/p>\n<p>Learn more about Project Brainwave on:<\/p>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/service\/how-to-deploy-fpga-web-service\" target=\"_blank\" rel=\"noopener noreferrer\">The cloud with Azure Machine Learning<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/databox-online\/data-box-edge-overview\" target=\"_blank\" rel=\"noopener noreferrer\">The edge with Azure DataBox Edge<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n<p>At Build 2019, Microsoft EVP Scott Guthrie talked about how Project Brainwave DNN inferencing can be used to keep supermarket shelves fully stocked:<\/p>\n<div style=\"width: 100%; height: 30px;\"><\/div>\n<p><iframe loading=\"lazy\" title=\"Microsoft Build 2019 - LIVE Stream - Day 1 (May 6)\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/d3LHo2yXKoY?feature=oembed&rel=0\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<div style=\"width: 100%; height: 30px;\"><\/div>\n<h2>Serves state-of-the-art, pre-trained DNN models<\/h2>\n<p>With a high-performance, precision-adaptable FPGA soft processor, Microsoft datacenters can serve pre-trained DNN models with high efficiency at low batch sizes. 
Because the processor is implemented on an FPGA, it can be reconfigured for continuous innovation and improvement, making the infrastructure future-proof.<\/p>\n<p>By exploiting FPGAs on a datacenter-scale compute fabric, a single DNN model can be deployed as a scalable hardware microservice that leverages multiple FPGAs to create web-scale services capable of processing massive amounts of data in real time.<\/p>\n<h2>Trifecta of high performance<\/h2>\n<p>To meet the growing computational demands of deep learning, cloud operators are turning toward specialized hardware for improved efficiency and performance, particularly for live data streams. Project Brainwave offers the trifecta of high-performance computing: low latency, high throughput, and high efficiency, while also offering the flexibility of field-programmability.<\/p>\n<p>Because it is based on an FPGA, it can keep pace with new discoveries and stay current with the requirements of rapidly changing AI algorithms.<\/p>\n<h2>Put it in action<\/h2>\n<p>See Project Brainwave on Intel FPGAs in action on Microsoft Azure and Azure DataBox Edge. The FPGAs in the cloud and edge support:<\/p>\n<ul>\n<li>Image classification and object detection scenarios<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/Azure\/MachineLearningNotebooks\/tree\/master\/how-to-use-azureml\/deployment\/accelerated-models\" target=\"_blank\" rel=\"noopener noreferrer\">Jupyter Notebooks<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> to quickly get started<\/li>\n<\/ul>\n<p>Using this FPGA-enabled hardware architecture, trained neural networks run quickly and at low latency. Azure can parallelize pre-trained deep neural networks (DNNs) across FPGAs on Azure Kubernetes Service (AKS) to scale out your service. The DNNs can be pre-trained as a deep featurizer for transfer learning or fine-tuned with updated weights. 
Find out more:\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/service\/concept-accelerate-with-fpgas\" target=\"_blank\" rel=\"noopener noreferrer\">https:\/\/docs.microsoft.com\/en-us\/azure\/machine-learning\/service\/concept-accelerate-with-fpgas<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Project Brainwave is a deep learning platform for real-time AI inference in the cloud and on the edge, transforming computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.<\/p>\n","protected":false},"featured_media":486441,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-486102","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[492296,476322],"related-downloads":[],"related-videos":[486935,484350,484689,486953,487109,487124,487865,488903,494474,494492,494501],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[421017,504269,709066,712588,810376],"related-articles":[],"tab-content":[],"slides":[{"attachment_id":486441,"headline":"First hardware accelerated model powered by Project Brainwave","cta":"Try it now","url":"https:\/\/aka.ms\/aml-real-time-ai","cta_style":"","slideshow_type":"feature"}],"related-researchers":[{"type":"user_nicename","display_name":"Logan Adams","user_id":37503,"people_section":"Section name 0","alias":"loadams"},{"type":"guest","display_name":"Hari 
Angepat","user_id":431040,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Doug Burger","user_id":31582,"people_section":"Section name 0","alias":"dburger"},{"type":"guest","display_name":"Derek  Chiou","user_id":375089,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Daniel Firestone","user_id":35969,"people_section":"Section name 0","alias":"fstone"},{"type":"user_nicename","display_name":"Mahdi Ghandi","user_id":37506,"people_section":"Section name 0","alias":"maghandi"},{"type":"guest","display_name":"Matt Humphrey","user_id":431043,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Sitaram Lanka","user_id":37485,"people_section":"Section name 0","alias":"slanka"},{"type":"user_nicename","display_name":"Todd Massengill","user_id":34236,"people_section":"Section name 0","alias":"toddma"},{"type":"user_nicename","display_name":"Andrew Putnam","user_id":31049,"people_section":"Section name 0","alias":"anputnam"},{"type":"user_nicename","display_name":"Adam Sapek","user_id":37491,"people_section":"Section name 0","alias":"adamsap"},{"type":"user_nicename","display_name":"Alex Wetmore","user_id":37515,"people_section":"Section name 0","alias":"awetmore"},{"type":"user_nicename","display_name":"Phillip Yi Xiao","user_id":37509,"people_section":"Section name 
0","alias":"phxiao"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486102","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":30,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486102\/revisions"}],"predecessor-version":[{"id":654093,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486102\/revisions\/654093"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/486441"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=486102"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=486102"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=486102"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=486102"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=486102"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}