{"id":509,"date":"2014-06-16T15:12:01","date_gmt":"2014-06-16T15:12:01","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/inside_microsoft_research\/2014\/06\/16\/catapult-moving-beyond-cpus-in-the-cloud\/"},"modified":"2016-07-20T07:30:01","modified_gmt":"2016-07-20T14:30:01","slug":"catapult-moving-beyond-cpus-in-the-cloud","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/catapult-moving-beyond-cpus-in-the-cloud\/","title":{"rendered":"Catapult: Moving Beyond CPUs in the Cloud"},"content":{"rendered":"<p class=\"posted-by\">Posted by <span class=\"author\">Rob Knies<\/span><\/p>\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/Catapult.jpg\"><img decoding=\"async\" style=\"float: left; margin: 10px;\" title=\"Field-programmable gate array\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/Catapult.jpg\" alt=\"Field-programmable gate array\" width=\"300\" \/><span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/p>\n<p>Operating a datacenter at web scale requires managing many conflicting requirements. The ability to deliver computation at a high level and speed is a given, but because of the demands such a facility must meet, a datacenter also needs flexibility. 
Additionally, it must be efficient in its use of power, keeping costs as low as possible.<\/p>\n<p>Addressing often conflicting goals is a challenge, leading datacenter providers to seek constant performance and efficiency improvements and to evaluate the merits of general-purpose versus task-tuned alternatives\u2014particularly in an era in which Moore\u2019s Law is nearing an end, as some suggest.<\/p>\n<p>Microsoft researchers and colleagues from <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Bing\" href=\"http:\/\/www.bing.com\/\" target=\"_blank\">Bing<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> have been collaborating with others from industry and academia to examine datacenter hardware alternatives, and their work, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"a project known as Catapult\" href=\"http:\/\/www.theregister.co.uk\/2014\/06\/16\/microsoft_catapult_fpgas\/\" target=\"_blank\">a project known as Catapult<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, was presented in Minneapolis on June 16 during the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"41st International Symposium on Computer Architecture\" href=\"http:\/\/cag.engr.uconn.edu\/isca2014\/\" target=\"_blank\">41st International Symposium on Computer Architecture<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (ISCA).<\/p>\n<p>Their paper, titled 
<em><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services\" href=\"http:\/\/research.microsoft.com\/apps\/pubs\/default.aspx?id=212001\" target=\"_blank\">A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/em>, describes an effort to combine programmable hardware and software that uses <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"field-programmable gate arrays\" href=\"http:\/\/en.wikipedia.org\/wiki\/Field-programmable_gate_array\" target=\"_blank\">field-programmable gate arrays<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (FPGAs) to deliver performance improvements of as much as 95 percent.<\/p>\n<p>The significance of this work, says <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Peter Lee\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/petelee\/default.aspx\" target=\"_blank\">Peter Lee<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, head of Microsoft Research, could be dramatic.<\/p>\n<p><iframe loading=\"lazy\" src=\"http:\/\/research.microsoft.com\/apps\/video\/ifVideo.aspx?id=219544\" width=\"300\" height=\"150\"><\/iframe><\/p>\n<p>\u201cGoing into production with this new technology will be a watershed moment for Bing search,\u201d he says. 
\u201cFor the first time ever, the quality of Bing\u2019s page ranking will be driven not only by great algorithms but also by hardware\u2014incredibly advanced hardware that can be made more highly specialized than anything ever seen before at datacenter scale.\u201d<\/p>\n<p>Microsoft researcher <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"Doug Burger\" href=\"http:\/\/research.microsoft.com\/en-us\/people\/dburger\/\" target=\"_blank\">Doug Burger<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, one of 23 co-authors of the ISCA paper, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" title=\"explains the motivation behind this project\" href=\"http:\/\/research.microsoft.com\/apps\/video\/default.aspx?id=219486\" target=\"_blank\">explains the motivation behind this project<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>\u201cWe are addressing two problems,\u201d he says. \u201cFirst, how do we keep accelerating services and reducing costs in the cloud as the performance gains from CPUs continue to flatten?<\/p>\n<p>\u201cSecond, we wanted to enable Bing to run computations at a scale that was not possible in software alone, for much better results at lower cost.\u201d<\/p>\n<p><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/Catapult-team.jpg\"><img decoding=\"async\" style=\"float: right; margin: 10px;\" title=\"Members of the Project Catapult team: Front row (from left), Joo-Young Kim, Stephen Heil, Derek Chiou, Sitaram Lanka, Andrew Putnam, and Eric Chung. 
Back row (from left), Eric Peterson, Scott Hauck, Aaron Smith, Jan Gray, Adrian Caulfield, Phillip Yi Xiao, Michael Haselman, and Doug Burger.\" src=\"https:\/\/msdnshared.blob.core.windows.net\/media\/TNBlogsFS\/prod.evol.blogs.technet.com\/CommunityServer.Blogs.Components.WeblogFiles\/00\/00\/00\/90\/35\/Catapult-team.jpg\" alt=\"Members of the Project Catapult team\" width=\"400\" \/><span class=\"sr-only\"> (opens in new tab)<\/span><\/a>Derek Chiou, a Bing hardware architect, discusses the benefits of the collaboration.<\/p>\n<p>\u201cThe partnership between Doug and his team at Microsoft Research and Bing has been fantastic and has resulted in significant results that will have real impact on Bing,\u201d Chiou says. \u201cThe factor of two throughput improvement demonstrated in the pilot means we can do the same amount of work with half the number of servers or double the amount of work with the same number of servers\u2014or some mix of the two.<\/p>\n<p>\u201cThose kinds of numbers are especially significant at the scale of a datacenter. The potential benefits go beyond simple dollars. To give some examples, Bing\u2019s ranking could be further enhanced to provide an even better customer experience, power could be saved, and the size of the datacenters could be reduced. The strength of the pilot results have led to Bing deploying this technology in one datacenter for customers, starting in early 2015.\u201d<\/p>\n<p>As the ISCA paper notes, FPGAs have become powerful computing devices in recent years, making them particularly suited for use as fine-grained accelerators.<\/p>\n<p>\u201cWe designed a platform that permits the software in the cloud, which is inherently programmable, to partner with programmable hardware,\u201d Burger says. 
\u201cYou can move functions into custom hardware, but rather than burning them into fixed chips [application-specific integrated circuits], we map them to Altera FPGAs, which can run hardware designs but can be changed by reconfiguring the FPGA.<\/p>\n<p>\u201cWe\u2019ve demonstrated a \u2018programmable hardware\u2019 enhanced cloud, running smoothly and reliably at large scale.\u201d<\/p>\n<p>In the evaluation deployment outlined in the paper, the reconfigurable fabric\u2014interconnected nodes linked by high-bandwidth connections\u2014was tested on a collection of 1,632 servers to measure its efficacy in accelerating the workload of a production web-search service. The results were impressive: a 95 percent improvement in throughput at a latency comparable to a software-only solution. With power consumption and total per-server cost rising by less than 30 percent, the net result is substantial savings and efficiency gains.<\/p>\n<p>The results demonstrated the project\u2019s capability to run stably for long periods, and all the stages in the pipeline exceeded the overall throughput goal. In addition, a failure-handling service quickly reconfigures the fabric after errors or machine failures.<\/p>\n<p>The ISCA paper concludes by underscoring the belief that distributed reconfigurable fabrics will play a critical role as gains in server performance level off. Such techniques could become indispensable to datacenter managers balancing their conflicting goals.<\/p>\n<p>\u201cThis portends a future where systems are specialized dynamically by compiling a good chunk of demanding workloads into hardware,\u201d Burger says. 
\u201cI would imagine that a decade hence, it will be common to compile applications into a mix of programmable hardware and programmable software.<\/p>\n<p>\u201cThis is a radical shift that will offer continued performance improvements past the end of Moore\u2019s Law as we move more and more of our applications and services into hardware.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Posted by Rob Knies Operating a datacenter at web scale requires managing many conflicting requirements. The ability to deliver computation at a high level and speed is a given, but because of the demands such a facility must meet, a datacenter also needs flexibility. Additionally, it must be efficient in its use of power, keeping [&hellip;]<\/p>\n","protected":false},"author":30766,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[],"msr_hide_image_in_river":0,"footnotes":""},"categories":[194470,194463],"tags":[200189,186604,194924,201063,201231,195319,195383,195583,186412,202067,202113,196808,203507,203753],"research-area":[13547],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-509","post","type-post","status-publish","format-standard","hentry","category-computer-architecture","category-systems","tag-a-reconfigurable-fabric-for-accelerating-large-scale-datacenter-services","tag-bing","tag-catapult","tag-computer-systems-and-networking","tag-datacenter","tag-derek-chiou","tag-doug-burger","tag-field-programmable-gate-array","tag-fpga","tag-international-symposium-on-computer-architecture","tag-isca","tag-peter-lee","tag-reconfigurable-fabric","tag-server","msr-research-area-systems-and-networking","msr-locale-en
_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[171431],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"June 16, 2014","formattedExcerpt":"Posted by Rob Knies Operating a datacenter at web scale requires managing many conflicting requirements. The ability to deliver computation at a high level and speed is a given, but because of the demands such a facility must meet, a datacenter also needs flexibility. Additionally,&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/509","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/30766"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=509"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/509\/revisions"}],"predecessor-version":[{"id":235607,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/509\/revisions\/235607"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=509"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=509"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=509"},{
"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=509"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=509"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=509"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=509"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=509"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=509"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=509"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=509"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}