{"id":840592,"date":"2022-05-24T09:17:07","date_gmt":"2022-05-24T16:17:07","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-blog-post&#038;p=840592"},"modified":"2022-08-16T13:48:03","modified_gmt":"2022-08-16T20:48:03","slug":"improve-edge-device-ai-efficiency","status":"publish","type":"msr-blog-post","link":"https:\/\/www.microsoft.com\/en-us\/research\/articles\/improve-edge-device-ai-efficiency\/","title":{"rendered":"Improve edge-device AI efficiency"},"content":{"rendered":"\n<div class=\"wp-block-media-text has-vertical-padding-none  alignwide has-media-on-the-right is-stacked-on-mobile is-vertically-aligned-top is-style-spectrum is-style-border is-style-offset-media--top\"><figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1.jpg\" alt=\"Urban innovation: farmer selling vegetables to a customer using a cellphone to pay\" class=\"wp-image-775270 size-full\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-655x368.jpg 655w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/09\/UrbanInnovation-economy-farmer-1400x788-1-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure><div class=\"wp-block-media-text__content\">\n<p>Machine learning models are increasingly running on edge hardware, such as mobile phones or Internet of Things (IoT) devices. Motivations include protection of private data and avoidance of networking latency, for example with applications that recognize speech. Ensuring efficient inference is especially important on battery-powered devices with constrained processor, memory and power budgets. Several approaches have proven fruitful.<\/p>\n<\/div><\/div>\n\n\n\n<p>In collaboration with NVIDIA, we\u2019ve developed efficient Neural Architecture Search (NAS) to find network architectures that will run efficiently on hardware with specific constraints, such as low power consumption for mobile devices. <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/hant-hardware-aware-network-transformation\/\" target=\"_blank\" rel=\"noreferrer noopener\">Hardware-Aware Network Transformation (HANT)<\/a> employs a two-level strategy to achieve this goal. First, knowledge distillation is used to train a library of efficient operators, once. HANT can then search this library quickly and repeatedly to generate energy-efficient, hardware-specific architectures. 
Because the search takes only minutes, this method can find high-performing architectures at a lower carbon cost than previous methods.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"418\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram-1024x418.png\" alt=\"HANT diagram\" class=\"wp-image-844096\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram-1024x418.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram-300x122.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram-768x313.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram-240x98.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/05\/HANT-diagram.png 1231w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Hardware-Aware Network Transformation (HANT)<\/figcaption><\/figure>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/asymo-scalable-and-efficient-deep-learning-inference-on-asymmetric-mobile-cpus\/\">AsyMo<\/a> takes a different approach that focuses on deep learning inference latency and energy efficiency on the asymmetric processors found in mobile phones. Mobile CPUs have multiple cores with different characteristics, such as larger cores intended for high performance and smaller cores for when energy conservation matters more than performance. AsyMo incorporates knowledge of the processor asymmetry and model architecture into the partitioning of neural network inference tasks to reduce inference latency. AsyMo also observes that high CPU clock speeds do not benefit (and can actually harm) models that are memory-bandwidth limited. 
Leveraging this insight, AsyMo intelligently sets the CPU clock speed based on the hardware and model architecture to improve energy efficiency. Depending on the deep learning framework and model evaluated, AsyMo achieved improvements of 46% or more for inference latency, and up to 37% for energy efficiency.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Machine learning models are increasingly running on edge hardware, such as mobile phones or Internet of Things (IoT) devices. Motivations include protection of private data and avoidance of networking latency, for example with applications that recognize speech. Ensuring efficient inference is especially important on battery-powered devices with constrained processor, memory and power budgets. Several approaches [&hellip;]<\/p>\n","protected":false},"author":40306,"featured_media":775270,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-content-parent":804847,"msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-locale":[268875],"msr-post-option":[],"class_list":["post-840592","msr-blog-post","type-msr-blog-post","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_assoc_parent":{"id":804847,"type":"project"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/840592","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-blog-post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/40306"}],"version-history":[{"count":11,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/840592\/revisions"}],"predecessor-version":[{"id":870183,"href
":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-blog-post\/840592\/revisions\/870183"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/775270"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=840592"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=840592"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=840592"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=840592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}