{"id":555282,"date":"2018-12-04T18:10:52","date_gmt":"2018-12-05T02:10:52","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=555282"},"modified":"2023-07-10T03:41:13","modified_gmt":"2023-07-10T10:41:13","slug":"deep-learning-compiler-and-optimizer","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/deep-learning-compiler-and-optimizer\/","title":{"rendered":"Deep Learning Compiler and Optimizer"},"content":{"rendered":"<h3>Project Overview<\/h3>\n<p><span style=\"font-size: 1rem\">This project aims to build a deep learning compiler and optimizer infrastructure that can provide automatic scalability and efficiency optimization for distributed and local execution.\u00a0 Overall, this stack covers two types of general optimizations: fast distributed training over large-scale servers and efficient local execution on various hardware devices.\u00a0 Currently, our optimizations focus on many parts of the system stack, including fast distributed training over RDMA, automatic computation placement across devices, automatic operator batching and kernel fusion, tensor algebra compilation, and sparsity and quantization optimizations.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-831568 aligncenter\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-893x1024.png\" alt=\"graphical user interface, application\" width=\"455\" height=\"522\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-893x1024.png 893w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-262x300.png 262w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-768x881.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-1339x1536.png 1339w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-1786x2048.png 1786w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1-157x180.png 157w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/12\/Picture1.png 1864w\" sizes=\"auto, (max-width: 455px) 100vw, 455px\" \/><\/p>\n<h3>Open-source Release<\/h3>\n<p>Some of our projects have been open-sourced; you are welcome to try them, contribute, and collaborate with us.<\/p>\n<ul>\n<li><strong>NNFusion: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/nnfusion\">https:\/\/github.com\/microsoft\/nnfusion<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/strong>\n<ul>\n<li>A flexible and efficient DNN compiler that generates high-performance executables from a DNN model description (e.g., TensorFlow frozen models or the ONNX format).<\/li>\n<\/ul>\n<\/li>\n<li><strong>Antares: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/antares\">https:\/\/github.com\/microsoft\/antares<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/strong>\n<ul>\n<li>An automatic engine for multi-platform kernel generation and optimization.<\/li>\n<\/ul>\n<\/li>\n<li>And more to come&#8230;<\/li>\n<\/ul>\n<h3>Job Opportunities<\/h3>\n<ul>\n<li>Research Intern [<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.msra.cn\/zh-cn\/jobs\/interns\/intelligent-cloud-and-edge-group-research-intern?language=chinese\">Link<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>]<\/li>\n<li>FTE [<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.msra.cn\/zh-cn\/jobs?language=chinese&job-type=full-time-job\">Link<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>]<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Project Overview This project aims to build a deep learning compiler and optimizer infrastructure that can provide automatic scalability and efficiency optimization for distributed and local execution.\u00a0 Overall, this stack covers two types of general optimizations: fast distributed training over large-scale servers and efficient local execution on various hardware devices.\u00a0 Currently, our optimizations focus on [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13547],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-555282","msr-project","type-msr-project","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[581590,428202,596290,700210,831574,858267,941055,954468,954474],"related-downloads":[],"related-videos":[504914],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Youshan Miao","user_id":35038,"people_section":"Section name 1","alias":"yomia"},{"type":"user_nicename","display_name":"Wei Cui","user_id":38859,"people_section":"Section name 1","alias":"weicu"},{"type":"user_nicename","display_name":"Fan Yang","user_id":31782,"people_section":"Section name 1","alias":"fanyang"},{"type":"user_nicename","display_name":"Lidong Zhou","user_id":32673,"people_section":"Section name 
1","alias":"lidongz"}],"msr_research_lab":[199560],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/555282","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":13,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/555282\/revisions"}],"predecessor-version":[{"id":555585,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/555282\/revisions\/555585"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=555282"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=555282"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=555282"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=555282"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=555282"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}