{"id":691449,"date":"2020-09-11T18:52:35","date_gmt":"2020-09-12T01:52:35","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=691449"},"modified":"2023-11-13T23:10:33","modified_gmt":"2023-11-14T07:10:33","slug":"enhancing-the-interoperability-between-deep-learning-frameworks-by-model-conversion","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/enhancing-the-interoperability-between-deep-learning-frameworks-by-model-conversion\/","title":{"rendered":"Enhancing the Interoperability between Deep Learning Frameworks by Model Conversion"},"content":{"rendered":"<p>Deep learning (DL) has become one of the most successful machine learning techniques. To achieve the best development results, there is a growing need for interoperability between DL frameworks, so that trained model files and training\/serving programs can be reused. Faithful model conversion is a promising technique for enhancing framework interoperability: a source model is transformed into a semantically equivalent model in the format of another target framework. However, several major challenges need to be addressed. First, there are apparent discrepancies between DL frameworks. Second, understanding the semantics of a source model can be difficult due to the framework scheme and optimization. Lastly, there are a large number of DL frameworks, which may require significant engineering effort.<\/p>\n<p>In this paper, we propose MMdnn, an open-source, comprehensive, and faithful model conversion tool for popular DL frameworks. MMdnn adopts a novel unified intermediate representation (IR)-based methodology to systematically handle the conversion challenges. The source model is first transformed into an intermediate computation graph represented by the simple graph-based IR of MMdnn and then into the target framework format, which greatly reduces the engineering complexity. 
Since the model structure expressed by developers may have been changed by DL frameworks (e.g., by graph optimization), MMdnn tries to recover the original high-level neural network layers for better semantic comprehension via a method similar to pattern matching. At the same time, a piece of model construction code is generated to facilitate later retraining or serving. MMdnn implements an extensible conversion architecture from the compilation point of view, which eases community contributions that add support for new DL operators and frameworks. MMdnn has reached good maturity and quality, and has been applied to convert production models.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Deep learning (DL) has become one of the most successful machine learning techniques. To achieve the optimal development result, there are emerging requirements on the interoperability between DL frameworks that the trained model files and training\/serving programs can be re-utilized. Faithful model conversion is a promising technology to enhance the framework interoperability in which a [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"ACM","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"1320","msr_page_range_end":"1330","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"ESEC\/FSE 
2020","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2020-11-7","msr_highlight_text":"","msr_notes":"The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Industry Track","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"https:\/\/2020.esec-fse.org","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13560,13547],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[248506,246694,246673,246691,246658,251863,253579,248230,249472,248851],"msr-conference":[259201],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-691449","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-programming-languages-software-engineering","msr-research-area-systems-and-networking","msr-locale-en_us","msr-field-of-study-architecture","msr-field-of-study-artificial-intelligence","msr-field-of-study-artificial-neural-network","msr-field-of-study-computer-science","msr-field-of-study-deep-learning","msr-field-of-study-extensibility","msr-field-of-study-interoperability","msr-field-of-study-operator-computer-programming","msr-fiel
d-of-study-pattern-matching","msr-field-of-study-semantics"],"msr_publishername":"ACM","msr_edition":"","msr_affiliation":"","msr_published_date":"2020-11-7","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"The 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Industry Track","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/11\/mmdnn.pdf","id":"764665","title":"mmdnn-3","label_id":"243103","label":0},{"type":"doi","viewUrl":"false","id":"false","title":"10.1145\/3368089.3417051","label_id":"243106","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Yu Liu","user_id":0,"rest_url":false},{"type":"text","value":"Cheng Chen","user_id":0,"rest_url":false},{"type":"text","value":"Ru Zhang","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Tingting Qin","user_id":34062,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Tingting Qin"},{"type":"text","value":"Xiang Ji","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Haoxiang 
Lin","user_id":31972,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Haoxiang Lin"},{"type":"user_nicename","value":"Mao Yang","user_id":32798,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Mao Yang"}],"msr_impact_theme":[],"msr_research_lab":[199560],"msr_event":[],"msr_group":[510017],"msr_project":[809443],"publication":[],"video":[],"msr-tool":[464916],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":809443,"post_title":"AI Tooling and MLOps","post_name":"ai-tooling-and-mlops","post_type":"msr-project","post_date":"2022-01-06 18:35:35","post_modified":"2023-08-12 19:16:12","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/ai-tooling-and-mlops\/","post_excerpt":"In recent years, artificial intelligence (AI), including machine learning (ML) and deep learning (DL), has been widely adopted in many application domains, such as computer vision, speech recognition, natural language processing, and gaming. However, developers currently rely on traditional paradigms for AI development and operation, which causes significant job failures, runtime performance degradation, information breach, etc. and slows down development productivity severely. 
We adopt technologies from the areas of Systems, Programming Languages, and Software Engineering&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/809443"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/691449","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":6,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/691449\/revisions"}],"predecessor-version":[{"id":734644,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/691449\/revisions\/734644"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=691449"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=691449"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=691449"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=691449"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=691449"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=691449"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=691449"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-u
s\/research\/wp-json\/wp\/v2\/msr-post-option?post=691449"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=691449"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=691449"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=691449"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=691449"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=691449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}