{"id":323543,"date":"2017-05-02T03:55:00","date_gmt":"2017-05-02T10:55:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=323543"},"modified":"2022-03-03T01:28:05","modified_gmt":"2022-03-03T09:28:05","slug":"deep-program-understanding","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/deep-program-understanding\/","title":{"rendered":"Deep Program Understanding"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background-catalina-blue card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1044\" height=\"450\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4.jpg\" class=\"attachment-full size-full\" alt=\"\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4.jpg 1044w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4-300x129.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4-1024x441.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4-768x331.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/05\/deep-program4-240x103.jpg 240w\" sizes=\"auto, (max-width: 1044px) 100vw, 1044px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"h2 wp-block-heading\" 
id=\"deep-program-understanding\">Deep Program Understanding<\/h1>\n\n\n\n<p>Teaching machines to understand complex algorithms<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>The Deep Program Understanding project aims to teach machines to understand complex algorithms, combining methods from the programming languages, software engineering and the machine learning communities.<\/p>\n\n\n\n<p>We have open-sourced many of our implementations, including utilities and project-specific sample code. See our <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/deep-program-understanding\/publications\/\">Publications<\/a> and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/deep-program-understanding\/downloads\/\">Downloads<\/a> tabs for more details.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"learning-to-understand-programs\">Learning to understand programs<\/h3>\n\n\n\n<p>Building \u201csmart\u201d software engineering tools requires learning to analyse and understand existing code and related artefacts such as documentation and online resources (e.g., Stack Overflow). One of our primary concerns is the integration of standard static analysis methods with machine learning methods to create learning-based program analyses that can be used within software engineering tools. 
Such tools can then be used to find bugs, automatically retrieve or produce relevant documentation, or verify programs.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"highlighted-publications\">Highlighted publications<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/self-supervised-bug-detection-and-repair-2\/\">Self-Supervised Bug Detection and Repair<\/a> (NeurIPS&#8217;21) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/neurips21-self-supervised-bug-detection-and-repair\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/typilus-neural-type-hints\/\">Typilus: Neural Type Hints<\/a> (PLDI&#8217;20)<\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-to-represent-edits\/\">Learning to Represent Edits<\/a> (ICLR&#8217;19) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/msrc-dpu-learning-to-represent-edits\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-represent-programs-graphs\/\">Learning to Represent Programs with Graphs<\/a> (ICLR&#8217;18) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/graph-based-code-modelling\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/a-survey-of-machine-learning-for-big-code-and-naturalness\/\">A Survey of Machine Learning for Big 
Code and Naturalness<\/a> (<em>ACM Computing Surveys 2018<\/em>)<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"learning-to-generate-programs\">Learning to generate programs<\/h3>\n\n\n\n<p>A core problem of machine learning is to learn algorithms that explain observed behaviour. This can take several forms, such as program synthesis from examples, in which an interpretable program matching given input\/output pairs has to be produced; or alternatively programming by demonstration, in which a system has to learn to mimic sequences of actions.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"highlighted-publications\">Highlighted publications<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-to-complete-code-with-sketches\/\">Learning to Complete Code with Sketches<\/a> (ICLR&#8217;22)<\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/fast-and-memory-efficient-neural-code-completion\/\">Fast and Memory-Efficient Neural Code Completion<\/a> (MSR&#8217;20)<\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/generative-code-modeling-with-graphs\/\">Generative Code Modeling with Graphs<\/a> (ICLR&#8217;19) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/graph-based-code-modelling\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a style=\"font-size: 18px\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/deepcoder-learning-write-programs\/\">DeepCoder: Learning to Write Programs<\/a> (ICLR&#8217;17) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/DeepCoder-Utils\">Code on GitHub<span class=\"sr-only\"> (opens in new 
tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/terpret-probabilistic-programming-language-program-induction\/\">TerpreT: A Probabilistic Programming Language for Program Induction<\/a> (Tech Report, 2016) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/51alg\/TerpreT\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/bimodal-modelling-of-source-code-and-natural-language\/\">Bimodal Modelling of Source Code and Natural Language<\/a> (ICML&#8217;15)<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"advancing-the-machine-learning-frontier\">Advancing the machine learning frontier<\/h3>\n\n\n\n<p>Structured data such as programs represent a challenge for machine learning methods. The combination of domain constraints, known semantics and complex structure requires new machine learning methods and techniques. 
Our focus in this area is the analysis and generation of graphs, for which we have developed novel neural network architectures and generative procedures.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"highlighted-publications\">Highlighted publications<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/heat-hyperedge-attention-networks\/\">HEAT: Hyperedge Attention Networks<\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/constrained-graph-variational-autoencoders-for-molecule-design\/\">Constrained Graph Variational Autoencoders for Molecule Design<\/a> (NeurIPS&#8217;18) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/constrained-graph-variational-autoencoder\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/graph-partition-neural-networks-semi-supervised-classification\/\" target=\"_blank\" rel=\"noreferrer noopener\">Graph Partition Neural Networks for Semi-Supervised Classification<\/a> (ICLR&#8217;18 Workshop) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/graph-partition-neural-network-samples\">Code on GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li><li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/gated-graph-sequence-neural-networks\/\">Gated Graph Sequence Neural Networks<\/a> (ICLR&#8217;16) | <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/gated-graph-neural-network-samples\">Code on GitHub<span class=\"sr-only\"> (opens in 
new tab)<\/span><\/a><\/li><\/ul>\n\n\n\n\n\n<p>We have open-sourced many of our implementations.<\/p>\n<h4>Libraries<\/h4>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/dpu-utils\">dpu-utils<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: Useful Python utilities for projects on deep program understanding.<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/gated-graph-neural-network-samples\">gated-graph-neural-networks<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>:\u00a0A set of efficient TensorFlow implementations of graph neural networks that can handle large and sparse graphs.<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/tf2-gnn\">tf2-gnn<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: TensorFlow 2 library implementing Graph Neural Networks<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/ptgnn\">ptgnn<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: A PyTorch Graph Neural Network Library<\/li>\n<\/ul>\n<h4>Project-Specific Utilities<\/h4>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/constrained-graph-variational-autoencoder\">constrained-graph-variational-autoencoders<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: Code for constrained graph VAEs.<\/li>\n<li><a class=\"msr-external-link glyph-append 
glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/DeepCoder-Utils\">DeepCoder-Utils<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: Code used in the experiments of the DeepCoder paper (ICLR 2017)<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/graph-partition-neural-network-samples\">graph-partition-neural-network-samples<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: Sample code for Graph Partition Neural Networks.<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/msrc-dpu-learning-to-represent-edits\">dpu-learning-to-represent-edits<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: C# data extraction for &#8220;Learning to Represent Edits&#8221;<\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/Microsoft\/graph-based-code-modelling\">graph-based-code-modelling<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>: The code for the ICLR&#8217;18 and ICLR&#8217;19 papers<\/li>\n<\/ul>\n\n\n","protected":false},"excerpt":{"rendered":"<p>The Deep Program Understanding project aims to teach machines to understand complex algorithms, combining methods from the programming languages, software engineering and the machine learning 
communities.<\/p>\n","protected":false},"featured_media":823894,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556,13560],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-323543","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-programming-languages-software-engineering","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[442605,442575,442587,442596,742480,742471,742486,742492,797443,823021,823042,823063,823078,823087,823099,823105,823117,823867,823873,372188,167493,215044,325712,326123,327005,369701,369707,369713,166142,393785,543570,574716,605025,606108,742453,742462,742468],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[551157,482808,582940],"related-articles":[],"tab-content":[{"id":0,"name":"Relevant Software","content":"We have open-sourced many of our implementations.\r\n<h4>Libraries<\/h4>\r\n<ul>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/dpu-utils\">dpu-utils<\/a>: Useful Python utilities for projects on deep program understanding.<\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/gated-graph-neural-network-samples\">gated-graph-neural-networks<\/a>:\u00a0A set of efficient TensorFlow implementations of graph neural networks that can handle large and sparse graphs.<\/li>\r\n<\/ul>\r\n<h4>Project-Specific Utilities<\/h4>\r\n<ul>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/constrained-graph-variational-autoencoder\">constrained-graph-variational-autoencoders<\/a>: Code for constrained graph VAEs.<\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/DeepCoder-Utils\">DeepCoder-Utils<\/a>: Code used in the experiments 
of the DeepCoder paper (ICLR 2017)<\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/graph-partition-neural-network-samples\">graph-partition-neural-network-samples<\/a>: Sample code for Graph Partition Neural Networks.<\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/msrc-dpu-learning-to-represent-edits\">dpu-learning-to-represent-edits<\/a>: C# data extraction for \"Learning to Represent Edits\"<\/li>\r\n \t<li><a href=\"https:\/\/github.com\/Microsoft\/graph-based-code-modelling\">graph-based-code-modelling<\/a>: The code for the ICLR'18 and ICLR'19 papers<\/li>\r\n<\/ul>"}],"related-researchers":[],"msr_research_lab":[199561],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/323543","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":26,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/323543\/revisions"}],"predecessor-version":[{"id":919485,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/323543\/revisions\/919485"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/823894"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=323543"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=323543"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=323543"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=323543"
},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=323543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}