{"id":875721,"date":"2022-09-07T02:13:28","date_gmt":"2022-09-07T09:13:28","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=875721"},"modified":"2022-09-07T02:13:32","modified_gmt":"2022-09-07T09:13:32","slug":"code-intelligence","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/code-intelligence\/","title":{"rendered":"Code Intelligence"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"613\" height=\"414\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/content.jpg\" class=\"attachment-full size-full\" alt=\"AI for Code\" style=\"object-position: 67% 67%\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/content.jpg 613w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/content-300x203.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/content-240x162.jpg 240w\" sizes=\"auto, (max-width: 613px) 100vw, 613px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h2 id=\"code-intelligence\">Code Intelligence<\/h2>\n\n\n\n<p>Apply AI techniques for software engineering<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>The Code Intelligence project aims to leverage AI techniques to help 
software developers improve the productivity of the development process. We focus on building large-scale pre-trained models to understand and generate source code. The research directions include pre-trained models for code, benchmark datasets, code completion, code retrieval, code review, etc. More AI-assisted products, developed in collaboration with DevDiv, GitHub, and LinkedIn, will be released to empower software developers around the world.<\/p>\n\n\n\n<p><strong>What have we done?<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>We have proposed several pre-trained models for source code, including CodeBERT, GraphCodeBERT, and UniXcoder.<ol><li>CodeBERT is the first bimodal pre-trained model for programming language and natural language.<\/li><li>GraphCodeBERT, based on CodeBERT, leverages a semantic-level structure of code, i.e., data flow, in the pre-training stage.<\/li><li>UniXcoder is a unified cross-modal pre-trained model for programming language that incorporates semantic and syntactic information from code comments and abstract syntax trees (ASTs). 
<\/li><\/ol><\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-1024x454.png\" alt=\"CodeBERT series pre-trained models\" class=\"wp-image-875775\" width=\"799\" height=\"354\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-1024x454.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-300x133.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-768x341.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-1536x681.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851-240x106.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907144851.png 1881w\" sizes=\"auto, (max-width: 799px) 100vw, 799px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>We have established CodeXGLUE, a benchmark for code intelligence that comprises a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE includes 14 datasets for 10 diversified code intelligence tasks.<\/li><li>Beyond the general pre-trained models and datasets, we also explore specific code scenarios in depth, including code completion, code search, and code review. For code completion, we develop eWASH, which uses extended context for code completion; Grammformer, which learns to complete code with sketches; and ReACC, a retrieval-augmented framework. 
We have also developed CodeReviewer for automating code review activities such as review comment generation and code refinement.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"347\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-1024x347.png\" alt=\"Current works\" class=\"wp-image-875796\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-1024x347.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-300x102.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-768x260.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-1536x520.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919-240x81.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2022\/09\/WeChat-Screenshot_20220907162919.png 1903w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n","protected":false},"excerpt":{"rendered":"<p>Apply AI techniques for software engineering. The Code Intelligence project aims to leverage AI techniques to help software developers improve the productivity of the development process. We focus on building large-scale pre-trained models to understand and generate source code. The research directions include pre-trained models for code, benchmark datasets, code completion, code retrieval, code review, etc. 
[&hellip;]<\/p>\n","protected":false},"featured_media":875733,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556,13560],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-875721","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-programming-languages-software-engineering","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[695541,715684,758380,773176,782512,823873,843787,844153],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Yeyun Gong","user_id":39186,"people_section":"Section name 
0","alias":"yegong"}],"msr_research_lab":[199560],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/875721","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":7,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/875721\/revisions"}],"predecessor-version":[{"id":876615,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/875721\/revisions\/876615"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/875733"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=875721"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=875721"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=875721"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=875721"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=875721"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}