{"id":144931,"date":"2020-02-27T20:34:30","date_gmt":"2015-01-30T00:14:18","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/group\/deep-learning-technology-center\/"},"modified":"2025-10-31T13:56:57","modified_gmt":"2025-10-31T20:56:57","slug":"deep-learning-group","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/group\/deep-learning-group\/","title":{"rendered":"Deep Learning Group"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background-grey card-background--full-bleed\">\n\t\t\t\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-redmond\/\" class=\"icon-link icon-link--reverse mb-2\" data-bi-cN=\"Return to Microsoft Research Lab - Redmond\">\n\t\t\t\t\t\t\t\t\t<span class=\"c-glyph glyph-chevron-left\" aria-hidden=\"true\"><\/span>\n\t\t\t\t\t\t\t\t\tReturn to Microsoft Research Lab &#8211; Redmond\t\t\t\t\t\t\t\t<\/a>\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading h2\" id=\"deep-learning-group\">Deep Learning Group<\/h1>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>The Deep Learning group&#8217;s mission is to advance the state-of-the-art on deep learning and its application to natural language processing, computer vision, multi-modal intelligence, and for making progress on conversational AI. Our research interests are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Neural language modeling for natural language understanding and generation. Some ongoing projects are <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/namisan\/mt-dnn\" target=\"_blank\" rel=\"noopener noreferrer\">MT-DNN<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/unilm\" target=\"_blank\" rel=\"noopener noreferrer\">UniLM<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-deberta-surpasses-human-performance-on-the-superglue-benchmark\/\" target=\"_blank\" rel=\"noreferrer noopener\">DeBERTa<\/a>, question-answering, long text generation, etc.<\/li>\n\n\n\n<li>Neural symbolic computing. We are developing next-generation architectures to bridge gap between neural and symbolic representations with\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/next-generation-architectures-bridge-gap-between-neural-and-symbolic-representations-with-neural-symbols\/\">neural symbols<\/a>. Some ongoing projects are relational encoding using Tensor-Product Representations, AI for Code, etc.<\/li>\n\n\n\n<li>Vision-language grounding and understanding. Some ongoing projects are <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2204.03610\">UniCL<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/vinvl-advancing-the-state-of-the-art-for-vision-language-models\/\" target=\"_blank\" rel=\"noreferrer noopener\">VinVL<\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/Oscar\" target=\"_blank\" rel=\"noopener noreferrer\">OSCAR<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/LuoweiZhou\/VLP\" target=\"_blank\" rel=\"noopener noreferrer\">vision-language pre-training<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, vision language navigation, image editing and generation, image commenting and captioning, etc.<\/li>\n\n\n\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/1809.08267\" target=\"_blank\" rel=\"noopener noreferrer\">Conversational AI<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Some ongoing projects are <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/ConversationLearner-Samples\" target=\"_blank\" rel=\"noopener noreferrer\">conversation learner<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/2005.05298\" target=\"_blank\" rel=\"noopener noreferrer\">SOLOIST<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> which enable dialog authors to build task-oriented dialog systems at scale via machine teaching and transfer learning, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/ConvLab\/ConvLab\" target=\"_blank\" rel=\"noopener noreferrer\">ConvLab<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> which is an open-source multi-domain dialog system platform, and response generation for social bots such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/1812.08989\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft XiaoIce<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, etc.<\/li>\n\n\n\n<li>Fundamental research in understanding and scaling large neural networks. For example, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/tuning-large-neural-networks-via-zero-shot-hyperparameter-transfer\/\">maximal update parametrization (\u00b5P) and \u00b5Transfer<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/feature-learning-in-infinite-width-neural-networks\/\">the feature learning limit of neural networks<\/a>, and, more generally, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/gregyang\/\">the theory of Tensor Programs<\/a>.<\/li>\n<\/ul>\n\n\n","protected":false},"excerpt":{"rendered":"<p>The Deep Learning group advances the state-of-the-art in deep learning\u00a0to achieve general intelligence. We\u00a0develop algorithms, models, and systems\u00a0in deep supervised and unsupervised learning, deep reinforcement learning,\u00a0and neural-symbolic reasoning, and then pursue breakthroughs in computer vision, natural language processing,\u00a0multimodal intelligence, time series analysis, and other relevant areas.<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_group_start":"","footnotes":""},"research-area":[13561,13556,13562,13545],"msr-group-type":[243694],"msr-locale":[268875],"msr-impact-theme":[],"class_list":["post-144931","msr-group","type-msr-group","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-research-area-human-language-technologies","msr-group-type-group","msr-locale-en_us"],"msr_group_start":"","msr_detailed_description":"","msr_further_details":"","msr_hero_images":[],"msr_research_lab":[199565],"related-researchers":[{"type":"guest","display_name":"Jyoti Aneja","user_id":700582,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Harkirat Behl","user_id":41548,"people_section":"Section name 0","alias":"hbehl"},{"type":"guest","display_name":"Yonatan Bisk","user_id":788162,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Hao Cheng","user_id":39922,"people_section":"Section name 0","alias":"chehao"},{"type":"user_nicename","display_name":"Michel Galley","user_id":32887,"people_section":"Section name 0","alias":"mgalley"},{"type":"user_nicename","display_name":"Jianfeng Gao","user_id":32246,"people_section":"Section name 0","alias":"jfgao"},{"type":"user_nicename","display_name":"Lars Liden","user_id":32612,"people_section":"Section name 0","alias":"laliden"},{"type":"user_nicename","display_name":"Xiaodong Liu","user_id":34877,"people_section":"Section name 0","alias":"xiaodl"},{"type":"user_nicename","display_name":"Swadheen Shukla","user_id":38248,"people_section":"Section name 0","alias":"swads"},{"type":"user_nicename","display_name":"Chandan Singh","user_id":42126,"people_section":"Section name 0","alias":"chansingh"},{"type":"user_nicename","display_name":"Paul Smolensky","user_id":36353,"people_section":"Section name 0","alias":"psmo"},{"type":"guest","display_name":"Ryuichi Takanobu","user_id":646683,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Andrea Tupini","user_id":40339,"people_section":"Section name 0","alias":"andreatupini"},{"type":"user_nicename","display_name":"Chenglong Wang","user_id":41251,"people_section":"Section name 0","alias":"chenwang"}],"related-publications":[593410,591757,570588,572352,553113,591013,506918,570432,508631,502184,552729,490436,574707,559176,506933,506330,551151,482721,500372,466368,493946,466362,495725,506924,488027,328118,437235,481140,473007,495689,438609,454764,424101,435120,435126,369872,498779,377222,397103,398654,294695,383546,369965,383549,376403,354179,391253,391196,400139,393209,374186,372953,340424,380765,391262,339806,309554,328361,324893,391292,391277,314969,294440,294713,294722,294719,299276,238331,270507,283535,238203,238137,314984,238121,242609,242621,237177,238157,391304,238284,238202,238204,215435,215434,238173,215430,168904,365066,354209,238152,168801,215140,311459,168759,168302,168551,168020,168486,168443,167899,168021,168090,168019,167946,340433,167876,167045,167739,166529,166513,166451,166669,166530,166528,165896,165215,164901,164452,164450,163543,162865,364730,364694,364721,608640,613824,626973,626988,628500,630861,633303,640596,645273,645300,647163,650868,650883,654966,672750,673644,676965,680259,680343,681282,701698,706384,706405,707338,707347,715066,716989,717001,717016,717262,725137,741730,744436,751342,766183,783775,785344,788207,788249,804010,810415,821809,838312,841159,846181,847417,847423,847432,847438,847444,847483,847984,858045,860352,879321,881913,883905,885843,888294,922803,922809,936855,940530,940548,944574,966210,966927,982488,998313,1004331,1004340,1004352,1004367,1004373,1005141,1007064,1030533,1093152,1114974,1126704,1128837,1136383,1141355,1154630,1166856],"related-downloads":[],"related-videos":[301163,775504,1132209,777859,777850,775894,775786,775777,775513,689436,775492,775480,775438,775429,755203,738595,689448],"related-projects":[931254,890049,847462,804847,810817,811012,811027,811039,466380,425502,394646,572997,377990,171217],"related-events":[394283,394319],"related-opportunities":[1154500,1158459,1163417],"related-posts":[416867,444303,456975,728458,1136909,1101105,1008426,989523,965166,895428,851928,823648,754636,488363,717256,715399,696898,657990,648279,632745,626169,610734,592603,578602,573408,564300],"tab-content":[{"id":0,"name":"Highlights","content":"<ul>\r\n \t<li>Three papers accepted to\u00a0<a href=\"https:\/\/2017.icml.cc\/\" target=\"_blank\" rel=\"noopener\">ICML 17<\/a><\/li>\r\n \t<li>Three papers accepted\u00a0to <a href=\"http:\/\/cvpr2017.thecvf.com\/\" target=\"_blank\" rel=\"noopener\">CVPR 17<\/a> and our team which included University of Adelaide and Australian National University won 1st place in the <a href=\"http:\/\/www.visualqa.org\/challenge.html\" target=\"_blank\" rel=\"noopener\">Visual Question Answering Challenge 2017<\/a>. See the <a href=\"http:\/\/www.visualqa.org\/roe_2017.html\" target=\"_blank\" rel=\"noopener\">leaderboard <\/a>and <a href=\"https:\/\/x.com\/MSFTResearch\/status\/890348749746089984\" target=\"_blank\" rel=\"noopener\">MSFTResearch Twitter<\/a>.<\/li>\r\n \t<li>One paper accepted to <a href=\"http:\/\/acl2017.org\/\" target=\"_blank\" rel=\"noopener\">ACL 17<\/a><\/li>\r\n \t<li>Two papers accepted to <a href=\"http:\/\/www.iclr.cc\/doku.php?id=ICLR2017:main&amp;redirect=1\" target=\"_blank\" rel=\"noopener\">ICLR 17<\/a><\/li>\r\n \t<li>Released the image captioning service in <a href=\"https:\/\/www.microsoft.com\/cognitive-services\/en-us\/computer-vision-api\" target=\"_blank\" rel=\"noopener\">Microsoft Cognitive Services\/Vision API<\/a>. You can try the <em>CaptionBot<\/em> demo\u00a0at <a href=\"http:\/\/captionbot.ai\/\" target=\"_blank\" rel=\"noopener\">http:\/\/CaptionBot.ai<\/a>,\u00a0or <a href=\"https:\/\/bots.botframework.com\/bot?id=captionbot\" target=\"_blank\" rel=\"noopener\">add it to\u00a0your Skype<\/a> and have a photo chat.\u00a0The technology is based on our competition winning deep learning multimodal intelligence\u00a0models.<\/li>\r\n \t<li>Our\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/from-captions-to-visual-concepts-and-back\/\" target=\"_blank\" rel=\"noopener\">Microsoft Research\u00a0entry<\/a>\u00a0won 1st Prize, tied with Google, at the\u00a0<a href=\"http:\/\/mscoco.org\/dataset\/#cap2015\" target=\"_blank\" rel=\"noopener\">MS COCO Captioning Challenge 2015<\/a>, achieved the highest\u00a0score in\u00a0the Turing Test among all submissions. More details in the CVPR\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/from-captions-to-visual-concepts-and-back\/\" target=\"_blank\" rel=\"noopener\">paper<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/from-captions-to-visual-concepts-and-back\/\" target=\"_blank\" rel=\"noopener\">demo<\/a>,\u00a0<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/Xiaodong_He_ImageCaptioning_InvitedTalk.pdf\" target=\"_blank\" rel=\"noopener\">relevant talk<\/a>, and recent media coverage by\u00a0<a href=\"http:\/\/blogs.microsoft.com\/next\/2015\/05\/28\/picture-this-microsoft-research-project-can-interpret-caption-photos\/\" target=\"_blank\" rel=\"noopener\">Microsoft blog<\/a>,\u00a0<a href=\"http:\/\/blogs.technet.com\/b\/inside_microsoft_research\/archive\/2015\/06\/11\/microsoft-researchers-tie-for-best-image-captioning-technology.aspx\" target=\"_blank\" rel=\"noopener\">TechNet<\/a>,\u00a0<a href=\"http:\/\/www.slashgear.com\/microsoft-auto-photo-captioning-research-has-eye-on-ai-28385605\/\" target=\"_blank\" rel=\"noopener\">SlashGear<\/a>,\u00a0<a href=\"http:\/\/www.engadget.com\/2015\/05\/28\/microsoft-imaging-caption\/\" target=\"_blank\" rel=\"noopener\">Engadget<\/a>,\u00a0<a href=\"http:\/\/venturebeat.com\/2015\/06\/09\/google-ties-with-microsoft-in-microsofts-own-contest-for-generating-image-captions\/\" target=\"_blank\" rel=\"noopener\">ventureBeat<\/a>,\u00a0<a href=\"http:\/\/www.androidheadlines.com\/2015\/06\/google-and-microsoft-tie-in-robotic-caption-contest.html\" target=\"_blank\" rel=\"noopener\">androidHeadlines<\/a>.<\/li>\r\n<\/ul>"}],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/144931","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-group"}],"version-history":[{"count":40,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/144931\/revisions"}],"predecessor-version":[{"id":1154375,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/144931\/revisions\/1154375"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=144931"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=144931"},{"taxonomy":"msr-group-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group-type?post=144931"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=144931"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=144931"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}