{"id":783091,"date":"2021-10-11T14:01:00","date_gmt":"2021-10-11T21:01:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-group&#038;p=783091"},"modified":"2022-10-17T13:12:36","modified_gmt":"2022-10-17T20:12:36","slug":"speech-research-team","status":"publish","type":"msr-group","link":"https:\/\/www.microsoft.com\/en-us\/research\/group\/speech-research-team\/","title":{"rendered":"Speech Research Team"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background-catalina-blue card-background--full-bleed\">\n\t\t\t\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 align-self-center\">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-redmond\/\" class=\"icon-link icon-link--reverse mb-2\" data-bi-cN=\"Return to Microsoft Research Lab - Redmond\">\n\t\t\t\t\t\t\t\t\t<span class=\"c-glyph glyph-chevron-left\" aria-hidden=\"true\"><\/span>\n\t\t\t\t\t\t\t\t\tReturn to Microsoft Research Lab &#8211; Redmond\t\t\t\t\t\t\t\t<\/a>\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 id=\"speech-research-team\">Speech Research Team<\/h1>\n\n\n\n<p><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>The Speech Research Team is part of the&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/cognitive-services-research\/\">Azure Cognitive Services Research (CSR) group<\/a> and is responsible for fundamental advances in audio, speech, and spoken language processing 
technologies. We also work closely with engineering and product teams to bring these technologies into Microsoft products.<\/p>\n\n\n\n<p>We work on a wide range of speech processing problems, including speech enhancement, speech recognition, speaker diarization, multi-lingual speech recognition, spoken language understanding, end-to-end modeling, self-supervised learning, and multi-modal modeling. Our recent work covers the following topics:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Deep learning-based real-time speech enhancement<\/li><li>Monaural and multi-channel speech separation for meeting transcription<\/li><li>Ad hoc microphone arrays<\/li><li>End-to-end modeling for speaker-attributed speech recognition<\/li><li>Unified speech representation learning<\/li><li>Speech-language pre-training<\/li><\/ul>\n\n\n\n<p>The results of our work are incorporated into Microsoft speech technologies and woven into various products. We have also contributed to the development of new services, such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/speech-service\/conversation-transcription\" target=\"_blank\" rel=\"noopener noreferrer\">Conversation Transcription<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> in Azure Cognitive Services, which powers the transcription features of several Microsoft products. We received the IEEE Signal Processing Society Conference Best Paper Award for Industry at ICASSP 2022. 
Our work took first place in the speaker diarization track of <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/www.robots.ox.ac.uk\/~vgg\/data\/voxceleb\/competition2020.html\" target=\"_blank\" rel=\"noopener noreferrer\">VoxSRC-20<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (joint work with other Microsoft scientists and Microsoft Research researchers) and achieved breakthrough human-parity performance on the&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/blog\/microsoft-researchers-achieve-new-conversational-speech-recognition-milestone\/\" target=\"_blank\" rel=\"noreferrer noopener\">Switchboard conversational speech recognition task<\/a>.&nbsp;<\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>The Speech Research Team is part of the\u00a0Azure Cognitive Services Research (CSR) group and is responsible for fundamental advances in audio, speech, and spoken language processing technologies.<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_group_start":"","footnotes":""},"research-area":[13556,243062,13545],"msr-group-type":[243694],"msr-locale":[268875],"msr-impact-theme":[],"class_list":["post-783091","msr-group","type-msr-group","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-audio-acoustics","msr-research-area-human-language-technologies","msr-group-type-group","msr-locale-en_us"],"msr_group_start":"","msr_detailed_description":"","msr_further_details":"","msr_hero_images":[],"msr_research_lab":[199565],"related-researchers":[{"type":"guest","display_name":"Kenichi Kumatani","user_id":607155,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Yao Qian","user_id":34976,"people_section":"Section name 
0","alias":"yaoqian"},{"type":"user_nicename","display_name":"Manthan Thakker","user_id":39627,"people_section":"Section name 0","alias":"mathakke"},{"type":"user_nicename","display_name":"Xiaofei Wang","user_id":38658,"people_section":"Section name 0","alias":"xiaofewa"}],"related-publications":[768004,815227,815212,814795,814789,785779,785644,785632,783448,783442,815233,767797,767779,767728,764521,764509,755719,747511,747499,741088,897855,1060068,1060059,1060050,898116,898095,898047,897990,897963,897942,741082,897849,897843,897831,897795,897789,897783,897684,864003,815242,557775,626688,626682,626367,608625,605619,595954,583603,578896,578824,630372,557769,502496,480237,480222,480210,480201,480186,480174,480144,704635,741076,726376,723043,722401,722395,712249,704656,704647,704641,467697,703255,702925,701653,701647,701485,683817,669003,650463,649689],"related-downloads":[],"related-videos":[],"related-projects":[],"related-events":[],"related-opportunities":[],"related-posts":[1083645],"tab-content":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/783091","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-group"}],"version-history":[{"count":36,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/783091\/revisions"}],"predecessor-version":[{"id":934074,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group\/783091\/revisions\/934074"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=783091"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=783091"},{"taxonomy":"msr-group-type","embeddable":true,"href":"https:\/\/
www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-group-type?post=783091"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=783091"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=783091"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}