{"id":5145,"date":"2017-02-03T10:35:55","date_gmt":"2017-02-03T18:35:55","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/translation\/?p=5145"},"modified":"2017-02-03T10:35:55","modified_gmt":"2017-02-03T18:35:55","slug":"microsoft-translator-publicly-releases-speech-translation-corpus","status":"publish","type":"post","link":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/","title":{"rendered":"Microsoft Translator publicly releases speech translation corpus"},"content":{"rendered":"<table align=\"left\">\n<tbody>\n<tr>\n<td><span style=\"color: #000000;font-family: Calibri\"><a href=\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2017\/02\/Christian_Federmann.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"312\" height=\"340\" class=\"size-full wp-image-5155 alignleft\" alt=\"christian_federmann\" src=\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2017\/02\/Christian_Federmann.jpg\" \/><\/a><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-size: 8pt\">Christian Federmann, senior program manager<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"color: #000000;font-family: Calibri\">As part of an ongoing effort within Microsoft to improve the accuracy of artificial intelligence (AI) systems, Microsoft Translator is publicly releasing a set of data that includes multiple conversations between bilingual speakers who are speaking French, German and English.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">This corpus, which was produced by Microsoft using bilingual speakers, aims to create a standard by which people can measure how well their conversational speech translation systems work. 
It can serve as a standardized data set for testing bilingual conversational speech translation systems such as the <\/span><a target=\"_blank\" href=\"https:\/\/translator.microsoft.com\/\"><span style=\"color: #0563c1;font-family: Calibri\">Microsoft Translator live feature<\/span><\/a><span style=\"color: #000000;font-family: Calibri\"> and <\/span><a target=\"_blank\" href=\"https:\/\/www.skype.com\/en\/features\/skype-translator\/\"><span style=\"color: #0563c1;font-family: Calibri\">Skype Translator<\/span><\/a><span style=\"color: #000000;font-family: Calibri\">.<\/span><\/p>\n<p><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrife\/\"><span style=\"color: #0563c1;font-family: Calibri\">Christian Federmann<\/span><\/a><span style=\"color: #000000;font-family: Calibri\">, a senior program manager working with the Microsoft Translator team, said there aren\u2019t many standardized data sets for testing bilingual conversational speech translation systems. 
\u201cYou need high-quality data in order to have high-quality testing,\u201d Federmann said.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">The Microsoft team hopes the corpus, which is freely available, will benefit the entire field of conversational translation and help to create more standardized benchmarks that researchers can use to measure their work against others.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">\u201cThis helps propel the field forward,\u201d said <\/span><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/wilewis\/\"><span style=\"color: #0563c1;font-family: Calibri\">Will Lewis<\/span><\/a><span style=\"color: #000000;font-family: Calibri\">, a principal technical program manager with the Microsoft Translator team who also worked on the project.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">Download the Microsoft Speech Language Translation corpus <\/span><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/download\/details.aspx?id=54689\"><span style=\"color: #0563c1;font-family: Calibri\">here<\/span><\/a><span style=\"color: #000000;font-family: Calibri\">.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">Learn more about this release as well as other ways Microsoft is working to make AI smarter and more accurate in the <\/span><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/ai-getting-smarter-microsoft-researchers-ensure-ai-accuracy\"><span style=\"color: #0563c1;font-family: Calibri\">Microsoft Research blog<\/span><\/a><span style=\"color: #000000;font-family: Calibri\">.<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">\u00a0<\/span><\/p>\n<p><span style=\"color: #000000;font-family: Calibri\">Learn More<\/span><\/p>\n<ul>\n<li><a target=\"_blank\" href=\"http:\/\/workshop2016.iwslt.org\/downloads\/IWSLT_2016_paper_12.pdf\"><span style=\"color: 
#0563c1;font-family: Calibri\">Research paper: Microsoft Speech Language Translation (MSLT) Corpus: The IWSLT 2016 release for English, French and German<\/span><\/a><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/translator\/mt.aspx\"><span style=\"color: #0563c1;font-family: Calibri\">How machine translation works<\/span><\/a><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/translator.microsoft.com\/\"><span style=\"color: #0563c1;font-family: Calibri\">Try speech translation in the Microsoft Translator live feature<\/span><\/a><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/www.microsoft.com\/en-us\/translator\/speech.aspx\"><span style=\"color: #0563c1;font-family: Calibri\">Microsoft Translator Speech Translation API<\/span><\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Christian Federmann, senior program manager As part of an ongoing effort within Microsoft to improve the accuracy of artificial intelligence (AI) systems, Microsoft Translator is publicly releasing a set of data that includes multiple conversations between bilingual speakers who are speaking French, German and English. 
This corpus, which was produced by Microsoft using bilingual speakers, aims to create a standard<span class=\"read-more-ellipsis\">&#8230;.<\/span><\/p>\n <p class=\"c-paragraph-3 read-more-link\"><a class=\"c-call-to-action c-glyph f-lightweight\" href=\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\">CONTINUE READING <span class=\"x-screen-reader\">\"Microsoft Translator publicly releases speech translation corpus\"<\/span><\/a><\/p>","protected":false},"author":54,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-5145","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"acf":[],"yoast_head":"<title>Microsoft Translator publicly releases speech translation corpus - Microsoft Translator Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Microsoft Translator publicly releases speech translation corpus - Microsoft Translator Blog\" \/>\n<meta property=\"og:description\" content=\"Christian Federmann, senior program manager As part of an ongoing effort within Microsoft to improve the accuracy of artificial intelligence (AI) systems, Microsoft Translator is publicly releasing a set of data that includes multiple conversations between bilingual speakers who are speaking French, German and English. 
This corpus, which was produced by Microsoft using bilingual speakers, aims to create a standard....\" \/>\n<meta property=\"og:url\" content=\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Translator Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsofttranslator\" \/>\n<meta property=\"article:published_time\" content=\"2017-02-03T18:35:55+00:00\" \/>\n<meta property=\"og:image\" content=\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2017\/02\/Christian_Federmann.jpg\" \/>\n<meta name=\"author\" content=\"Microsoft Translator\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@mstranslator\" \/>\n<meta name=\"twitter:site\" content=\"@mstranslator\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Microsoft Translator\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#article\",\"isPartOf\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\"},\"author\":{\"name\":\"Microsoft Translator\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37\"},\"headline\":\"Microsoft Translator publicly releases speech translation corpus\",\"datePublished\":\"2017-02-03T18:35:55+00:00\",\"dateModified\":\"2017-02-03T18:35:55+00:00\",\"mainEntityOfPage\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\"},\"wordCount\":274,\"publisher\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\",\"name\":\"Microsoft Translator publicly releases speech translation corpus - Microsoft Translator 
Blog\",\"isPartOf\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#website\"},\"datePublished\":\"2017-02-03T18:35:55+00:00\",\"dateModified\":\"2017-02-03T18:35:55+00:00\",\"breadcrumb\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https://www.microsoft.com\/en-us\/translator/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Microsoft Translator publicly releases speech translation corpus\"}]},{\"@type\":\"WebSite\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#website\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/\",\"name\":\"Microsoft Translator Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https://www.microsoft.com\/en-us\/translator/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\",\"name\":\"Microsoft 
Corporation\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png\",\"contentUrl\":\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png\",\"width\":300,\"height\":300,\"caption\":\"Microsoft Corporation\"},\"image\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.youtube.com\/playlist?list=PLD7HFcN7LXRd4kd2XgZjIbQ8TwTC32Zc9\",\"https:\/\/www.facebook.com\/microsofttranslator\",\"https:\/\/twitter.com\/mstranslator\"]},{\"@type\":\"Person\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37\",\"name\":\"Microsoft Translator\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g\",\"caption\":\"Microsoft Translator\"},\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/author\/mtteam\/\"}]}<\/script>","yoast_head_json":{"title":"Microsoft Translator publicly releases speech translation corpus - Microsoft Translator 
Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/","og_locale":"en_US","og_type":"article","og_title":"Microsoft Translator publicly releases speech translation corpus - Microsoft Translator Blog","og_description":"Christian Federmann, senior program manager As part of an ongoing effort within Microsoft to improve the accuracy of artificial intelligence (AI) systems, Microsoft Translator is publicly releasing a set of data that includes multiple conversations between bilingual speakers who are speaking French, German and English. This corpus, which was produced by Microsoft using bilingual speakers, aims to create a standard....","og_url":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/","og_site_name":"Microsoft Translator Blog","article_publisher":"https:\/\/www.facebook.com\/microsofttranslator","article_published_time":"2017-02-03T18:35:55+00:00","og_image":[{"url":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2017\/02\/Christian_Federmann.jpg"}],"author":"Microsoft Translator","twitter_card":"summary_large_image","twitter_creator":"@mstranslator","twitter_site":"@mstranslator","twitter_misc":{"Written by":"Microsoft Translator","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#article","isPartOf":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/"},"author":{"name":"Microsoft Translator","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37"},"headline":"Microsoft Translator publicly releases speech translation corpus","datePublished":"2017-02-03T18:35:55+00:00","dateModified":"2017-02-03T18:35:55+00:00","mainEntityOfPage":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/"},"wordCount":274,"publisher":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization"},"inLanguage":"en-US"},{"@type":"WebPage","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/","url":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/","name":"Microsoft Translator publicly releases speech translation corpus - Microsoft Translator 
Blog","isPartOf":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#website"},"datePublished":"2017-02-03T18:35:55+00:00","dateModified":"2017-02-03T18:35:55+00:00","breadcrumb":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/"]}]},{"@type":"BreadcrumbList","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2017\/02\/03\/microsoft-translator-publicly-releases-speech-translation-corpus\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https://www.microsoft.com\/en-us\/translator/blog\/"},{"@type":"ListItem","position":2,"name":"Microsoft Translator publicly releases speech translation corpus"}]},{"@type":"WebSite","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#website","url":"https://www.microsoft.com\/en-us\/translator/blog\/","name":"Microsoft Translator Blog","description":"","publisher":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https://www.microsoft.com\/en-us\/translator/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization","name":"Microsoft 
Corporation","url":"https://www.microsoft.com\/en-us\/translator/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/","url":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png","contentUrl":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png","width":300,"height":300,"caption":"Microsoft Corporation"},"image":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.youtube.com\/playlist?list=PLD7HFcN7LXRd4kd2XgZjIbQ8TwTC32Zc9","https:\/\/www.facebook.com\/microsofttranslator","https:\/\/twitter.com\/mstranslator"]},{"@type":"Person","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37","name":"Microsoft Translator","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g","caption":"Microsoft 
Translator"},"url":"https://www.microsoft.com\/en-us\/translator/blog\/author\/mtteam\/"}]}},"_links":{"self":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts\/5145","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/users\/54"}],"replies":[{"embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/comments?post=5145"}],"version-history":[{"count":0,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts\/5145\/revisions"}],"wp:attachment":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/media?parent=5145"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/categories?post=5145"},{"taxonomy":"post_tag","embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/tags?post=5145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}