{"id":398906,"date":"2017-07-11T10:41:39","date_gmt":"2017-07-11T17:41:39","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=398906"},"modified":"2018-10-16T20:03:24","modified_gmt":"2018-10-17T03:03:24","slug":"unified-stochastic-engine-use-speech-recognition","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/unified-stochastic-engine-use-speech-recognition\/","title":{"rendered":"Unified stochastic engine (USE) for speech recognition"},"content":{"rendered":"<div id=\"LayoutWrapper\">\n<div class=\"ng-scope\">\n<div>\n<div class=\"pure-g document stats-document ng-isolate-scope\">\n<section class=\"tab-pane pure-u-1-1 u-printing-display-inline-ie u-printing-display-inline-ff\">\n<div class=\"ng-scope\">\n<div class=\"ng-scope\">\n<section id=\"319386\" class=\"pure-u-1-1 document-abstract document-tab u-p-2 ng-isolate-scope\">\n<div class=\"pure-g\">\n<div class=\"ng-scope pure-u-1-1\">\n<div class=\"ng-scope\">\n<div class=\"abstract-text ng-binding\">A unified stochastic engine (USE) that jointly optimizes both acoustic and language models is presented. In the USE, not only can one iteratively adjust language probabilities to fit the given acoustic representations, but one can also adjust acoustic models (including feature representation) guided by language constraints. From the language modeling point of view, the USE makes it possible to encode acoustically confusable words in the language probabilities. From the acoustic modeling point of view, the language-constraint approach makes it possible to focus on acoustic words for which language models lack enough discrimination capacity. The authors report preliminary experimental results for Wall Street Journal continuous 5000-word speaker-independent dictation. The error rate is reduced from 7.3% to 6.9% with the proposed method.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A unified stochastic engine (USE) that jointly optimizes both acoustic and language models is presented. In the USE, not only can one iteratively adjust language probabilities to fit the given acoustic representations, but one can also adjust acoustic models (including feature representation) guided by language constraints. From the language modeling point of view, the USE [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1993. ICASSP-93.","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1993. ICASSP-93.","msr_doi":"10.1109\/ICASSP.1993.319386","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"1993-04-27","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"http:\/\/ieeexplore.ieee.org\/document\/319386\/","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13545],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-398906","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_publishername":"","msr_edition":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1993. ICASSP-93.","msr_affiliation":"","msr_published_date":"1993-04-27","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"http:\/\/ieeexplore.ieee.org\/document\/319386\/","msr_doi":"10.1109\/ICASSP.1993.319386","msr_publication_uploader":[{"type":"url","title":"http:\/\/ieeexplore.ieee.org\/document\/319386\/","viewUrl":false,"id":false,"label_id":0},{"type":"doi","title":"10.1109\/ICASSP.1993.319386","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":0,"url":"http:\/\/ieeexplore.ieee.org\/document\/319386\/"}],"msr-author-ordering":[{"type":"user_nicename","value":"xdh","user_id":34869,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=xdh"},{"type":"text","value":"Marie-Th\u00e9r\u00e8se Belin","user_id":0,"rest_url":false},{"type":"user_nicename","value":"fil","user_id":31811,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=fil"},{"type":"user_nicename","value":"mehwang","user_id":32872,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=mehwang"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/398906","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/398906\/revisions"}],"predecessor-version":[{"id":398912,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/398906\/revisions\/398912"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=398906"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=398906"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=398906"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=398906"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=398906"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=398906"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=398906"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=398906"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=398906"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=398906"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=398906"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=398906"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=398906"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}