{"id":188668,"date":"2012-11-02T00:00:00","date_gmt":"2012-11-05T19:08:12","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/information-extraction-crossing-language-robustness-and-domain-barriers\/"},"modified":"2016-08-02T06:11:21","modified_gmt":"2016-08-02T13:11:21","slug":"information-extraction-crossing-language-robustness-and-domain-barriers","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/information-extraction-crossing-language-robustness-and-domain-barriers\/","title":{"rendered":"Information Extraction Crossing Language, Robustness and Domain Barriers"},"content":{"rendered":"<div class=\"asset-content\">\n<p>Modern communication technologies have made massive amounts of real-time news information in several languages readily available.  This led to the need to develop news-monitoring system that allows users to monitor multilingual news media in near real-time and search over stored content. One example of such a system is Translingual Automatic Language Exploration System, codenamed TALES.  In this talk I will briefly describe the architecture of TALES and focus on its information extraction component.  Information extraction is a crucial step toward understanding a text, as it identifies the important conceptual objects and relations between them in a discourse. I will address the portability of the used approach to different languages and show a method of propagating information into low resource languages from richer ones. Compared to other approaches that focuses on clean-text, I will also show the robustness of our technique to less-well-formed input.  For example, information extraction in a multilingual broadcast processing system has to deal with inaccurate automatic transcription and translation.  The resulting presence of non-target-language text in this case yields many false alarms, which raise the research problem of making information extraction robust to such noisy input text. If time permit, I will also discuss the application and adaptation of these techniques to health-care domain.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Modern communication technologies have made massive amounts of real-time news information in several languages readily available. This led to the need to develop news-monitoring system that allows users to monitor multilingual news media in near real-time and search over stored content. One example of such a system is Translingual Automatic Language Exploration System, codenamed TALES. [&hellip;]<\/p>\n","protected":false},"featured_media":197281,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[206954],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-188668","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-video-type-microsoft-research-talks","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/IloalRDpDjo","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/188668","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/188668\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/197281"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=188668"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=188668"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=188668"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=188668"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=188668"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=188668"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=188668"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=188668"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=188668"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=188668"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}