{"id":714694,"date":"2020-12-30T03:10:23","date_gmt":"2020-12-30T11:10:23","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=714694"},"modified":"2021-03-24T19:03:33","modified_gmt":"2021-03-25T02:03:33","slug":"learning-latent-semantic-annotations-for-grounding-natural-language-to-structured-data","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-latent-semantic-annotations-for-grounding-natural-language-to-structured-data\/","title":{"rendered":"Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data"},"content":{"rendered":"<p>Previous work on grounded language learning did not fully capture the semantics underlying the correspondences between structured world state representations and texts, especially those between numerical values and lexical terms. In this paper, we attempt at learning explicit latent semantic annotations from paired structured tables and texts, establishing correspondences between various types of values and texts. We model the joint probability of data fields, texts, phrasal spans, and latent annotations with an adapted semi-hidden Markov model, and impose a soft statistical constraint to further improve the performance. As a by-product, we leverage the induced annotations to extract templates for language generation. Experimental results suggest the feasibility of the setting in this study, as well as the effectiveness of our proposed framework.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Previous work on grounded language learning did not fully capture the semantics underlying the correspondences between structured world state representations and texts, especially those between numerical values and lexical terms. In this paper, we attempt at learning explicit latent semantic annotations from paired structured tables and texts, establishing correspondences between various types of values and [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"Association for Computational Linguistics","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"3761","msr_page_range_end":"3771","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"Empirical Methods in Natural Language Processing","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"2891487602","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2018-1-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13545,13555],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246694,246691,248863,248845,248866,248857,248854,246805,246808,248851,248860],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-714694","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-field-of-study-artificial-intelligence","msr-field-of-study-computer-science","msr-field-of-study-data-field","msr-field-of-study-data-model","msr-field-of-study-joint-probability-distribution","msr-field-of-study-language-acquisition","msr-field-of-study-markov-model","msr-field-of-study-natural-language","msr-field-of-study-natural-language-processing","msr-field-of-study-semantics","msr-field-of-study-value-type"],"msr_publishername":"Association for Computational Linguistics","msr_edition":"","msr_affiliation":"","msr_published_date":"2018-1-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/www.aclweb.org\/anthology\/D18-1411.pdf","label_id":"243132","label":0},{"type":"doi","viewUrl":"false","id":"false","title":"10.18653\/V1\/D18-1411","label_id":"243106","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/dblp.uni-trier.de\/db\/conf\/emnlp\/emnlp2018.html#QinYWWL18","label_id":"243109","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/dx.doi.org\/10.18653\/v1\/d18-1411","label_id":"243109","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/www.aclweb.org\/anthology\/D18-1411\/","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Guanghui Qin","user_id":0,"rest_url":false},{"type":"text","value":"Jin-Ge Yao","user_id":0,"rest_url":false},{"type":"text","value":"Xuening Wang","user_id":0,"rest_url":false},{"type":"text","value":"Jinpeng Wang","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Chin-Yew Lin","user_id":31493,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Chin-Yew Lin"}],"msr_impact_theme":[],"msr_research_lab":[199560],"msr_event":[],"msr_group":[144919],"msr_project":[717085,714646],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":717085,"post_title":"Data2Text: Automated Text Generation from Structured Data","post_name":"data2text-automated-text-generation-from-structured-data","post_type":"msr-project","post_date":"2021-01-13 07:51:45","post_modified":"2021-01-15 00:46:14","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/data2text-automated-text-generation-from-structured-data\/","post_excerpt":"The Data2Text project aims to automatically generate fluent and fact-based descriptions or utterances given a data table. Typical business applications for text generation include the generation of financial and sports news stories, the generation of product descriptions, the analysis and interpretation of business data, and the analysis and interpretation of Internet of Things data, etc. Figure 1 gives an example of the automatic generation of weather forecasts. Figure 1a is a structured weather data collected&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/717085"}]}},{"ID":714646,"post_title":"VERT: Versatile Entity Recognition &amp; Disambiguation Toolkit","post_name":"vert-versatile-entity-recognition-disambiguation-toolkit","post_type":"msr-project","post_date":"2020-12-30 02:54:35","post_modified":"2021-10-13 21:15:01","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/vert-versatile-entity-recognition-disambiguation-toolkit\/","post_excerpt":"While knowledge about entities is a key building block in the mentioned systems, creating effective\/efficient models for real-world scenarios remains a challenge (tech\/data\/real workloads). Based on such needs, we've created VERT - a Versatile Entity Recognition &amp; Disambiguation Toolkit. VERT is a pragmatic toolkit that combines rules and ML, offering both powerful pretrained models for core entity types (recognition and linking) and the easy creation of custom models. Custom models use our deep learning-based NER\/EL&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/714646"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714694","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714694\/revisions"}],"predecessor-version":[{"id":714697,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/714694\/revisions\/714697"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=714694"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=714694"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=714694"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=714694"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=714694"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=714694"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=714694"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=714694"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=714694"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=714694"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=714694"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=714694"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=714694"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}