{"id":152302,"date":"1993-05-01T00:00:00","date_gmt":"1993-05-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/automatically-deriving-structured-knowledge-bases-from-on-line-dictionaries\/"},"modified":"2018-10-16T22:05:36","modified_gmt":"2018-10-17T05:05:36","slug":"automatically-deriving-structured-knowledge-bases-from-on-line-dictionaries","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/automatically-deriving-structured-knowledge-bases-from-on-line-dictionaries\/","title":{"rendered":"Automatically Deriving Structured Knowledge Bases From On-Line Dictionaries"},"content":{"rendered":"<div class=\"asset-content\">\n<p>We propose combining dictionary-based and example-based natural language (NL) processing techniques in a framework that we believe will provide substantive enhancements to NL analysis systems. The centerpiece of this framework is a relatively large-scale lexical knowledge base that we have constructed automatically from an online version of Longman&#8217;s Dictionary of Contemporary English (LDOCE), and that is currently used in our NL analysis system to direct phrasal attachments. After discussing the effective use of example-based processing in hybrid NL systems, we compare recent dictionary-based and example-based work, and identify the aspects of this work that are included in the proposed framework. We then describe the methods employed in automatically creating our lexical knowledge base from LDOCE, and its current and planned use as a large-scale example base in our NL analysis system. This knowledge base is structured as a highly interconnected network of words linked by semantic relations such as is_a, has_part, location_of, typical_object, and is_for. 
We claim that within the proposed hybrid framework, it provides a uniquely rich source of information for use during NL analysis.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We propose combining dictionary-based and example-based natural language (NL) processing techniques in a framework that we believe will provide substantive enhancements to NL analysis systems. The centerpiece of this framework is a relatively large-scale lexical knowledge base that we have constructed automatically from an online version of Longman&#8217;s Dictionary of Contemporary English (LDOCE), and that [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"The Fifth International Conference on Theoretical and Methodological Issues in Machine Translation with special emphasis on: MT in the Next Generation","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"MSR-TR-93-07","msr_organization":"","msr_pages_string":"15","msr_page_range_start":"15","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"William Dolan, Stephen D. 
Richardson","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"Microsoft Research","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"1993-07-14","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":1993,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13545],"msr-publication-type":[193718],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-152302","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_publishername":"","msr_edition":"The Fifth International Conference on Theoretical and Methodological Issues in Machine Translation with special emphasis on: MT in the Next 
Generation","msr_affiliation":"","msr_published_date":"1993-07-14","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"15","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"MSR-TR-93-07","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"303833","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"tmi-1993-richardson","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/1993\/05\/TMI-1993-Richardson.pdf","id":303833,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":303833,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/1993\/05\/TMI-1993-Richardson.pdf"}],"msr-author-ordering":[{"type":"text","value":"Stephen D. 
Richardson","user_id":0,"rest_url":false},{"type":"user_nicename","value":"lucyv","user_id":32746,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=lucyv"},{"type":"text","value":"William Dolan","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[169675],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"techreport","related_content":{"projects":[{"ID":169675,"post_title":"MindNet","post_name":"mindnet","post_type":"msr-project","post_date":"2001-12-19 17:44:32","post_modified":"2019-08-14 14:34:33","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/mindnet\/","post_excerpt":"Overview MindNet is a knowledge representation project that uses our broad-coverage parser to build semantic networks from dictionaries, encyclopedias, and free text. MindNets are produced by a fully automatic process that takes the input text, sentence-breaks it, parses each sentence to build a semantic dependency graph (Logical Form), aggregates these individual graphs into a single large graph, and then assigns probabilistic weights to subgraphs based on their frequency in the corpus as a whole. 
The&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/169675"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152302","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152302\/revisions"}],"predecessor-version":[{"id":541995,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/152302\/revisions\/541995"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=152302"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=152302"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=152302"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=152302"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=152302"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=152302"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=152302"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=152302"},{"taxonomy":"msr-field-of-study",
"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=152302"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=152302"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=152302"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=152302"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=152302"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}