Automatically Deriving Structured Knowledge Bases From On-Line Dictionaries

  • Stephen D. Richardson
  • Lucy Vanderwende
  • William Dolan

MSR-TR-93-07

We propose combining dictionary-based and example-based natural language (NL) processing techniques in a framework that we believe will provide substantive enhancements to NL analysis systems. The centerpiece of this framework is a relatively large-scale lexical knowledge base that we have constructed automatically from an online version of the Longman Dictionary of Contemporary English (LDOCE), and that is currently used in our NL analysis system to direct phrasal attachments. After discussing the effective use of example-based processing in hybrid NL systems, we compare recent dictionary-based and example-based work, and identify the aspects of this work that are included in the proposed framework. We then describe the methods employed in automatically creating our lexical knowledge base from LDOCE, and its current and planned use as a large-scale example base in our NL analysis system. This knowledge base is structured as a highly interconnected network of words linked by semantic relations such as is_a, has_part, location_of, typical_object, and is_for. We claim that within the proposed hybrid framework, it provides a uniquely rich source of information for use during NL analysis.
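
To make the network structure concrete, the following is a minimal sketch of one way a set of words linked by labeled semantic relations could be stored and queried. It is an illustration only, not the representation used in the system described here: the class name `LexicalNetwork`, its methods, and the example triples are all hypothetical, and the triples are hand-written for the sketch rather than extracted from LDOCE. Only the five relation names come from the abstract.

```python
from collections import defaultdict

# Relation labels named in the abstract. Everything else in this sketch
# (class, method names, storage scheme) is an assumption for illustration.
RELATIONS = {"is_a", "has_part", "location_of", "typical_object", "is_for"}


class LexicalNetwork:
    """Toy directed graph of words linked by labeled semantic relations."""

    def __init__(self):
        # head word -> relation -> set of tail words
        self._out = defaultdict(lambda: defaultdict(set))
        # tail word -> relation -> set of head words (inverse index)
        self._in = defaultdict(lambda: defaultdict(set))

    def add(self, head, relation, tail):
        """Record one labeled link, e.g. add("car", "is_a", "vehicle")."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._out[head][relation].add(tail)
        self._in[tail][relation].add(head)

    def related(self, head, relation):
        """Words reached from `head` by following one `relation` link."""
        return set(self._out.get(head, {}).get(relation, ()))

    def sources(self, tail, relation):
        """Words that point at `tail` through one `relation` link."""
        return set(self._in.get(tail, {}).get(relation, ()))


# Hand-written example triples (hypothetical, not taken from LDOCE).
kb = LexicalNetwork()
kb.add("car", "is_a", "vehicle")
kb.add("car", "has_part", "wheel")
kb.add("car", "has_part", "engine")
kb.add("bicycle", "has_part", "wheel")

print(sorted(kb.related("car", "has_part")))
print(sorted(kb.sources("wheel", "has_part")))
```

Keeping an inverse index alongside the forward one lets a lookup start from either end of a link, which is one plausible way to support the kind of highly interconnected traversal the abstract describes.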