We describe a machine learning approach to the development of several key components in a question answering system and the way they were used in the UIUC QA system.
A unified learning approach is used to develop a part-of-speech tagger, a shallow parser, a named entity recognizer and a module for identifying a question’s target. These components are used in analyzing questions, as well as in analyzing selected passages that may contain the sought-after answer.
The performance of the learned modules appears to be very high (e.g., in the mid-90% range for identifying noun phrases in sentences), though evaluating them on a large number of passages proved to be time-consuming. Other components of the system, a passage retrieval module and an answer selection module, were put together in an ad hoc fashion and significantly affected the overall performance. We ran the system on only about 60% of the questions, answering a third of them correctly.