Coupled Hierarchical IR and Stochastic Models for Surface Information Extraction
- H. Zaragoza
- P. Gallinari
The 20th Annual Colloquium on IR Research, British Computer Society's Information Retrieval Specialist Group (IRSG)
This paper presents a hierarchical system that combines machine-learning-based Information Retrieval (IR) techniques with stochastic language modelling to extract surface information from text. At the lowest level of the hierarchy, documents and then paragraphs are successively routed with IR techniques. At the top level, a stochastic language model extracts the most relevant phrases and labels the type of information they contain. The approach is demonstrated, with preliminary results, on a subset of the MUC-6 Scenario Templates task.
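
The control flow of such a hierarchy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword-overlap score stands in for the learned IR routers, the lexicon lookup stands in for the stochastic language model, and every name, keyword, threshold, and label below (`keyword_score`, `route`, `label_phrases`, `extract`, `KEYWORDS`) is a hypothetical chosen for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical cue words; a real router would be learned from training data.
KEYWORDS = {"appointed", "resigned", "chairman", "ceo"}


def keyword_score(text: str) -> float:
    """Fraction of tokens that are cue words (stand-in for a learned IR score)."""
    tokens = [t.strip(".,;:").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in KEYWORDS for t in tokens) / len(tokens)


def route(units: List[str], score: Callable[[str], float], threshold: float) -> List[str]:
    """Keep only the text units whose score reaches the threshold."""
    return [u for u in units if score(u) >= threshold]


def label_phrases(paragraph: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Placeholder for the top level: a plain lexicon lookup, NOT a stochastic model."""
    labelled = []
    for token in paragraph.split():
        word = token.strip(".,;:")
        label = lexicon.get(word.lower())
        if label is not None:
            labelled.append((word, label))
    return labelled


def extract(
    documents: List[str],
    lexicon: Dict[str, str],
    doc_threshold: float = 0.05,  # assumed value
    par_threshold: float = 0.1,   # assumed value
) -> List[Tuple[str, str]]:
    """Route documents, then paragraphs, then label what survives."""
    results: List[Tuple[str, str]] = []
    for doc in route(documents, keyword_score, doc_threshold):
        paragraphs = [p for p in doc.split("\n\n") if p.strip()]
        for par in route(paragraphs, keyword_score, par_threshold):
            results.extend(label_phrases(par, lexicon))
    return results
```

The point of the layering is that each level only sees what the cheaper level below it has let through, so the most expensive component (here the placeholder labeller) runs on a small fraction of the input text.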