The Potential and Limitations of Sentence Extraction for Summarization

In this paper we present an empirical study of the potential and limitations of sentence extraction in text summarization. Our results show that the single-document generic summarization task as defined in DUC 2001 needs to be carefully refocused, as reflected in the low inter-human agreement for 100-word summaries (a score of 0.40) and the high upper bound for full-text summaries (0.88). For 100-word summaries, the performance upper bound, 0.65, is achieved by oracle extracts. Such oracle extracts show the promise of sentence extraction algorithms; however, we first need to raise inter-human agreement to be able to achieve this performance level. We show that compression is a promising direction and that the compression ratio of summaries affects average human and system performance.