Feature Subsumption for Opinion Analysis

  • Ellen Riloff
  • Siddharth Patwardhan
  • Janyce Wiebe

Proceedings of EMNLP-06, the Conference on Empirical Methods in Natural Language Processing

Published by Association for Computational Linguistics

Lexical features are key to many approaches to sentiment analysis and opinion detection. A variety of representations have been used, including single words, multi-word Ngrams, phrases, and lexicosyntactic patterns. In this paper, we use a subsumption hierarchy to formally define different types of lexical features and their relationship to one another, both in terms of representational coverage and performance. We use the subsumption hierarchy in two ways: (1) as an analytic tool to automatically identify complex features that outperform simpler features, and (2) to reduce a feature set by removing unnecessary features. We show that reducing the feature set improves performance on three opinion classification tasks, especially when combined with traditional feature selection.
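To make the feature-reduction idea concrete, here is a minimal sketch. The abstract does not spell out the formal definitions, so everything below is an illustrative assumption rather than the authors' procedure: each feature is represented by the set of instance IDs it matches, a more general feature is taken to subsume a more specific one when the specific feature's matches are a strict subset of its own, and a per-feature quality score (for example, information gain) is supplied by the caller. The names `prune_subsumed`, `matches`, `score`, and `delta` are hypothetical.

```python
def prune_subsumed(matches, score, delta=0.0):
    """Drop features that a more general feature covers about as well.

    matches: dict mapping feature name -> set of instance IDs it matches
    score:   dict mapping feature name -> quality score (assumed given)
    delta:   tolerance; a specific feature survives only if it beats every
             subsuming feature's score by more than delta (an assumption)
    """
    kept = []
    for specific in matches:
        subsumed = any(
            general != specific
            # strict subset: `general` matches everything `specific` does, and more
            and matches[specific] < matches[general]
            # the general feature scores at least as well, within the tolerance
            and score[general] >= score[specific] - delta
            for general in matches
        )
        if not subsumed:
            kept.append(specific)
    return kept


# Toy data, invented for illustration only.
matches = {
    "good": {1, 2, 3, 4},
    "good movie": {1, 2},
    "not good": {3, 4},
}
score = {"good": 0.30, "good movie": 0.25, "not good": 0.60}

print(prune_subsumed(matches, score, delta=0.05))
```

In this toy example, "good movie" is dropped because the unigram "good" covers it and scores about as well, while "not good" is kept because it clearly outscores the simpler feature. This mirrors the two uses named above: surfacing complex features that outperform simpler ones, and discarding those that add nothing.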