WikiTrends: Unstructured Wikipedia-Based Text Analytics Framework
- Michel Naim Gerguis
- Cherif R. Salama
- M. El-Kharashi
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp. 45-57
WikiTrends is a new analytics framework for Wikipedia articles. It adds temporal and spatial dimensions to Wikipedia and visualizes the extracted information, turning the large, static encyclopedia into a vibrant one: users can generate aggregated views, as timelines or heat maps, for any user-defined collection built from unstructured text. Data mining techniques were applied to detect the location, start and end years of existence, gender, and entity class for 4.85 million pages. We evaluated our extractors on a small, manually tagged random set of articles. Sample WikiTrends heat maps show the counts of notable football players over history or the dominant occupations in a specific era, while timelines can easily illustrate interesting fame battles over history between male and female actors, between music genres, or among American, Italian, and Indian films. Through information visualization and simple configuration, WikiTrends offers a new experience: answering questions with a figure.
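As an illustration of the kind of aggregated timeline view described above, the following minimal sketch counts, per time bucket and gender, the entities whose extracted years of existence overlap that bucket. It is not the authors' implementation: the `Entity` record, the `timeline` function, the bucket size, and the sample data are all hypothetical, chosen only to mirror the attributes the abstract lists (start year, end year, gender, entity class).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record holding attributes extracted for one article."""
    title: str
    start_year: int
    end_year: int
    gender: str
    entity_class: str


def timeline(entities, first_year, last_year, step=10):
    """Count entities alive in each bucket, keyed by (bucket_start, gender)."""
    counts = Counter()
    for bucket_start in range(first_year, last_year + 1, step):
        bucket_end = bucket_start + step - 1
        for entity in entities:
            # An entity belongs to a bucket if its lifespan overlaps the bucket.
            if entity.start_year <= bucket_end and entity.end_year >= bucket_start:
                counts[(bucket_start, entity.gender)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    Entity("A", 1900, 1950, "female", "actor"),
    Entity("B", 1920, 1980, "male", "actor"),
    Entity("C", 1955, 2000, "male", "actor"),
]

counts = timeline(sample, 1900, 1999, step=50)
```

Plotting `counts` per gender against the bucket start years would give a simple timeline comparing male and female actors; a heat map would group by an extracted location instead of by gender.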