Abstract

We present a methodology for automatically extracting and summarizing reports of significant local events from large-scale Twitter feeds. While previous work has relied on analysis of tweet text to identify local events, we show how to reliably detect events using only time series analysis of geotagged tweet volumes from localized regions. The algorithm sweeps through different spatial and temporal resolutions and identifies events as anomalous spikes in the rate of geotagged tweets. We applied the approach to a corpus of over 733 million geotagged tweets. A panel of 103 crowdsourced judges tagged 2400 detected events, yielding a local event detection precision of 70%. Using these judged events as ground truth, a decision tree classifier raised the detection precision to 93%.
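
As a rough illustration of the idea of finding events as anomalous spikes in a tweet-count time series, the following minimal sketch flags bins whose count lies far above a trailing-window baseline and repeats the check at several temporal resolutions. It is not the method described in this work: the z-score test, the function names `find_spikes` and `sweep`, and the parameters `window`, `threshold`, and `bin_sizes` are all assumptions chosen for illustration, and the spatial sweep is omitted.

```python
from statistics import mean, stdev


def find_spikes(counts, window=24, threshold=3.0):
    """Return indices whose count exceeds the trailing-window mean
    by more than `threshold` sample standard deviations.

    Assumption: a simple z-score test stands in for the anomaly
    criterion, which the abstract does not specify.
    """
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a perfectly flat history (zero deviation).
        if sigma == 0:
            sigma = 1.0
        if (counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


def sweep(counts, bin_sizes=(1, 2, 4), window=12, threshold=3.0):
    """Re-bin the series at several temporal resolutions (hypothetical
    bin sizes) and collect the spike indices found at each one."""
    results = {}
    for size in bin_sizes:
        # Sum consecutive groups of `size` bins; drop a trailing partial group.
        binned = [
            sum(counts[j:j + size])
            for j in range(0, len(counts) - size + 1, size)
        ]
        results[size] = find_spikes(binned, window, threshold)
    return results
```

Here `counts` would be the number of geotagged tweets per time bin within one spatial region. Indices returned by `sweep` refer to the re-binned series at each resolution, so an index at bin size 2 covers two of the original bins.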