Identifying Similar Past Events in a Continuous Monitoring System
- Magdalena Balazinska | Dept. of Computer Science and Engineering, University of Washington
Stream processing engines (SPEs) are a new type of data management system that provides continuous, low-latency processing of data streams. These tools are useful in many application domains, including computer system monitoring and network monitoring. SPEs focus on processing live data as it arrives in the system. They provide limited or no support for combining live data with historical data. In most monitoring applications, however, when an abnormal event occurs, an administrator must manually examine the current state and the history of the system to understand what happened and diagnose the problem. To facilitate this process, we propose a new technique that automatically compares events on streams and identifies past events similar to newly detected ones. With our technique, an administrator is shown not only an alert but also past alerts that resemble the current situation. At the heart of our technique is a new similarity measure geared specifically toward the continuous monitoring domain. In this domain, interesting events typically correspond to abnormal situations. Two events are thus most alike when the same monitored objects record similar abnormal values. We show that existing similarity measures do not work well in this environment, and we develop a new measure, the Context Distance Measure (CDM), tailored to the monitoring domain. Through experiments with a real dataset from the PlanetLab overlay network, we show that CDM outperforms existing techniques by producing more accurate rankings of similar past events.
Speaker Details
Magdalena Balazinska is an assistant professor in the Computer Science and Engineering Department at the University of Washington. She received a PhD from MIT in February 2006 and was selected as one of five Microsoft New Faculty Fellows in 2007. Magdalena’s research interests are broadly in the fields of databases and distributed systems. She is currently working on Moirae, a system that integrates historical information into continuous monitoring engines, and the RFID Ecosystem, a system for managing RFID data, detecting probabilistic events from that data, and studying building-scale RFID deployments.
Jeff Running