WebView: Scalable Information Monitoring for Data-Intensive Web Applications
- Navendu Jain
- Mike Dahlin
- Yin Zhang
WWW2008, Beijing, China
We present WebView, a scalable information monitoring service for data-intensive Web applications that continuously monitors local application state, aggregates local state into a global view, and uses the global view to help ensure high performance and high availability for these applications. We demonstrate the effectiveness of WebView by building three key Web applications: (a) a data prefetching service for content distribution, (b) a heavy hitter monitoring service for detecting anomalies such as flash crowds and denial-of-service attacks, and (c) a monitoring service for large-scale systems hosting Web services. To provide a global system view in real time, a monitoring service faces two key challenges: (1) scalability to many nodes and attributes as well as a high volume of data updates, and (2) bounding the freshness of monitoring results, i.e., the time between an event update and its notification to the application. To address these challenges, we design, implement, and evaluate WebView, which leverages Distributed Hash Tables (DHTs) to build scalable aggregation trees and exploits precision-performance tradeoffs that tolerate a small, bounded approximation error to significantly reduce the monitoring load. WebView enables applications to control this tradeoff by providing (1) arithmetic filtering, which caches recent reports and transmits new information only if it differs from the cached report by some numeric threshold (e.g., ± 10%), and (2) temporal batching, which combines multiple updates that arrive near one another in time into a single network message while bounding the freshness of monitoring query results. Our prototype implementation of WebView combines these techniques with a DHT-based aggregation hierarchy to implement a single, highly scalable monitoring system.
Evaluation of our WebView prototype for our three Web applications shows that our system provides significant application benefits and is an order of magnitude more scalable than existing approaches while still delivering fresh results with high accuracy.
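The two load-reduction techniques named in the abstract, arithmetic filtering and temporal batching, can be illustrated with a minimal sketch. This is not WebView's implementation: the class names `ArithmeticFilter` and `TemporalBatcher`, the function `simulate`, the relative 10% threshold, and the one-second window are all assumptions chosen for illustration. The sketch also flushes a batch only when a later update arrives or the stream ends, whereas a real service would presumably use a timer to enforce the freshness bound.

```python
from typing import Dict, List, Optional, Tuple


class ArithmeticFilter:
    """Hypothetical filter: forward a report only if it differs from the
    cached report by more than a relative threshold (e.g., 10%)."""

    def __init__(self, threshold: float = 0.10) -> None:
        self.threshold = threshold
        self.cached: Dict[str, float] = {}

    def should_send(self, attr: str, value: float) -> bool:
        last = self.cached.get(attr)
        if last is None or abs(value - last) > self.threshold * abs(last):
            self.cached[attr] = value  # cache only what was actually sent
            return True
        return False


class TemporalBatcher:
    """Hypothetical batcher: updates arriving within `window` seconds of the
    first pending update are combined into one message."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.pending: Dict[str, float] = {}
        self.window_start: Optional[float] = None

    def add(self, now: float, attr: str, value: float) -> Optional[Dict[str, float]]:
        flushed = None
        # Assumption: arrival-driven flush; a timer would be used in practice.
        if self.window_start is not None and now - self.window_start >= self.window:
            flushed = self.flush()
        if self.window_start is None:
            self.window_start = now
        self.pending[attr] = value  # a newer value for an attribute overwrites the older one
        return flushed

    def flush(self) -> Dict[str, float]:
        batch, self.pending = self.pending, {}
        self.window_start = None
        return batch


def simulate(updates: List[Tuple[float, str, float]],
             threshold: float = 0.10,
             window: float = 1.0) -> List[Dict[str, float]]:
    """Run time-ordered (timestamp, attribute, value) updates through both
    stages and return the messages that would be sent up the tree."""
    filt = ArithmeticFilter(threshold)
    batcher = TemporalBatcher(window)
    messages: List[Dict[str, float]] = []
    for ts, attr, value in updates:
        if not filt.should_send(attr, value):
            continue  # within the error bound: suppress
        flushed = batcher.add(ts, attr, value)
        if flushed:
            messages.append(flushed)
    tail = batcher.flush()
    if tail:
        messages.append(tail)
    return messages


updates = [
    (0.0, "load", 100), (0.2, "load", 105), (0.4, "load", 120),
    (0.5, "conns", 40), (1.6, "load", 121), (1.7, "load", 140),
]
messages = simulate(updates)
print(len(updates), "updates ->", len(messages), "messages:", messages)
```

In this toy trace, six local updates collapse into two messages. The filter drops the two changes that stay within 10% of the cached value, and the batcher merges the remaining updates that fall inside the same window. Widening either knob reduces load further at the cost of precision (threshold) or freshness (window), which is the tradeoff the abstract describes.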