{"id":182820,"date":"2007-10-01T00:00:00","date_gmt":"2009-10-31T10:03:49","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/anomaly-detection-in-large-networks-using-approximation-techniques\/"},"modified":"2016-09-09T09:52:33","modified_gmt":"2016-09-09T16:52:33","slug":"anomaly-detection-in-large-networks-using-approximation-techniques","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/anomaly-detection-in-large-networks-using-approximation-techniques\/","title":{"rendered":"Anomaly Detection in Large Networks using Approximation Techniques"},"content":{"rendered":"<div class=\"asset-content\">\n<p>A tremendous enthusiasm for amassing enormous amounts of network measurement data has spurred the development of numerous applications that incorporate data mining techniques. In this talk we question the hidden assumption in these applications that one needs to collect &#8220;all the data all the time&#8221;. We consider this question in the context of an anomaly detection application. We study the popular &#8220;Subspace method detector&#8221;, which is based on PCA. This method normally collects data from many parts of the network, centralizes the data, and then analyzes it to uncover anomalies. In our research, we ask whether we can throw away some of the data. Can we still do anomaly detection accurately without all the data?<\/p>\n<p>To avoid backhauling large amounts of data through networks, we present a framework that couples filtering at local monitors with centralized detectors that can operate on approximate views of the global data (i.e., network state). We show that the errors made by the central detector &#8211; due to the use of approximate data &#8211; can be upper bounded using matrix perturbation theory. The challenge is to design the filtering parameters; these are determined by the bound on detection errors and the criteria being tracked for detection. 
Our approximate anomaly detector can detect anomalies with 80 to 90% less data than the original method, and incurs less than a 1% reduction in detection accuracy. Finally, we comment on issues and future directions for data reduction in the context of anomaly detection.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A tremendous enthusiasm for amassing enormous amounts of network measurement data has spurred the development of numerous applications that incorporate data mining techniques. In this talk we question the hidden assumption in these applications that one needs to collect &#8220;all the data all the time&#8221;. We consider this question in the context of an anomaly [&hellip;]<\/p>\n","protected":false},"featured_media":194762,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr_hide_image_in_river":0,"footnotes":""},"research-area":[],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-session-type":[],"msr-impact-theme":[],"msr-pillar":[],"msr-episode":[],"msr-research-theme":[],"class_list":["post-182820","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/youtu.be\/Me4ebFEcWQk","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/182820","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":0,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/182820\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us
\/research\/wp-json\/wp\/v2\/media\/194762"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=182820"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=182820"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=182820"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=182820"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=182820"},{"taxonomy":"msr-session-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-session-type?post=182820"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=182820"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=182820"},{"taxonomy":"msr-episode","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-episode?post=182820"},{"taxonomy":"msr-research-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-theme?post=182820"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}