{"id":164904,"date":"2012-09-04T00:00:00","date_gmt":"2012-09-04T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/scalable-telemetry-classification-for-automated-malware-detection\/"},"modified":"2018-10-16T21:12:34","modified_gmt":"2018-10-17T04:12:34","slug":"scalable-telemetry-classification-for-automated-malware-detection","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/scalable-telemetry-classification-for-automated-malware-detection\/","title":{"rendered":"Scalable Telemetry Classification for Automated Malware Detection"},"content":{"rendered":"<div class=\"asset-content\">\n<p>Industry reports and blogs have estimated the amount of malware based on it known malicious files. This paper extends this analysis to the amount of unknown malware. The study is based on 26.7 million files referenced in telemetry reports from 50 million computers running commercial anti-malware (AM) products. To estimate the undetected malware, a classifier predicts the underlying nature of unknown files recorded in the telemetry reports. The telemetry classifier predicts that 69.6% (4.27 million) of the unknown files are malicious. Assuming the unknown files predicted to be malicious by the classifier are malware, the telemetry classifier also allows us to estimate the efficacy of the AM system indicating that signatures detected 82.8% (20.6 million) of the malicious files. We have validated our system by conducting a longitudinal study to measure the false positive and false negative rates over a period of thirteen months.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Industry reports and blogs have estimated the amount of malware based on it known malicious files. This paper extends this analysis to the amount of unknown malware. The study is based on 26.7 million files referenced in telemetry reports from 50 million computers running commercial anti-malware (AM) products. To estimate the undetected malware, a classifier [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"Springer","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"Proceedings of European Symposium on Research in Computer Security (ESORICS2012)","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"Proceedings of European Symposium on Research in Computer Security (ESORICS2012)","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"Jack W. Stokes, Joe Faulhaber, Mady Marinescu, Anil Thomas, Marius Gheorghescu","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2012-09-10","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":2012,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13558],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-164904","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-security-privacy-cryptography","msr-locale-en_us"],"msr_publishername":"Springer","msr_edition":"Proceedings of European Symposium on Research in Computer Security (ESORICS2012)","msr_affiliation":"","msr_published_date":"2012-09-10","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"282902","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"Scalable Telemetry Classification for Automated Malware Detection","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2012\/09\/stokesScalableTelemetry.pdf","id":282902,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"edited_text","value":"Jack W. Stokes","user_id":32427,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Jack W. Stokes"},{"type":"user_nicename","value":"jplatt","user_id":32416,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=jplatt"},{"type":"user_nicename","value":"helenw","user_id":31994,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=helenw"},{"type":"text","value":"Joe Faulhaber","user_id":0,"rest_url":false},{"type":"text","value":"Mady Marinescu","user_id":0,"rest_url":false},{"type":"text","value":"Anil Thomas","user_id":0,"rest_url":false},{"type":"text","value":"Marius Gheorghescu","user_id":0,"rest_url":false}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[385718],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":385718,"post_title":"Machine Learning for Security","post_name":"machine-learning-for-security-2","post_type":"msr-project","post_date":"2017-05-22 10:56:08","post_modified":"2019-03-19 16:37:41","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/machine-learning-for-security-2\/","post_excerpt":"In this project, we use machine learning to improve computer security. Primarily, this project involves developing new machine learning methods to detect malware, but has involved other, non-published work on detecting DDOS and cloud-based threats. A number of non-published efforts related to this project\u00a0have been helping protect Windows users for years.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/385718"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/164904","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/164904\/revisions"}],"predecessor-version":[{"id":533833,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/164904\/revisions\/533833"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=164904"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=164904"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=164904"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=164904"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=164904"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=164904"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=164904"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=164904"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=164904"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=164904"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=164904"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=164904"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=164904"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}