Predictive technologies are already effective at detecting and blocking malware at first sight. A new malware prediction competition on Kaggle will challenge the data science community to push these technologies even further—to stop malware before it is even seen.
The Microsoft-sponsored competition calls for participants to predict if a device is likely to encounter malware given the current machine state. Participants will build models using 9.4GB of anonymized data from 16.8M devices, and the resulting models will be scored by their ability to make correct predictions. Winning teams get $25,000 in total prizes.
The competition provides academics and researchers with varied backgrounds a fresh opportunity to work on a real-world problem using a fresh set of data from Microsoft. Results from the contest will help us identify opportunities to further improve Microsoft’s layered defenses, focusing on preventative protection. Not all machines are equally likely to get malware; competitors will help build models for identifying devices that have a higher risk of getting malware so that preemptive action can be taken.
Cybersecurity is the central challenge of our digital age. Today, Windows Defender Advanced Threat Protection (Windows Defender ATP) uses intelligent systems to protect millions of devices against cyberattacks every day. Machine learning and artificial intelligence drive cloud-delivered protections that catch and predict new and emerging threats.
We also believe in the power of working with the broader research community to stay ahead of threats. Microsoft’s 2015 malware classification competition on Kaggle was a huge success, with the dataset provided by Microsoft cited in more than 50 research papers in multiple languages. To this day, the 0.5TB dataset from that competition is still used for research and continues to produce value for Microsoft and the data science community. This new competition is organized by the Windows Defender ATP Research team, in cooperation with Northeastern University and Georgia Institute of Technology as academic partners, with the goal of bringing new ideas to the fight against malware attacks and breaches.
Kaggle is a platform for data scientists to create data science projects, download datasets, and participate in contests. Microsoft is happy to use the Kaggle platform to engage a rich community of amazing thinkers. We think this collaboration will result in better protection for Microsoft customers and the Internet at large. Stay tuned for the results, we can’t wait to see what the data science community comes up with!
Chase Thomas and Robert McCann
Windows Defender Research team