Skip to main content
Microsoft Security

Microsoft researchers work with Intel Labs to explore new deep learning approaches for malware classification

The opportunities for innovative approaches to threat detection through deep learning, a category of algorithms within the larger framework of machine learning, are vast. Microsoft Threat Protection today uses multiple deep learning-based classifiers that detect advanced threats, for example, evasive malicious PowerShell.

In continued exploration of novel detection techniques, researchers from Microsoft Threat Protection Intelligence Team and Intel Labs are collaborating to study new applications of deep learning for malware classification, specifically:

For the first part of the collaboration, the researchers built on Intel’s prior work on deep transfer learning for static malware classification and used a real-world dataset from Microsoft to ascertain the practical value of approaching the malware classification problem as a computer vision task. The basis for this study is the observation that if malware binaries are plotted as grayscale images, the textural and structural patterns can be used to effectively classify binaries as either benign or malicious, as well as cluster malicious binaries into respective threat families.

The researchers used an approach that they called static malware-as-image network analysis (STAMINA). Using the dataset from Microsoft, the study showed that the STAMINA approach achieves high accuracy in detecting malware with low false positives.

The results and further technical details of the research are listed in the paper STAMINA: Scalable deep learning approach for malware classification and set the stage for further collaborative exploration.

The role of static analysis in deep learning-based malware classification

While static analysis is typically associated with traditional detection methods, it remains to be an important building block for AI-driven detection of malware. It is especially useful for pre-execution detection engines: static analysis disassembles code without having to run applications or monitor runtime behavior.

Static analysis produces metadata about a file. Machine learning classifiers on the client and in the cloud then analyze the metadata and determine whether a file is malicious. Through static analysis, most threats are caught before they can even run.

For more complex threats, dynamic analysis and behavior analysis build on static analysis to provide more features and build more comprehensive detection. Finding ways to perform static analysis at scale and with high effectiveness benefits overall malware detection methodologies.

To this end, the research borrowed knowledge from  computer vision domain to build an enhanced static malware detection framework that leverages deep transfer learning to train directly on portable executable (PE) binaries represented as images.

Analyzing malware represented as image

To establish the practicality of the STAMINA approach, which posits that malware can be classified at scale by performing static analysis on malware codes represented as images, the study covered three main steps: image conversion, transfer learning, and evaluation.

Diagram showing the steps for the STAMINA approach: pre-processing, transfer learning, and evaluation

First, the researchers prepared the binaries by converting them into two-dimensional images. This step involved pixel conversion, reshaping, and resizing. The binaries were converted into a one-dimensional pixel stream by assigning each byte a value between 0 and 255, corresponding to pixel intensity. Each pixel stream was then transformed into a two-dimensional image by using the file size to determine the width and height of the image.

The second step was to use transfer learning, a technique for overcoming the isolated learning paradigm and utilizing knowledge acquired for one task to solve related ones. Transfer learning has enjoyed tremendous success within several different computer vision applications. It accelerates training time by bypassing the need to search for optimized hyperparameters and different architectures—all this while maintaining high classification performance. For this study, the researchers used Inception-v1 as the base model.

The study was performed on a dataset of 2.2 million PE file hashes provided by Microsoft. This dataset was temporally split into 60:20:20 segments for training, validation, and test sets, respectively.

Diagram showing a DNN with pre-trained weights on natural images, and the last portion fine-tuned with new data

Finally, the performance of the system was measured and reported on the holdout test set. The metrics captured include recall at specific false positive range, along with accuracy, F1 score, and area under the receiver operating curve (ROC).

Findings

The joint research showed that applying STAMINA to real-world hold-out test data set achieved a recall of 87.05% at 0.1% false positive rate, and 99.66% recall and 99.07% accuracy at 2.58% false positive rate overall. The results certainly encourage the use of deep transfer learning for the purpose of malware classification. It helps accelerate training by bypassing the search for optimal hyperparameters and architecture searches, saving time and compute resources in the process.

The study also highlights the pros and cons of sample-based methods like STAMINA and metadata-based classification methods. For example, STAMINA can go in-depth into samples and extract additional signals that might not be captured in the metadata.  However, for bigger size applications, STAMINA becomes less effective due to limitations in converting billions of pixels into JPEG images and then resizing them. In such cases, metadata-based methods show advantages over our research.

Conclusion and future work

The use of deep learning methods for detecting threats drives a lot of innovation across Microsoft. The collaboration with Intel Labs researchers is just one of the ways in which Microsoft researchers and data scientists continue to explore novel ways to improve security overall.

This joint research is a good starting ground for more collaborative work. For example, the researchers plan to collaborate further on platform acceleration optimizations that can allow deep learning models to be deployed on client machines with minimal performance impact. Stay tuned.

 

Jugal Parikh, Marc Marino

Microsoft Threat Protection Intelligence Team