Cybercrime is as much a people problem as it is a technology problem. To respond effectively, the defender community must harness machine learning to compliment the strengths of people. This is the philosophy that undergirds Azure Sentinel. Azure Sentinel is a cloud-native SIEM that exploits machine learning techniques to empower security analysts, data scientists, and engineers to focus on the threats that matter. You may have heard of similar solutions from other vendors, but the Fusion technology that powers Azure Sentinel sets this SIEM apart for three reasons:

  1. Fusion finds threats that fly under the radar, by combining low fidelity, “yellow” anomalous activities into high fidelity “red” incidents.
  2. Fusion does this by using machine learning to combine disparate data—network, identity, SaaS, endpoint—from both Microsoft and Partner data sources.
  3. Fusion incorporates graph-based machine learning and a probabilistic kill chain to reduce alert fatigue by 90 percent.

Azure Sentinel

Intelligent security analytics for your entire enterprise.

Learn more

You can get a sense of how powerful Fusion is by looking at data from December 2019. During that month, billions of events flowed into Azure Sentinel from thousands of Azure Sentinel customers. Nearly 50 billion anomalous alerts were identified and graphed. After Fusion applied the probabilistic kill chain, the graph was reduced to 110 sub graphs. A second level of machine learning reduced it further to just 25 actionable incidents. This is how Azure Sentinel reduces alert fatigue by 90 percent.

Infographic showing alerts to high-fidelity incidents.

New Fusion scenarios—Microsoft Defender ATP + Palo Alto firewalls

There are currently 35 multi-stage attack scenarios generally available through Fusion machine learning technology in Azure Sentinel. Today, Microsoft has introduced several additional scenarios—in public preview—using Microsoft Defender Advanced Threat Protection (ATP) and Palo Alto logs. This way, you can leverage the power of Sentinel and Microsoft Threat Protection as complementary technologies for the best customer protection.

  • Detect otherwise missed attacks—By stitching together disparate datasets using Bayesian methods, Fusion helps to detect attacks that could have been missed.
  • Reduce mean time to remediate—Microsoft Threat Protection provides a best in class investigation experience when addressing alerts from Microsoft products. For non-Microsoft datasets, you can leverage hunting and investigation tools in Azure Sentinel.

Here are a few examples:

An endpoint connects to TOR network followed by suspicious activity on the Internal network—Microsoft Defender ATP detects that a user inside the network made a request to a TOR anonymization service. On its own this incident would be a low-level fidelity. It’s suspicious but doesn’t rise to the level of a high-level threat. Palo Alto firewalls registers anomalous activity from the same IP address, but it isn’t risky enough to block. Separately neither of these alerts get elevated, but together they indicate a multi-stage attack. Fusion makes the connection and promotes it to a high-fidelity incident.

Infographic of the Palo Alto firewall detecting threats.

A PowerShell program on an endpoint connects to a suspicious IP address, followed by suspicious activity on the Internal network—Microsoft Defender ATP generates an alert when a PowerShell program makes a suspicious network connection. If Palo Alto allows traffic from that IP address back into the network, Fusion ties the two incidents together to create a high-fidelity incident

An endpoint connects to a suspicious IP followed by anomalous activity on the Internal network—If Microsoft Defender ATP detects an outbound connection to an IP with a history of unauthorized access and Palo Alto firewalls allows an inbound request from that same IP address, it’s elevated by Fusion.

How Fusion works

  1. Construct graph

The process starts by collecting data from several data sources, such as Microsoft products, Microsoft security partner products, and other cloud providers. Each of those security products output anomalous activity, which together can number in the billions or trillions. Fusion gathers all the low and medium level alerts detected in a 30-day window and creates a graph. The graph is hyperconnected and consists of billions of vertices and edges. Each entity is represented by a vertex (or node). For example, a vertex could be a user, an IP address, a virtual machine (VM), or any other entity within the network. The edges (or links) represent all the activities. If a user accesses company resources with a mobile device, both the device and the user are represented as vertices connected by an edge.

Image of an AAD Detect graph.

Once the graph is built there are still billions of alerts—far too many for any security operations team to make sense of. However, within those connected alerts there may be a pattern that indicates something more serious. The human brain is just not equipped to quickly remove it. This is where machine learning can make a real difference.

  1. Apply probabilistic kill chain

Fusion applies a probabilistic kill chain which acts as a regularizer to the graph. The statistical analysis is based on how real people—Microsoft security experts, vendors, and customers—triage alerts. For example, defenders prioritize kill chains that are time bound. If a kill chain is executed within a day, it will take precedence over one that is enacted over a few days. An even higher priority kill chain is one in which all steps have been completed. This intelligence is encoded into the Fusion machine learning statistical model. Once the probabilistic kill chain is applied, Fusion outputs a smaller number of sub graphs, reducing the number of threats from billions to hundreds.

  1. Score the attack

To reduce the noise further, Fusion uses machine learning to apply a final round of scoring. If labeled data exists, Fusion uses random forests. Labeled data for attacks is generated from the extensive Azure red team that execute these scenarios. If labeled data doesn’t exist Fusion uses spectral clustering.

Some of the criteria used to elevate threats include the number of high impact activity in the graph and whether the subgraph connects to another subgraph.

The output of this machine learning process is tens of threats. These are extremely high priority alerts that require immediate action. Without Fusion, these alerts would likely remain hidden from view, since they can only be seen after two or more low level threats are stitched together to shine a light on stealth activities. AI-generated alerts can now be handed off to people who will determine how to respond.

The great promise of AI in cybersecurity is its ability to enable your cybersecurity people to stay one step ahead of the humans on the other side. AI-backed Fusion is just one example of the innovative potential of partnering technology and people to take on the threats of today and tomorrow.

Learn more

Read more about Azure Sentinel and dig into all the Azure Sentinel detection scenarios.

Also, bookmark the Security blog to keep up with our expert coverage on security matters. Follow us at @MSFTSecurity for the latest news and updates on cybersecurity.