Microsoft Digital recently implemented Azure Sentinel to replace a preexisting, on-premises solution for security information and event management (SIEM). Azure Sentinel supplies cloud-scale SIEM functionality that enables ingestion of, and response to, more than 20 billion cybersecurity events per day.

Microsoft Digital works diligently 24 hours a day, 7 days a week to help protect Microsoft IP, its employees, and its overall business health from security threats. It recently implemented Azure Sentinel to replace a preexisting, on-premises solution for security information and event management (SIEM). With Azure Sentinel, Microsoft Digital can ingest and appropriately respond to more than 20 billion cybersecurity events per day. Azure Sentinel supplies cloud-scale SIEM functionality that allows integration with crucial systems, provides accurate and timely response to security threats, and supports the SIEM requirements of Microsoft Digital.

Understanding SIEM at Microsoft

Microsoft Digital is responsible for maintaining security and compliance standards across Microsoft. Managing the massive volume of incoming security-related data is critical to Microsoft’s business health. Historically, Microsoft Digital has performed SIEM using a third-party tool hosted on-premises in Microsoft datacenters. However, Microsoft Digital recognized several areas in which it could improve its service by implementing a next-generation SIEM tool. Challenges with the old tool included:

  • Limited ability to accommodate increasing incoming traffic. Ingesting data into the previous SIEM tool was time consuming due to limited ingestion processes. As the number of incoming cybersecurity events continued to grow, it became more evident that the solution we were using wouldn’t be able to maintain the necessary throughput for data ingestion.
  • On-premises scalability and agility issues. The previous solution’s on-premises nature limited our ability to scale effectively and respond to changing business and security requirements at the speed that we required.
  • Increased training requirements. We needed to invest more resources in training and onboarding with the previous solution, because it was on-premises and customized to meet our requirements. Employees recruited from outside Microsoft needed to learn the solution, including its complex on-premises architecture, from the ground up.

As part of our ongoing digital transformation, Microsoft Digital is moving to cloud-based solutions with proven track records and active, customer-facing development and involvement. We need our technology stack to evolve at the speed of our business.

Modernizing SIEM with Azure Sentinel

In response to these challenges, we began assessing options for a new SIEM environment that would address them and position Microsoft Digital to manage the continued growth of the cybersecurity landscape.

Feature assessment and planning

In partnership with the Azure Sentinel product team, Microsoft Digital’s security division assessed whether Sentinel would be a suitable replacement for our previous solution. Sentinel is a Microsoft-developed, cloud-native enterprise SIEM solution that uses the cloud’s agility and scalability to ensure rapid threat detection and response through:

  • Elastic scaling.
  • AI-infused detection capability.
  • A broad set of out-of-the-box data connectivity and ingestion solutions.

To move to Azure Sentinel, we needed to verify that equivalent features and capabilities were available in the new environment. We aligned security teams across Microsoft to ensure that we met all requirements. Some of these teams had mature monitoring and detection definitions in place, and we needed to understand those scenarios to accommodate feature-performance requirements. The issues that our previous solution presented (throughput, agility, and usability) focused our assessment of whether Sentinel would work.

Throughout the assessment period and into migration, Microsoft Digital worked closely with the Azure Sentinel product team to ensure that Azure Sentinel could provide the feature set Microsoft Digital required. Our engagement with the Sentinel team addressed two sets of needs simultaneously. We received significant incident-response benefits from Azure Sentinel while the product team worked with Microsoft Digital as if it were a customer. This close collaboration meant that the product team could identify what enterprise-scale customers needed more quickly. Not only were our requirements met, but we were able to provide feedback and testing for the Sentinel product team. This helped them better serve their large customers that have similar challenges, requirements, and needs.

Defining and refining SIEM detections

As we developed standards that met our new requirements, we also evaluated our previous SIEM solution’s functionality to determine how it would transition to Azure Sentinel. We examined three key aspects of incoming security data ingestion and event detection:

  • Data-source validity. We pull incoming SIEM data from hundreds of data locations across Microsoft. As time has passed, some of these data sources remained valid but others no longer provided relevant SIEM data. We assessed our entire data-source footprint to determine which data sources Azure Sentinel should ingest and which ones were no longer required. This process helped us to better understand our data-source environment and refine the amount of data ingested. There were several data sources that we weren’t ingesting with the previous solution because of performance limitations. We knew that we wanted to increase ingestion capability when moving to Azure Sentinel.
  • Detection importance. Our team examined event-detection definitions used throughout the previous SIEM solution, so we could understand how detections were being performed, which detection definitions generated alerts, and the volume of alerts from each detection. This information helped us identify the most important detection definitions, so we could prioritize these definitions in the migration process.
  • Detection validity. Our security teams evaluated the list of detections from our SIEM environment so we could identify invalid detections or detection definitions that required refinement. This helped us create a more streamlined set of detections when moving into Azure Sentinel, including combining multiple detection definitions and removing several detections.
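As an illustration of the detection-importance review above, the following minimal Python sketch ranks detection definitions by how many alerts each one generated and lists the definitions that never fired as candidates for a validity review. It is not the process Microsoft Digital used: the function name, the record shape, and the choice to rank by alert volume alone are assumptions made for the example.

```python
from collections import Counter


def prioritize_detections(detection_names, alerts):
    """Rank detection definitions by alert volume (hypothetical helper).

    detection_names: every detection definition in the old SIEM.
    alerts: dicts with a "detection" key naming the definition that fired.
    Returns (ranked, silent), where ranked is a list of (name, alert_count)
    sorted by descending count and silent lists definitions with no alerts.
    """
    counts = Counter(alert["detection"] for alert in alerts)
    # Sort by descending alert count, then by name for a stable order.
    ranked_names = sorted(detection_names, key=lambda name: (-counts[name], name))
    ranked = [(name, counts[name]) for name in ranked_names]
    silent = [name for name in ranked_names if counts[name] == 0]
    return ranked, silent


# Hypothetical detection names and alert records, for illustration only.
ranked, silent = prioritize_detections(
    ["impossible_travel", "failed_logins", "legacy_rule"],
    [
        {"detection": "failed_logins"},
        {"detection": "failed_logins"},
        {"detection": "impossible_travel"},
    ],
)
print(ranked)  # highest-volume definitions first
print(silent)  # definitions that generated no alerts
```

In practice, alert volume is only one input. A low-volume detection can still be critical, so a ranking like this is a starting point for review rather than a decision.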

Throughout this process, we worked with the Security Operations team to evaluate detections end-to-end. They got involved in the detection and data-source refinement process and were exposed to how these detections and data sources would work in Azure Sentinel.

Implementation

After feature parity and throughput capabilities were confirmed, we began the migration process from our previous solution to Azure Sentinel. Based on our initial testing, we added several implementation steps to ensure that our Azure Sentinel environment would readily meet our security environment’s needs.

Onboarding data sources

Properly onboarding data sources was a critical component of our implementation and one of the biggest benefits of the Azure Sentinel environment. With the large number of built-in connectors available in Sentinel, we were able to connect to most of our data sources without further customization. This included cloud data sources such as Azure Active Directory, Azure Security Center, and Microsoft Defender. It also included on-premises data sources, such as Windows Events and firewall systems.

We also connected to several enrichment sources that supplied more information for threat-hunting queries and detections. These enrichment sources included data from human-resources systems and other nontypical data sources. We used playbooks to create many of these connections.

We keep Azure Sentinel data in hot storage for 90 days, using Kusto Query Language (KQL) queries for detections, hunting, and investigation. We also use Azure Data Explorer for warm storage and Azure Data Lake for cold storage and retrieval for up to two years.

Refining detections

Based on testing, we refined our detection definitions further in Sentinel to support better alert suppression and aggregation. We didn’t want to overwhelm our Security Operations team with incidents. Therefore, we refined our detection definitions to include suppression logic when notification wasn’t required and aggregation logic to ensure that similar and related events were grouped together and not surfaced as multiple, individual alerts.
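The suppression and aggregation ideas above can be sketched in a few lines of Python. This is a simplified illustration, not Azure Sentinel's actual rule engine or the detection logic Microsoft Digital deployed: the `Event` fields, the `build_alerts` function, and the one-hour grouping window are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    rule: str       # detection definition that matched (hypothetical field)
    entity: str     # affected host or account (hypothetical field)
    timestamp: int  # seconds since the epoch


def build_alerts(events, suppressed_rules, window_seconds=3600):
    """Apply suppression, then aggregate related events into single alerts."""
    groups = defaultdict(list)
    for event in events:
        # Suppression: skip events whose rule does not require notification.
        if event.rule in suppressed_rules:
            continue
        # Aggregation: same rule, same entity, same time window -> one group.
        window = event.timestamp // window_seconds
        groups[(event.rule, event.entity, window)].append(event)

    alerts = []
    for key in sorted(groups):
        rule, entity, _window = key
        members = groups[key]
        alerts.append({
            "rule": rule,
            "entity": entity,
            "count": len(members),
            "first_seen": min(e.timestamp for e in members),
        })
    return alerts


# Hypothetical events, for illustration only.
events = [
    Event("failed_login", "host1", 10),
    Event("failed_login", "host1", 20),
    Event("noisy_rule", "host1", 30),
    Event("failed_login", "host2", 40),
]
alerts = build_alerts(events, suppressed_rules={"noisy_rule"})
print(len(alerts))  # 2: four raw events become two alerts
```

Here four raw events produce two alerts: the suppressed rule is dropped, and the two matching events on the same entity are grouped into one alert with a count of 2.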

Increasing scale with the cloud

We used dedicated clusters for Azure Monitor Log Analytics to support the data-ingestion scalability we required. At a large enterprise scale, our previous solution was exceeding its capacity at 10 billion events per day. With dedicated clusters, we were able to accommodate that initial volume and add more data sources to improve alert detection, thereby increasing our event ingestion to almost 20 billion events per day.
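To put these daily volumes in perspective, the short Python calculation below converts them into average per-second rates. It assumes events are spread evenly across the day, which real traffic is not, so peak rates would be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day):
    """Average sustained rate implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


previous_ceiling = average_events_per_second(10_000_000_000)
current_volume = average_events_per_second(20_000_000_000)

print(round(previous_ceiling))  # 115741
print(round(current_volume))    # 231481
```

In other words, moving from 10 billion to almost 20 billion events per day means going from roughly 116,000 to roughly 231,000 events per second on average.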

Customizing functionality

Our environment required several customizations to Sentinel functionality. We implemented them by using standard Azure Sentinel features and extension capabilities, which met our needs while staying within the boundaries of standard functionality. Using common features for customization made our changes to Azure Sentinel easy to document and helped our security operations team understand and use the new features more quickly. We made several important customizations, including:

  • Integration with our IT service-management system. We integrated Azure Sentinel with our security incident management solution. This had a two-fold positive effect, as it extended Sentinel information into our case-management environment and provided our support teams with exactly the information they need, regardless of which tool they’re in.
  • Implementation of Azure Security Center playbook to support scale. We used a playbook to automate the addition of more than 800 Azure subscriptions to Azure Security Center. We’ll use this same automation soon to include approximately 20,000 additional subscriptions.
  • High-volume ingestion with Azure Event Hubs and Azure Virtual Machine Scale Sets. We built a custom solution to ingest the large volume of events from our firewall systems, which exceeded the capabilities of on-premises collection agents. With the new solution, we can ingest more than 100,000 events per second into Azure Sentinel from on-premises firewalls.

Data sources are ingested into a Log Analytics store and used to provide the portal user experience, automation, and case management.

Figure 1. Architecture for the new SIEM solution using Azure Sentinel

Benefits

We’ve experienced several important benefits from using Azure Sentinel as our SIEM tool, including:

  • Faster query performance. Our query speed with Azure Sentinel improved drastically. It’s 12 times faster than it was with the previous solution, on average, and is up to 100 times faster with some queries.
  • Simplified training and onboarding. Using a cloud-based, commercially available solution like Azure Sentinel means it’s much simpler to onboard and train employees. Our security engineers don’t need to understand the complexities of an underlying on-premises architecture. They simply start using Sentinel for security management.
  • Greater feature agility. Azure Sentinel’s feature set and capabilities iterate at a much faster rate than we could maintain with our on-premises developed solution.
  • Improved data ingestion. Azure Sentinel’s out-of-the-box connectors and integration with the Azure platform make it much easier to include data from anywhere and extend Azure Sentinel functionality to integrate with other enterprise tools. On average, it’s 18 times faster to ingest data into Azure Sentinel using a built-in data connector than it was with our previous solution.

Lessons learned

Throughout our Sentinel implementation, we reexamined and refined our approach to SIEM. At Microsoft’s scale, very few implementations go exactly as planned from beginning to end. We took away several lessons from our Sentinel implementation, including:

  • More testing enables more refinement. We tested our detections, data sources, and processes extensively. The more we tested, the better we understood how we could improve test results. This, in turn, meant more opportunities to refine our approach.
  • Customization is necessary but achievable. We capitalized on the flexibility of Azure Sentinel and the Azure platform often during our implementation. We found that while out-of-the-box features didn’t meet all our requirements, we were able to create customizations and integrations to meet the needs of our security environment.
  • Large enterprise customers might require a dedicated cluster. We used dedicated Log Analytics clusters to allow ingestion of nearly 20 billion events per day. In other large enterprise scenarios, moving from a shared cluster to a dedicated cluster might be necessary for adequate performance.

Conclusion

The first phase of our migration is complete! However, there’s still more to discover with Azure Sentinel. We’re taking advantage of new ways to engage and interact with connected datasets and using machine learning to manage some of our most complex detections. As we continue to grow our SIEM environment in Azure Sentinel, we’re capitalizing on Sentinel’s cloud-based benefits to help meet our security needs at an enterprise level. Sentinel provides our security operations teams with a single SIEM solution that has all the tools they need to manage security events and complete investigations.

