Finding and remediating rogue access points on the Microsoft corporate network

Pete Fortman helped lead Microsoft’s efforts to find rogue access points, unauthorized wireless devices connected to our corporate network. (Photo by Pete Fortman)

Finding rogue access points on Microsoft’s network is an important mission for our IT teams.

Networked devices have come to dominate the IT world, and their prevalence has made network gateways more complex and more vulnerable. Employees at Microsoft and many other large organizations regularly bring in their own wireless devices. Using a wireless router designed for home office use or a wireless speaker system might seem harmless, but these rogue access points (APs) pose serious security risks.

In the case of a wireless router designed for home use, it might have a default password that’s literally “password” or the device’s brand name. That could give drive-by hackers easy access to an enterprise’s network.

“An unauthorized user could be sitting in the parking lot and you just knowingly or unknowingly gave them access to the corporate network,” says Pete Fortman, a principal engineer for Microsoft who focuses on security.

With networking built into more and more devices, a growing number of seemingly benign APs can also act as connectors. That means that in spite of strict segmentation within our overall network environment, threats can piggyback on rogue APs to gain access to corporate networks.

Eliminating these vulnerabilities is essential to maintaining a Zero Trust environment.

The danger of rogue APs

Once inside, bad actors can wreak havoc. They can steal intellectual property, flood a network with useless data, or set up conversations between people who think they’re speaking with each other when in fact they’re talking to the attacker.

One of the most damaging outcomes is a ransomware attack. Ransomware is a type of malware that blocks access to critical data or systems until the target pays a ransom, and an attack can be massively disruptive in terms of both operations and customer trust.

Beyond that, rogue APs can interfere with legitimate wireless traffic—often by simply competing for airtime with the unwanted device. “It’s like a conference room with 18 seats, but 50 people are in the room and they’re all trying to stream something wirelessly,” Fortman says.

We’ve fought to keep rogue APs off our network for years. But as devices become more complex and plentiful, they’ve also become more difficult to detect. That doesn’t just increase the number of risky APs attached to our network. It also vastly increases the amount of telemetry that IT teams have to address, resulting in greater data volume and complexity.

To combat that, we’re applying machine learning and other advanced techniques to track rogue APs down.

The pathways that rogue access points can use to gain access to a wired corporate network.

When we began examining additional telemetry to find rogue access points in 2019, Fortman was surprised by what we uncovered.

“We had rogue devices all over the place,” Fortman says. “We kept the data private for a while to prevent adversaries from knowing what we can and cannot detect. When we shared the data more broadly, there was a collective gasp as people realized what was going on.”

Tracking down rogues

Rogue AP vulnerabilities are unacceptable at a company that relies on Zero Trust to ensure security.

An engineering team within Microsoft Digital Employee Experience (MDEE), the organization that powers, protects, and transforms our internal technology, took on the challenge of identifying and removing rogue devices.

Finding rogue APs posed a substantial engineering challenge. Potentially thousands of devices from a wide range of manufacturers might be on the loose in the corporate network—all using different wireless protocols.

“Gathering all this information into one place was a feat unto itself,” says Vincent Bersagol, a senior software engineer for Microsoft. “And we had to do it twice for two different data sets. Then we had to correlate the data sets together, and then look at suppression technology.”

Microsoft’s data tools, such as Microsoft Power BI, Azure Data Lake, and Azure Synapse Analytics, played a key role in collecting and correlating the data. “That was a great way to visualize all this data for folks to have a look at it,” Bersagol says.

Our expertise in machine learning also proved helpful for finding rogue APs. We used it to sort through the correlations between wired and wireless devices.

“We used a clustering algorithm that allowed us to tease out all the media access control (MAC) addresses that were statistically related to each other in a way that humans couldn’t see,” Bersagol says.
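To make the idea concrete, here is a minimal sketch of grouping MAC addresses that sit numerically close together. It is not the algorithm the team used, which isn't described here. It assumes, purely for illustration, that related interfaces on one device often carry neighboring MAC values; the function names and the `max_gap` threshold are hypothetical.

```python
from typing import Iterable, List


def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' (or '-' separated) to a 48-bit integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def cluster_macs(macs: Iterable[str], max_gap: int = 16) -> List[List[str]]:
    """Single-linkage clustering in one dimension.

    MACs are sorted by numeric value; a new cluster starts whenever the
    gap to the previous MAC exceeds max_gap (an assumed threshold).
    """
    ordered = sorted({m.lower() for m in macs}, key=mac_to_int)
    clusters: List[List[str]] = []
    for mac in ordered:
        if clusters and mac_to_int(mac) - mac_to_int(clusters[-1][-1]) <= max_gap:
            clusters[-1].append(mac)
        else:
            clusters.append([mac])
    return clusters


if __name__ == "__main__":
    sample = [
        "00:11:22:33:44:58",
        "a4:5e:60:00:00:01",
        "00:11:22:33:44:50",
        "00:11:22:33:44:51",
    ]
    for group in cluster_macs(sample):
        print(group)
```

A production system would likely cluster on many more features than the address value alone, such as timing, location, and traffic patterns.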

Many access points have commonly identifiable designs we can determine by looking at multiple sets of network telemetry, including the MAC addresses. Finding these identifiable designs began with a manual examination of the rogue APs we’d already discovered. We recognized that this approach would not scale, because it required a sample of every type of rogue AP before we could manually derive a pattern for identifying it.

But collecting all the wired and wireless telemetry to hunt for new rogue AP designs wasn’t enough. “That’s too much data for humans to sift through,” Bersagol says.

Instead, we ran a script that matched the two telemetry sets across all machines encountered. If the script found any correlated wireless and wired data, the odds were very high that they came from the same device—a rogue AP. We gained further confidence that we’d found a rogue AP when the correlated addresses came from within the same building.
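The matching step can be sketched roughly as follows. This is an illustration under stated assumptions, not the script described above: the record types, the rule that a wired MAC and a wireless BSSID "correlate" when their values are within `max_gap` of each other, and the confidence labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class WiredSighting:
    mac: str          # MAC address seen on a switch port
    switch_port: str
    building: str


@dataclass(frozen=True)
class WirelessSighting:
    bssid: str        # radio MAC address heard over the air
    building: str


def mac_to_int(mac: str) -> int:
    return int(mac.replace(":", "").replace("-", ""), 16)


def correlate(
    wired: List[WiredSighting],
    wireless: List[WirelessSighting],
    max_gap: int = 16,  # assumed closeness threshold, not from the source
) -> List[Tuple[WiredSighting, WirelessSighting, str]]:
    """Pair wired and wireless sightings that look like the same device."""
    matches = []
    for w in wired:
        for r in wireless:
            if abs(mac_to_int(w.mac) - mac_to_int(r.bssid)) <= max_gap:
                # A same-building match raises confidence further.
                confidence = "very likely" if w.building == r.building else "likely"
                matches.append((w, r, confidence))
    return matches


if __name__ == "__main__":
    wired = [WiredSighting("00:11:22:33:44:50", "sw1/12", "B17")]
    wireless = [WirelessSighting("00:11:22:33:44:51", "B17")]
    for w, r, confidence in correlate(wired, wireless):
        print(w.switch_port, r.bssid, confidence)
```

The nested loop is quadratic, which is fine for a sketch; at campus scale the same join would more plausibly run in a data platform than in a single script.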

So far, so good.

But some devices have designs that elude direct correlation using the existing telemetry. By using additional telemetry sources, we’ve been able to unearth devices that are more difficult to detect.

Still, even finding the simpler devices yields an impressive collection.

In the early stages of the project in October 2019, a sweep of about 100 buildings on the Microsoft campus unearthed more than 1,000 rogue APs.

COVID-19 plays a role (of course)

The COVID-19 pandemic had several impacts on the team tasked with finding rogue access points. Many rogue devices disappeared from the network because their owners were working from home.

The disruption also challenged some of the engineers working on the problem.

Blaze Kotsenburg, a software engineer, began work on the project in June 2020—his first month as a Microsoft employee. But onboarding, meeting new team members, and getting up to speed on the rogue AP project all took place over Microsoft Teams.

“I couldn’t go to my mentor Vincent and ask him for a 15-minute whiteboard,” Kotsenburg says. “I’d work on something for a few hours, then ping him and say, ‘Hey, I need some help.’”

In spite of these challenges, the entire team found new ways to collaborate and recreate the in-office dynamic. Diego Baccino, a principal software engineering manager, shares that the virtual work environment helped create a single team, rather than one team led by Fortman and one by Baccino.

“Working with two teams in parallel worked even better because of the remote situation,” Baccino says. “If I were to do this over again, I’d put even more emphasis on communication between everyone involved.”

This strong collaborative stance has remained as employees have transitioned from fully remote to hybrid work.

Pulling the plug

It’s possible to take a very fine-grained approach to booting rogue access points off a network, such as assigning traffic through their ports to a virtual local area network (VLAN) or blocking the devices’ MAC addresses.

In this case, we opted for a more blanket approach: shutting down any port connected to a rogue AP. This technique proved simple and effective, and safer than trying gentler approaches.

This approach does cause what Fortman calls “collateral damage”: when a port is shut down, its user might lose network connectivity for other devices in their office, and Microsoft loses visibility into anything connected to that port.

“Shutting down a port is a basic capability of wired access,” Fortman says. “As more Zero Trust networking capabilities become available on the infrastructure, we’re leveraging them to proactively prevent some devices from connecting and to enact more precise rogue AP suppression through automated remediation.”

While our earlier work was about identifying, cataloging, and remediating accumulated rogue AP issues, we’ve now developed a more real-time approach. We’re using Azure Event Hubs and Azure Data Explorer to handle real-time telemetry and improve our security response time.

That set the stage for automated remediation. Now, when our systems detect a rogue AP, we can automatically suppress it through an automation platform that turns off the associated ports—no human intervention required.
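As a rough illustration of that flow, the sketch below turns detections into port shutdowns with no human in the loop. Everything here is hypothetical: `FakeSwitchFabric` is an in-memory stand-in for an automation platform, and the detection fields (`mac`, `switch_port`) are assumed names, not a real Microsoft or Azure API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeSwitchFabric:
    """In-memory stand-in for a network automation platform (hypothetical)."""
    port_enabled: Dict[str, bool] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def shut_port(self, port: str, reason: str) -> None:
        self.port_enabled[port] = False
        self.audit_log.append(f"shut {port}: {reason}")


def remediate(detections: List[Dict[str, str]], fabric: FakeSwitchFabric) -> List[str]:
    """Shut every port with a detected rogue AP; skip ports already down."""
    shut: List[str] = []
    for det in detections:
        port = det["switch_port"]
        if fabric.port_enabled.get(port, True):
            fabric.shut_port(port, reason=f"rogue AP {det['mac']}")
            shut.append(port)
    return shut


if __name__ == "__main__":
    fabric = FakeSwitchFabric({"sw1/12": True, "sw1/13": True})
    detections = [{"mac": "00:11:22:33:44:50", "switch_port": "sw1/12"}]
    print(remediate(detections, fabric))
    print(fabric.audit_log)
```

Skipping ports that are already down keeps the action idempotent, so repeated detections of the same device don't generate duplicate shutdowns or audit entries.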

Extending the lessons of rogue AP suppression

MDEE’s work tracking down and remediating rogue APs has been so successful that the team is preparing slices of that data to provide to Azure datacenter teams. Those teams will use the lessons learned to run their own rogue AP detection and fulfill regulatory requirements across different geographies throughout the world.

Finally, these capabilities are spawning other abilities across teams as well. MDEE is actively looking for opportunities to apply the platform they’ve created throughout Microsoft. That might eventually lead to a self-serve platform that other business groups within Microsoft can access for their own AP security needs.

As new threats emerge and old ones find new ways to cause problems, security is a constant challenge. At Microsoft, preventing unwanted intruders is a top priority, and digital sleuthing has helped us close off one more avenue that bad actors might use.
