“Our mission is to invent and research technologies that make Microsoft’s networks, services, and devices indispensable to the world.”
The Mobility and Networking Research (MNR) Group focuses on basic and applied research in all areas related to networked systems and mobile computing. Researchers build proof-of-concept systems, engage with academia, publish scientific papers, publish software for the research community, and transfer cutting-edge technologies to Microsoft’s product groups.
Under the umbrella of the KNOWS project, we are revisiting classical wireless networking problems and designing new solutions that incorporate and build upon recent advances in software and hardware technologies for networking over the recently opened white spaces spectrum.
Product group impact comes in many forms – consultation, design wins, code transfer, people transfer etc. Here are some examples:
Visit www.microsoft.com/iplicensing for licensing information.
Our research is focused on the following four broad themes. Each theme has numerous projects, and some projects span multiple themes. A partial list of current and past projects is available.
Network Verification
As we move from software on disk (e.g., Office) to software as a service delivered over the network (e.g., Office 365), it is imperative that network downtime not diminish service availability. Network verification seeks to guarantee correct operation of our data center and core networks by leveraging work in formal methods for programs. Despite the presence of cables and routers, a network can be viewed abstractly as a “program” that takes packets from the input edges of the network and outputs packets to the output edges.
This leads to a broad research agenda: building tools that are the equivalents of testers, static checkers, and compilers for Microsoft networks. New research is required because differences in the networking domain require rethinking classical verification tools (e.g., model checking, symbolic testing) to produce new concepts. At MSR, we have built four tools: SecGuru (used operationally within Azure), NetSonar (aspects of which are in Autopilot), Batfish (which can predict the effect of routing configuration changes), and Network Optimized Datalog (which can check reachability across firewalls and load balancers). This is joint work between the MNR and RiSE groups, various network product teams, and external researchers at Stanford and UCLA.
In ongoing work, we are 1) improving the scalability of reachability checks by leveraging symmetries in data center topologies; 2) improving the speed of configuration change analysis by decomposing and modularizing the analysis logic into smaller chunks; and 3) proving correctness under all topologies and route announcements through symbolic execution of configurations.
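To make the "network as a program" view concrete, here is a minimal sketch of a reachability check across filtering devices. It is a toy model, not how SecGuru, NetSonar, Batfish, or Network Optimized Datalog work: the node names, the `links` structure, and the reduction of packet headers to a destination port are all assumptions made for illustration.

```python
from collections import deque

# Toy header space: a packet is identified only by its destination port.
# (Assumption for illustration; real tools reason over full headers.)
ALL_PORTS = frozenset(range(1024))


def reachable_ports(links, src, dst):
    """Return the ports for which some path carries packets from src to dst.

    links maps a directed link (node, next_node) to the frozenset of
    ports that the link's filter lets through.
    """
    arrived = {src: ALL_PORTS}  # ports known to reach each node so far
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for (a, b), allowed in links.items():
            if a != node:
                continue
            passing = arrived[node] & allowed
            known = arrived.get(b, frozenset())
            if not passing <= known:
                # New ports reach b, so b's outgoing links must be revisited.
                arrived[b] = known | passing
                queue.append(b)
    return arrived.get(dst, frozenset())


# Hypothetical topology: a firewall and load balancer on one path,
# a management hop on another.
links = {
    ("edge_in", "firewall"): ALL_PORTS,
    ("firewall", "load_balancer"): frozenset({80, 443}),
    ("load_balancer", "server"): frozenset({22, 443}),
    ("edge_in", "mgmt"): frozenset({22}),
    ("mgmt", "server"): frozenset({22}),
}

result = reachable_ports(links, "edge_in", "server")
print(sorted(result))
```

The per-node sets only ever grow and are bounded by `ALL_PORTS`, so the loop terminates even when the topology contains cycles. An operator could compare `result` against an intended policy to flag ports that are reachable but should not be, or the reverse.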
Technology Policy
Public communications networks, such as those delivering mobile and wired broadband Internet access to homes and businesses, are typically subject to a high degree of regulatory oversight. Consequently, the policies that national governments enact have a big impact on how our customers experience our products, influencing how fast and available their connections are and how much they cost.
MNR technology policy efforts give researchers a clear understanding of policy perspectives, opportunities and constraints, and in coordination with LCA, bring world-leading technical knowledge into the policy-making arena. Examples of focus areas for MNR technology policy include rules for use of wireless radio frequencies (spectrum), and rules protecting networked services (such as O365, Bing, Skype and other Microsoft products) from being impaired by network operators (network neutrality).
Optical Networking
The design and operation of today’s networks decouple the physical layer from higher layers: to higher layers everything is an Ethernet port, irrespective of the physical media underneath. This decoupling makes it hard to diagnose failures, manage risk (e.g., multiple IP-level links may traverse the same physical media), debug degradation of packet delivery (e.g., corruption), or modulate transmission speed based on physical layer characteristics. The decoupling may have made sense in a world with diverse physical layers, but with the convergence of the physical layer to optics, we believe it is time to revisit it.
We are pursuing two threads of research. First, we are developing techniques to characterize the physical layer and correlate its performance to that of packet delivery. To enable this analysis, we are mining optical data from Microsoft’s wide area network (WAN) and data center networks. Our analysis is uncovering key insights: fiber cuts are not the leading cause of WAN faults (equipment failures are); the level of optical power overprovisioning in WANs is such that data transmission speeds can be safely increased by over a third; and low receive power is a common cause of packet corruption inside data centers.
Second, we are exploring radical cross-layer network designs. For the data center, we are focusing on free-space optics and the use of DMDs (digital micromirror devices, which are pervasive in projectors today) as the basis for a completely flat interconnect with high fanout and fast (10 microsecond) switching. For WANs, we are focusing on cross-layer traffic engineering, that is, a system that jointly and dynamically decides the routing of wavelengths and packets. Since commodity hardware does not enable us to prototype such ideas, we are also developing an FPGA-based platform for programmable optical transceivers.
RDMA in Large Scale Data Center Networks
Modern datacenter applications demand high throughput (over 40Gbps) and ultra-low latency (less than 10 microseconds) from the network. At the same time, the brutal economics of the cloud services market dictate that CPU overhead should be minimized. Standard TCP/IP stacks cannot meet these requirements: for example, the single-hop latency of a production TCP stack can be over 15 microseconds, and to saturate a 40Gbps link, the stack can consume 15-20% of CPU cycles. Remote Direct Memory Access (RDMA) can provide low latency and high throughput by bypassing the host networking stack for data transfer operations. On IP-routed datacenter networks, RDMA is deployed using the RoCEv2 protocol, which relies on Priority-based Flow Control (PFC) to enable a lossless (i.e., no congestion drops) network.
However, PFC can lead to poor application performance due to problems like head-of-line blocking and unfairness. To alleviate these problems, we have designed DCQCN, an end-to-end congestion control scheme for RoCEv2. To optimize DCQCN performance, we built a fluid model and provide guidelines for tuning switch buffer thresholds and other protocol parameters. Our experiments show that DCQCN dramatically improves throughput and fairness of RoCEv2 RDMA traffic. DCQCN is implemented in Mellanox NICs, and is being deployed in Microsoft’s datacenters.
TIMELY, a protocol put forth by Google, is a parallel effort to DCQCN. It aims to solve the same problem, but uses delay as a congestion signal (like TCP Vegas).
Another way to think about DCQCN and TIMELY is that these congestion control algorithms represent a new design point in the age-old tussle between fast response and stability. They rely on PFC to offer fast response (i.e., avoid packet drops) to short-term congestion, while relying on conventional (ECN-based or delay-based) congestion control to provide long-term stability.
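The sender-side half of an ECN-driven scheme of this kind can be sketched as follows. This is a simplified illustration loosely following the rate-update rules in the published DCQCN design, not the NIC implementation: the class name, method names, and parameter values are assumptions, and the timers, byte counters, and hyper-increase phase are omitted.

```python
class DcqcnSenderSketch:
    """Simplified sender-side rate control; all names here are illustrative."""

    def __init__(self, line_rate_gbps=40.0, g=1 / 256, additive_step_gbps=0.04):
        self.line_rate = line_rate_gbps
        self.current_rate = line_rate_gbps  # rate the sender transmits at
        self.target_rate = line_rate_gbps   # rate the sender recovers toward
        self.alpha = 1.0                    # running estimate of congestion extent
        self.g = g                          # weight for updating alpha (assumed value)
        self.additive_step = additive_step_gbps  # assumed value

    def on_congestion_notification(self):
        """The receiver reported ECN-marked packets: cut the rate."""
        self.target_rate = self.current_rate
        self.current_rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_quiet_period(self):
        """No notification arrived for one update period: decay alpha."""
        self.alpha *= 1 - self.g

    def on_increase_timer(self, additive=False):
        """Move halfway back toward the target, optionally raising the target."""
        if additive:
            self.target_rate = min(
                self.line_rate, self.target_rate + self.additive_step
            )
        self.current_rate = (self.target_rate + self.current_rate) / 2


sender = DcqcnSenderSketch()
sender.on_congestion_notification()
rate_after_cut = sender.current_rate
sender.on_increase_timer()
rate_after_recovery = sender.current_rate
print(rate_after_cut, rate_after_recovery)
```

The multiplicative cut scaled by `alpha` is the slower, stabilizing control loop; in the design described above, PFC remains the fast backstop that prevents drops while this loop converges.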
Datacenter Networking & Performance Optimization of Cloud Services
We are pursuing a multi-year cross-lab research program that focuses on producing the next generation of data center networking and services. We are experimenting with radical new designs in network architecture, programming abstractions, and performance management tools. We care about inexpensive future-proof networking inside the data centers, between globally distributed data centers, and to the data centers. Our research includes several projects that cut across various systems and networking research areas and are being pursued in collaboration with Microsoft’s Global Foundation Services Team, Windows Azure Team, Bing Team, and the Management Solutions Division.
Mobile Computing & Software Services
We are pursuing a variety of mobility-related projects: studying how the cloud can enhance the user experience on mobile devices (HAWAII); understanding how people use smartphones and the performance characteristics of 3G networks (3GTest & Diversity Studies); building systems to enhance smartphone performance, functionality, and battery lifetime using code offload (MAUI); building infrastructure to enable mobile social applications (Virtual Compass); and enhancing mobile device sensors by making their sensor readings trustworthy. In the software services arena, we are pursuing a variety of systems to simplify building scalable and geo-distributed services (PRS/Centrifuge, Volley, and Stout). Another area of emphasis is home networks, where we are pursuing network diagnosis services for the home (NetMedic & NetClinic), as well as new services and abstractions for easily building networked applications for the home (HomeOS).
Continuous Hands-free Mobile Interaction
When combined with high-resolution touch-enabled displays, web access has proven a killer application. Playing games, reading the news, or watching YouTube can capture the attention of users for extended periods many times a day. However, if the user has limited attention or the relevant tasks are short and happen many times an hour (e.g., making an appointment, adding a song to a playlist, or checking on a bus), pulling the phone out of a pocket for an immersive experience is cumbersome. The newly launched Continuous Mobile Interaction project is developing usages, devices, and systems to make such lightweight but frequent actions easy to do. Current efforts are along several directions: (a) an always-available speech accessory, (b) developer support for natural-language interaction, (c) a platform for multi-modal interaction in moving vehicles, and (d) an always-on visual cognition engine.
Cognitive Wireless Networking
The next generation of wireless networks will include software defined radios, cognitive radios, and multi-radio systems, which will co-exist harmoniously while operating over a very wide range of frequencies. We are revisiting “classical” wireless networking problems and designing new solutions that incorporate and build upon recent advances in software and hardware technologies. Of interest lately has been our research on problems in white space networking (the KNOWS project). We are working with policy makers, business units, and academia to address the societal need for inexpensive broadband connectivity everywhere.
Enterprise Network Management & Services
We are pursuing several different projects in this area. In particular, NetHealth is a network management research program in which end-hosts cooperatively detect, diagnose, and recover from network faults. Unlike existing products, we take an end-host-centric approach to gathering, aggregating, and analyzing data at all layers of the networking stack to determine the root cause of problems. NetHealth includes several ongoing projects in the wireless and wired space that are being pursued in collaboration with Microsoft’s Management Solutions Division and Microsoft’s Unified Communications Group.
We strive to find a balance between long-term research and product impact. You may know about our research from the papers we publish and the talks we give. Here we share a few examples of the broad impact we have had on Microsoft products.
Details about these and our other product contributions are provided below.
Microsoft’s Wide Area Network – Architecture & Management Software (2013-14): Increases the inter-DC WAN utilization from 40% to 90%+
AutoPilot’s Network-state Management Service (2014): Dramatically simplifies network management app development and operations while maintaining network-wide SLA
Windows Azure ZKaaS (ZooKeeper as a service) (2014): A multi-tenant coordination cloud service that uses open source ZooKeeper
Windows Azure Autopilot NetInsight (2013-14): End-to-end measurement and analysis tools that run automatically and vastly improve the accuracy of WAN fault localization
Network Virtualization for Hybrid Clouds (2010-12): Enabled Windows to provide seamless connectivity between Microsoft’s data centers and customers’ on-premises networks
Visual Studio Energy Modeler & Profiler (2012)
GreenUp (2011-12): Delivers significant power and monetary savings for enterprise customers by enabling seamless remote access to sleeping desktop machines
Fully Configurable Windows Azure Software Load Balancer (2011): Reduced costs by a factor of 15 by removing dependence on hardware load balancers, and improved cloud manageability as well
Full-Bisection Bandwidth Datacenter Networks (2009-10): Servers in a datacenter are no longer limited by the network that connects them
TCP Analyzer (2010): Enabled Microsoft Network Monitor to provide deeper insights into the workings of the Internet’s Transmission Control Protocol
Data Sense Bandwidth Attribution Technology (2012)
Mobile Input Services & Technology (2012)
Firmware TPM Emulator (2012)
AppInsight to the Application Compatibility Team (2012)
Wireless Optimizations in Windows 8 (2011-12): Increases battery lifetime in Windows 8 tablets and Surface computers
Network Quality of Service for Virtual Machines (2011-12): Enabled Windows 8 to provide predictable networking to high-value cloud services
Support for Security Features in Windows ARM (2011-12): Enabled widely used security features (BitLocker, DirectAccess, Virtual SmartCards) on Windows RT and Windows Phone
Antenna Placement on Windows Tablet (2011-12): Enabled best-in-class Wi-Fi network connectivity & performance
Datacenter TCP (2010-12): Improved network performance in data centers with inexpensive switches
Virtual Wi-Fi (2009): Enabled Windows to connect to multiple WLANs simultaneously and offer range extension, concurrent corporate and guest connection, and Internet gateway features
Network Bandwidth Estimation (2004-05): Enabled Windows to offer a better media streaming experience over Wi-Fi
NDIS WLAN extensions in Windows 2000, Windows XP, Vista & Windows 7: Elevated Wireless LAN connectivity to a premier consumer networking technology in Windows
Network Failure Recovery in Data Centers (2012): NetPilot reduces the recovery time for common data center network failures from a few hours to tens of minutes
Improving Page Load Time of Bing Searches (2012): Faster load times lead to a better user experience
Onset-of-congestion Signaling (2012): Our congestion prediction technology enables mitigation strategies that lead to better application performance
ReOptimizer for Data Parallel Computing (2011): This technology significantly reduced the response times of large jobs in our data centers
Mitigating Outliers in Data Parallel Jobs (2009)
NetTrace (2009)
DNS Query Time Optimization (2008)
Scalable and Consistent Caching (2008): Enabled MSN web properties to better handle spikes in load (flash crowds)
Partitioning and Recovery Service (2008): Enabled Live Mesh (now SkyDrive) backend cloud services to scale resiliently
Managing Shared Credential Vulnerabilities (2008): The technology behind the Microsoft Forefront risk analysis and mitigation planning feature
XBOX One Wireless Controller Protocol (2013-14): The wireless protocol between the XBOX One controllers and the console
Service Graphs for Large-Scale Network Diagnostics (2012): Helps meet customer service level agreements (SLAs) by quickly identifying faltering components, reducing downtime from days to minutes
MNR works with academic institutions in a number of ways. We sponsor a number of ACM and IEEE conferences; our researchers serve on steering and program committees of academic conferences and workshops; we serve on editorial boards of prestigious journals; via the Hawaii academic outreach program, we support mobile computing courses and research at several universities; we invite colleagues from academia to visit us; and we host events that provide a forum to brainstorm about new research.
For a number of years we have been organizing mindswap events between researchers from industry, academia, and government. At these events we have open discussions on important research topics and the challenges ahead. For the benefit of the community, we make videos and presentation slides from all talks available on the event’s web site. Here are the events we have organized:
Cambridge University, University of North Carolina, Purdue University, University of Michigan, Michigan State University, Singapore Management University School of Information Systems, Egypt-Japan University of Science and Technology, Virginia Tech University, University of South Carolina, Old Dominion University, Clemson University, Temple University, University of Utah, University of Wisconsin-Madison, University of Arkansas, University of Oregon
University College London, Stanford University, Duke University, University of Arkansas, University of Minnesota, University of Illinois at Urbana-Champaign, New York University, University of Massachusetts Lowell, Stony Brook University, University of Houston, University of California Santa Barbara, Ohio State University, Temple University, Purdue University, University of California Santa Barbara, University of Leipzig, Germany, Indiana University, Purdue University, Pontificia Universidade Catolica, Brasil, University of Goettingen, University of Washington
Singapore Management University School of Information Systems, University of Michigan, University of Maryland, University of Arkansas, University of California at Santa Barbara, Michigan State University
University of Southern California, University of Wisconsin Madison, Duke University
We have consistently supported strong conferences on mobile systems. A sampling of conferences we have supported in the recent past:
2013: MobiCom, HotMobile, S3, MobiSys, HotNets
2012: MobiSys, HotMobile, DySPAN
2011: MobiCom, SIGCOMM, MobiSys, DySPAN, NSDI, MobiCom – PhDForum
In addition to Project Hawaii support and an extensive University Relations program dedicated to funding research at universities, we occasionally support faculty research in areas of interest to us. Examples of institutions MCRC researchers have supported in the past include:
Microsoft Research organizes an annual faculty summit in Redmond. The summit offers a unique opportunity for us to mingle with researchers in academia. In addition, we have had the pleasure of hosting several distinguished researchers in our center. Here is a partial list of those who have visited us:
Microsoft Research awards two-year fellowships to outstanding Ph.D. students. Past recipients with Ph.D.s in Mobility & Networking are: