Systems for the Cloud



The move to the cloud calls for rethinking how systems are built and run. The unprecedented scale of the cloud, together with the demand for extremely high availability, poses many challenges. The move also creates an opportunity to build system stacks afresh without the burden of legacy. We address these challenges by combining verification, which detects errors before they can affect availability, with implementations of high-level abstractions that reduce the number of errors by simplifying programming, and by leveraging hardware-software co-design to meet demanding performance and efficiency targets. Our work spans distributed systems, storage, operating systems, networking, verification, hardware offload, and other areas of systems.



FaRM is a new main-memory distributed computing platform that simplifies programming by providing transactions on a shared address space, hiding failures and concurrency from programmers. FaRM’s software is carefully designed to exploit modern network and non-volatile memory hardware, improving availability, latency, and throughput by an order of magnitude relative to state-of-the-art systems.


Project Honeycomb is exploring how to offload distributed system processing to custom hardware to achieve better performance, efficiency, and cost. We envision that the distributed systems of the future will run on storage nodes built from custom hardware and storage media, without traditional CPUs. Realizing this vision requires solving many challenges, including data structure and hardware co-design, sharing and isolation, security, and management.

Network Verification

The goal of our research is to develop tools that improve network reliability. These tools help network operators and architects design, operate, maintain, and troubleshoot their networks at scale. The central aim is to find errors early, before they can impact availability.