Project Everest: Reaching greater heights in internet communication security
Project Everest is a multiyear collaborative effort focused on building a verified, secure communications stack designed to improve the security of HTTPS, a key internet safeguard. This post, about the verification tools and techniques the Everest team is using and developing, is the first in a series exploring the groundbreaking work, which is available on GitHub now.
Wouldn’t it be great if a message you sent to your bank over the internet was guaranteed to be safe from tampering and readable only by your financial institution? Project Everest is building software that provides such a guarantee as a theorem about the code that implements a secure communication protocol deployed in web browsers and servers everywhere.
“Proving theorems about programs has been a dream of computer science for the last 60 years or more, and we’re finally able to do this at the scale required for an important, widely deployed security-critical piece of software,” says Microsoft Senior Researcher Nik Swamy, a member of the Project Everest team.
The security of internet communications crucially depends on a variety of cryptographic algorithms and protocols. The most widely used of these fall under the umbrella of the Transport Layer Security (TLS) protocol. TLS is used for secure web browsing via HTTPS, email, Voice over IP, instant messaging, and many other kinds of communication. Unfortunately, TLS and its many implementations have been attacked repeatedly over the protocol’s 25-year history.
Project Everest is an ongoing collaboration started in 2016 with researchers from Microsoft Research Redmond, Microsoft Research Cambridge, Microsoft Research India, the Microsoft Research-Inria Joint Centre in Paris, Inria, Carnegie Mellon University, and The University of Edinburgh. Growing out of several prior Microsoft Research projects, including Ironclad, miTLS, and F*, Everest aims to develop and deploy efficient, verified, open-source implementations of the entire TLS stack and related protocols, formally reducing the security of the code to the assumptions about the hardness of certain cryptographic problems.
For Jonathan Protzenko, a Microsoft researcher on the Everest team, the project’s open collaboration is special.
“Everest makes for a tight interaction between industrial and academic research,” he says. “Our members frequently visit each other, co-author papers together, and send students from one institution to the other over the summer. Several of our members studied at or later moved to these institutions. In that sense, for me, Project Everest truly represents the ideal of open, collaborative research.”
Project Everest is halfway through its projected five-year arc, and its verified components are beginning to replace the current infrastructure with proven, secure software. For instance, Everest’s HACL* library provides verified cryptographic primitives for Mozilla Firefox, for the WireGuard VPN, and for the Tezos blockchain. And within Microsoft, Everest’s miTLS protocol stack powers the primary implementation of the QUIC transport protocol. The Everest team expects to announce further deployments in the coming weeks. Meanwhile, Everest code is already open source and is developed publicly on GitHub.
Formal Verification of Software
Formal verification involves using software tools, including various kinds of theorem provers and proof assistants, to analyze all possible behaviors of a program and prove mathematically that they comply with the code’s specification, a machine-readable description of the developer’s intentions. Once the code has been mechanically verified against its specification, a skeptical auditor who trusts the software used to check the proofs need only study the specifications and the statements of the proven theorems, without consulting the much larger programs and proofs.
“Most software built today gets tested before it is released—at least one hopes it does!” says Swamy. “But even the most rigorous testing can only find bugs; it cannot rule out the existence of errors. For certain kinds of software, say security-critical code like TLS, one may actually want to prove that no vulnerabilities exist. Software verification is time-consuming and requires expertise, but, unlike testing, it can actually guarantee mathematically the absence of entire classes of errors.”
For Everest programs, the team’s specifications cover a range of properties, including:
- Memory safety: A program never violates the memory abstractions, and, as a consequence, is free from common bugs and vulnerabilities like buffer overflows, null-pointer dereferences, use-after-frees, and double-frees.
- Type safety: A program respects the interfaces among its components, including any abstraction boundaries. For example, one component never passes the wrong kind of parameters to another or accesses its private state.
- Functional correctness: A program’s input/output behavior is fully characterized by a simpler mathematical function, which acts as its functional specification.
- Side-channel resistance: Observations about the implementation’s low-level behavior, such as the time it takes to execute or the memory addresses it accesses, are independent of the secrets manipulated by the program. Hence, an adversary monitoring these “side-channels” learns nothing about the secrets.
- Cryptographic security: Based on cryptographic assumptions, and except with negligible probability, Everest programs are indistinguishable from ideal cryptographic functionalities, the mathematical definitions that cryptographers use to capture notions of secrecy, integrity, and secure communication.
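To make the side-channel property in this list more concrete, the following Python sketch contrasts two byte comparisons. The first exits early, so the number of loop iterations depends on where the first mismatch occurs. The second has control flow that depends only on the inputs' length. The function names `leaky_equal` and `constant_flow_equal` are hypothetical, chosen for this illustration; they are not Everest code. Python itself makes no timing guarantees, so this shows only the shape of the idea. For real secret comparisons in Python, the standard library's `hmac.compare_digest` is the appropriate tool.

```python
import hmac


def leaky_equal(secret: bytes, guess: bytes) -> bool:
    """Early-exit comparison: the loop stops at the first mismatch,
    so the amount of work done depends on the secret's contents."""
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False  # the exit point reveals the mismatch position
    return True


def constant_flow_equal(secret: bytes, guess: bytes) -> bool:
    """Visits every byte and accumulates differences with XOR/OR,
    so control flow depends only on the (public) length."""
    if len(secret) != len(guess):
        return False
    diff = 0
    for s, g in zip(secret, guess):
        diff |= s ^ g  # nonzero as soon as any byte differs
    return diff == 0


if __name__ == "__main__":
    key = b"s3cret-tag"
    for candidate in (b"s3cret-tag", b"s3cret-taX", b"X3cret-tag", b"short"):
        # Both versions compute the same answer; they differ only in
        # what an observer could learn from how long they run.
        assert leaky_equal(key, candidate) == constant_flow_equal(key, candidate)
        assert constant_flow_equal(key, candidate) == hmac.compare_digest(key, candidate)
```

Both functions are functionally equivalent, which is why testing input/output behavior alone cannot distinguish them; side-channel resistance is a separate property about how the result is computed.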
Formal verification can play a role throughout the software development process, from design to implementation and deployment. Its value is increasingly recognized, especially for security-critical code.
“Cryptographic protocols are notoriously hard to implement correctly, with errors in both the algorithms and the protocol implementation itself being common,” says Eric Rescorla, chief technology officer of Mozilla Firefox, security area director at the Internet Engineering Task Force, and the editor of the TLS standard. “Formal verification tools like those developed by the Project Everest team have transformed the way we design these protocols, allowing us to move faster while having much higher confidence in protocol correctness.”
Indeed, assisted in part by their verification efforts, Benjamin Beurdouche, Karthik Bhargavan, Antoine Delignat-Lavaud, and Cédric Fournet, all members of Project Everest, have contributed features and fixes to the TLS standard. Beyond assisting with designs and specifications, verified implementations are also increasingly attractive for mainstream deployment.
“Verified implementations of cryptographic primitives are gradually making their way into major implementations,” Rescorla adds. “The Curve25519 and ChaCha/Poly1305 algorithms from HACL* are already running in Firefox, and I look forward to the day when we can adopt completely verified implementations.”
F*, Low*, and Vale
All Everest code is programmed and verified using F*, a framework that brings together three strands of research in programming languages.
- F* is an effectful, general-purpose higher-order programming language in the tradition of languages like F#, OCaml, and Haskell, among others.
- F* includes a full-fledged dependent type theory and tactic framework in the tradition of proof assistants like Coq and Nuprl, allowing nearly arbitrary expressive power for conducting formal mathematical proofs.
- Like other program verifiers, including Dafny and Why3, F* is integrated with an automated theorem prover, Z3, which can automate many of the tedious, low-level proof steps necessary to prove programs correct.
“Starting from a language in my dissertation called Fable and a language developed at Microsoft Research Cambridge called F7, F* has evolved repeatedly over the course of almost a decade and is now developed by a large but closely knit team of people and has become a full-fledged proof assistant,” Swamy says.
“F* is unique in its use of both automated and interactive proofs, and its primitive support for effects makes it well-suited for verifying real-world software that is inherently effectful,” adds Aseem Rastogi, a researcher at Microsoft Research India and member of the Project Everest team.
Consider a simple F* program, a functionally correct implementation of Quicksort on lists, as an example of how F* operates. Given a list and a total order on its elements, the type of quicksort, stated on the program’s first line, asserts that the function always returns a sorted permutation of the original list.
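The "sorted permutation" specification can be approximated outside F*. The Python sketch below is an analogue written for this illustration, not the F* program itself: the predicates `is_sorted` and `is_permutation` and the wrapper `checked_quicksort` are hypothetical names. It checks the specification at run time on particular inputs, whereas F* establishes it statically for all inputs. It also assumes list elements are hashable, so that `Counter` can compare multisets.

```python
from collections import Counter
from typing import Callable, List, TypeVar

T = TypeVar("T")


def is_sorted(leq: Callable[[T, T], bool], xs: List[T]) -> bool:
    """Every adjacent pair is ordered according to leq."""
    return all(leq(a, b) for a, b in zip(xs, xs[1:]))


def is_permutation(xs: List[T], ys: List[T]) -> bool:
    """Both lists contain the same elements with the same multiplicities."""
    return Counter(xs) == Counter(ys)


def quicksort(leq: Callable[[T, T], bool], xs: List[T]) -> List[T]:
    """Purely functional quicksort over a total order leq."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    lo = [x for x in rest if not leq(pivot, x)]  # strictly below the pivot
    hi = [x for x in rest if leq(pivot, x)]      # at or above the pivot
    return quicksort(leq, lo) + [pivot] + quicksort(leq, hi)


def checked_quicksort(leq: Callable[[T, T], bool], xs: List[T]) -> List[T]:
    """Run quicksort and check the specification on this one input."""
    out = quicksort(leq, xs)
    assert is_sorted(leq, out), "output is not sorted"
    assert is_permutation(xs, out), "output is not a permutation of the input"
    return out


if __name__ == "__main__":
    print(checked_quicksort(lambda a, b: a <= b, [3, 1, 2, 3, 0]))
```

The run-time assertions play the role of the refinement on the result type, but they only cover the inputs that are actually tried, which is the gap between testing and proof.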
Proving purely functional code such as quicksort correct is relatively straightforward. To achieve high performance, however, Everest code is also programmed in two domain-specific languages embedded in F*: Low* and Vale.
Verifying efficient low-level code in F*
Low* is a subset of F* geared toward low-level programming with explicit memory management. Low* programs are extracted to idiomatic C code by a tool called KReMLin and run without garbage collection, reference counting, or any other automated memory management strategy. The HACL* library and Everest’s verified implementation of the TLS 1.3 record layer are programmed and verified in Low*.
For various cryptographic primitives, peak performance can only be obtained by going lower still and programming in assembly language, often taking advantage of specialized hardware instructions, such as Intel AES-NI. Vale stands for “Verified Assembly Language for Everest” and provides a domain-specific language for writing and verifying assembly programs targeting different platforms, like Windows, macOS, and Linux, and architectures, like x86, x64, and ARM. It also supports multiple verification systems for carrying out proofs, including F* and Dafny.
The 2019 Symposium on Principles of Programming Languages (POPL) features a paper on Vale. “A Verified, Efficient Embedding of a Verifiable Assembly Language” describes how the Everest team embedded Vale in F*, making use of F*’s dependent type system and its computational capabilities to implement an efficient verification condition generator for embedded assembly programs. Using Vale in F*, the Everest team reports on the first provably correct implementation of AES-GCM, a cryptographic routine used by 90 percent of secure internet traffic.
In addition to the availability of Everest components on GitHub, the team is planning the first integrated release of Everest’s verified stack, its libraries, and its verification tools for early summer 2019.
“Working at the intersection of systems, networking, cryptography, programming languages, and program verification, Everest is unique in that it co-develops verified software with the tools and techniques needed to build it,” says Swamy. “But we’re also looking beyond our specific goals for TLS to apply our verification technology to other areas.”
Spanning four continents and 12 time zones, the Everest team works around the clock trying to build a more secure internet. There’s still a long way to go to the summit, but halfway through the project, the team has learned a lot and has put down footholds to make it easier to build provably secure software at scale.