Project Natick was a full-scale, fully operational datacenter module, installed underwater in the North Sea, off the Scottish coast. Powered by renewable energy, Project Natick was a test of the feasibility of underwater datacenters. In 2019, we took some of the traffic traveling between the Natick underwater datacenter and the Microsoft Research headquarters in Redmond, Washington, USA, and secured that traffic with an encrypted network tunnel protected with post-quantum cryptography.
Project Natick was the perfect testbed for this work: while it was built to mimic a Microsoft datacenter, Natick was not handling any critical business or customer data. We weren’t able to physically access the servers and network infrastructure inside the Natick pressure vessel to set up and manage the PQ-protected tunnel. That constraint made the experiment more accurately reflect the real world, where it would be infeasible to hand-configure devices in massive datacenters worldwide.
Background
Quantum computers are coming. The exact timeline is uncertain, but a quantum computer powerful enough to break today’s asymmetric cryptography may come online in 10 – 15 years. That cryptographically relevant quantum computer will allow adversaries to break encryption and signing of today’s internet communications. So, before that happens, the entire world needs to start using post-quantum cryptography – cryptography designed to be secure against quantum attackers.
The migration will take time: all the applications, services, and infrastructure need to be updated to support the new algorithms, and new credentials issued where they’re needed. While that migration is underway, we can use encrypted network tunnels to protect the traffic from software and devices that are not yet fully protected.
The Experiment
Microsoft already operates encrypted tunnels between its datacenters to protect network traffic in transit outside a datacenter’s physical boundaries. The great-circle distance between the underwater datacenter and Microsoft Research headquarters in Redmond is approximately 4,300 miles, which allowed us to set up an experiment with real-world challenges similar to those of connections between production datacenters.
The Natick pressure vessel contained several racks of servers, all connected via a network inside the vessel. This network was then connected to the Microsoft global network via a set of underwater fiber-optic cables that connect to the facility on shore. Connections between sites on the Microsoft global network were secured with classical cryptography to protect the network traffic traveling between sites.
One of the servers ran our modified version of OpenVPN. We call this our “router node.” The router node connected to another server in Redmond to establish a tunnel between the two sites encrypted with post-quantum cryptography. The router node in the vessel was connected to both the main network in the vessel and a second virtual local area network (VLAN), which we called the “post-quantum VLAN.” We then configured the networking hardware in the vessel to place several of the other servers on this VLAN, and we could remotely change the number of servers on the post-quantum VLAN. All traffic from these servers was routed by the router node across the tunnel to Redmond, where it continued to its final destination. Outside traffic headed back to these servers was similarly routed to the router node in Redmond, then back across the tunnel and into the vessel.
The main network in the vessel was connected normally to the Microsoft global network. In fact, the tunnel used the regular network connection to route encrypted traffic between Redmond and Scotland. The typical round-trip time on this connection was approximately 180 milliseconds.
The post-quantum tunnel experiment concluded on July 9th, 2020, when the Natick underwater pressure vessel was decommissioned and retrieved from the sea floor. You can learn more about Natick’s ongoing progress and what they’ve learned at their project site.
Technical Details
Each router node ran our modified version of OpenVPN in a virtual machine. The session key for the data encryption was negotiated using a hybrid key exchange, which combines a post-quantum key exchange algorithm with a classical key exchange algorithm. This combined the time-tested security of the classical algorithm against conventional attackers with the quantum security of the post-quantum algorithm. In our first deployment, we combined the post-quantum Supersingular Isogeny Diffie-Hellman (SIDH), as it existed in March 2018, with the classical Elliptic Curve Diffie-Hellman (ECDH) (using the NIST P-256 curve) to arrive at the symmetric session key used to encrypt data traffic with AES-256. On March 27th, 2020, we updated to the then-latest versions of the algorithms and OpenVPN 2.4.8, and combined Supersingular Isogeny Key Encapsulation (SIKE) (using the SIKEp434 parameter set) with classical ECDH (still using the NIST P-256 curve) to arrive at the symmetric session key used to encrypt data traffic with AES-256. With a configuration change, we could use any of the key exchange algorithms supported by OQS’s OpenSSL.
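To illustrate the idea behind a hybrid key exchange, here is a minimal sketch of deriving one AES-256-sized key from two shared secrets. It is not the derivation used by our modified OpenVPN or by OQS’s OpenSSL; the concatenate-then-HKDF construction, the function names, the `info` label, and the random stand-in secrets are all assumptions chosen for illustration. The point it shows is that the final key depends on both inputs, so an attacker has to recover both secrets to recover the key.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()  # extract step
    output = b""
    block = b""
    counter = 1
    while len(output) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def hybrid_session_key(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive a 32-byte key from both secrets (hypothetical combiner)."""
    combined = classical_secret + pq_secret  # assumes fixed-length secrets
    return hkdf_sha256(combined, salt=b"\x00" * 32, info=b"hybrid-demo", length=32)


# Random stand-ins for the outputs of the classical and post-quantum exchanges.
# The lengths are arbitrary for this demo.
ecdh_secret = secrets.token_bytes(32)
sike_secret = secrets.token_bytes(16)
key = hybrid_session_key(ecdh_secret, sike_secret)
print(len(key), "byte session key derived")
```

Changing either input secret changes the derived key, which is the property the hybrid approach relies on.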
As is customary best practice, session keys were regularly regenerated while the tunnel was running. We scheduled a new key exchange to run once an hour. This happened while data continued to flow, so there was no interruption to data traffic while the key exchange ran; data continued to pass using the previous session key until the key exchange completed, whereupon the router nodes began using the new session key. Re-keying therefore did not incur any of the latency observed in initial tunnel setup.
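The re-keying behavior described above can be modeled as a small state machine. This is a simplified sketch, not OpenVPN’s implementation: the `TunnelEndpoint` class and its method names are hypothetical, and the keys are random stand-ins for the output of a real key exchange.

```python
import secrets


class TunnelEndpoint:
    """Hypothetical model of one router node's session-key state."""

    def __init__(self, initial_key: bytes) -> None:
        self.active_key = initial_key
        self.rekey_in_progress = False

    def start_rekey(self) -> None:
        # The key exchange runs in the background; active_key is untouched.
        self.rekey_in_progress = True

    def complete_rekey(self, new_key: bytes) -> None:
        # Traffic switches keys only once the exchange has finished.
        self.active_key = new_key
        self.rekey_in_progress = False

    def key_for_next_packet(self) -> bytes:
        return self.active_key


old_key = secrets.token_bytes(32)
new_key = secrets.token_bytes(32)
endpoint = TunnelEndpoint(old_key)

endpoint.start_rekey()
key_during_rekey = endpoint.key_for_next_packet()  # still the old key
endpoint.complete_rekey(new_key)
key_after_rekey = endpoint.key_for_next_packet()   # now the new key
```

Because packets never wait on the exchange, the cost of the key exchange stays off the data path.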
The post-quantum VLAN was assigned its own IP addresses, and the Microsoft network was configured to deliver traffic destined for those addresses to the router node in Redmond. That router node encrypted the traffic and sent it across the global network inside the tunnel to the Natick vessel in Scotland, where the router node there decrypted the traffic and put it on the VLAN. Returning traffic was similarly encrypted by the router node in the vessel and sent back across the global network inside the tunnel to the router node in Redmond, which decrypted the traffic and forwarded it onward normally.
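The routing decision on the Redmond side amounts to a prefix match: traffic addressed to the post-quantum VLAN goes into the tunnel, and everything else follows the normal path. The sketch below shows that decision only. The `10.20.30.0/24` prefix and the `next_hop` function are hypothetical; the actual addresses and network configuration are not described here.

```python
import ipaddress

# Hypothetical address block standing in for the post-quantum VLAN.
PQ_VLAN = ipaddress.ip_network("10.20.30.0/24")


def next_hop(destination: str) -> str:
    """Return 'tunnel' for post-quantum VLAN addresses, 'default' otherwise."""
    address = ipaddress.ip_address(destination)
    return "tunnel" if address in PQ_VLAN else "default"


for destination in ("10.20.30.7", "10.20.31.7"):
    print(destination, "->", next_hop(destination))
```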
We measured a maximum of 250 Mbits/sec of bandwidth over the tunnel. This is below the measured capacity of the underlying link, which is capable of 2-3 Gbits/sec. These results are consistent with running an unmodified version of OpenVPN over the same link using only classical cryptography. The ceiling appears to be a known limitation of tunnels running entirely in software on commodity hardware, not a consequence of the addition of the post-quantum key exchange.
During tunnel operation, latency over the tunnel was comparable to the latency of the underlying connection when the underlying connection was operating normally. Variance between round-trip ping times was consistently less than 1 millisecond over a link with a typical round-trip ping time of 180 milliseconds.
We used the tunnel to run volunteer computing workloads from BOINC, the Berkeley Open Infrastructure for Network Computing, on five nodes allocated to the PQ VLAN. Input data for volunteer jobs was downloaded over the tunnel and processed, and results were then uploaded back via the tunnel. Typical daily transfer over the tunnel was between 300 and 600 megabytes of data, not counting spikes due to operating system updates. As these were computation-heavy workloads rather than communication-heavy workloads, we would not expect them to strain our bandwidth capacity.
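A quick back-of-the-envelope calculation shows how far the daily transfer figures above sit below the 250 Mbits/sec tunnel maximum. The sketch assumes 1 megabyte = 1,000,000 bytes and spreads the transfer evenly across the day, so it gives an average rate and says nothing about short bursts.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TUNNEL_MAX_MBIT_PER_S = 250  # maximum measured over the tunnel


def average_mbit_per_s(megabytes_per_day: float) -> float:
    """Average rate in Mbit/s, assuming 1 MB = 1,000,000 bytes."""
    bits_per_day = megabytes_per_day * 1_000_000 * 8
    return bits_per_day / SECONDS_PER_DAY / 1_000_000


for megabytes in (300, 600):
    rate = average_mbit_per_s(megabytes)
    share = rate / TUNNEL_MAX_MBIT_PER_S
    print(f"{megabytes} MB/day is about {rate:.4f} Mbit/s ({share:.4%} of the tunnel maximum)")
```

Even at the upper end of the range, the average load is well under a tenth of a percent of the measured tunnel maximum.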
Resources
We released a post-quantum cryptography-enabled Virtual Private Network (VPN) application based on OpenVPN, intended to protect connections between remote workers and the home office as traffic transits the internet. Encrypted tunnels like these are also used to protect the links between datacenters as data transits between them.