Deadlocks in Datacenter Networks: Why Do They Form, and How to Avoid Them

ACM HotNets Workshop

Published by ACM

Driven by the need for ultra-low latency, high throughput, and low CPU overhead, Remote Direct Memory Access (RDMA) is being deployed by many cloud providers. To deploy RDMA in Ethernet networks, Priority-based Flow Control (PFC) must be used. PFC, however, makes Ethernet networks prone to deadlocks. Prior work on deadlock avoidance has focused on a necessary condition for deadlock formation, which leads to rather onerous and expensive solutions. In this paper, we investigate sufficient conditions for deadlock formation, conjecturing that avoiding sufficient conditions might be less onerous.