SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines

ACM Symposium on Cloud Computing 2018 (SoCC) |

Published by ACM

Existing state machine replication protocols are confronting two major challenges in geo-replication: (1) limited performance caused by load imbalance, and (2) severe performance degradation in heterogeneous
environments or under high-contention workloads. This paper presents a new semi-decentralized approach to addressing both the challenges at the same time. Our protocol, SDPaxos, divides the task of a replication protocol into two parts: durably replicating each command across replicas without global order, and ordering all commands to enforce the consistency guarantee. We decentralize the process of replicating commands, which accounts for the largest proportion of load, to provide high performance. In contrast, we centralize the process of ordering commands, which is lightweight but needs a global view, for better performance stability
against heterogeneity or contention. The key novelty lies in that SDPaxos achieves the optimal one-round-trip latency under realistic configurations, despite the two separated steps, replicating and ordering, which are both based on Paxos. We also design a recovery protocol to do rapid failover under failures, and a series of optimizations to boost performance.We show via a prototype implementation the significant advantage of SDPaxos on both throughput and latency, facing different environments and workloads.