Abstract

Online services hosted in data centers show significant diurnal variation in load levels. Thus, there is significant potential for saving power by powering down excess servers during the troughs. However, while techniques like VM migration can consolidate computational load, storage state has always been the elephant in the room preventing this powering down. Migrating storage is not a practical way to consolidate I/O load. This paper presents Sierra, a power-proportional distributed storage subsystem for data centers. Sierra allows powering down of a large fraction of servers during troughs without migrating data and without imposing extra capacity requirements. It addresses the challenges of maintaining read and write availability, no performance degradation, consistency, and fault tolerance for general I/O workloads through a set of techniques including power-aware layout, a distributed virtual log, recovery and migration techniques, and predictive gear scheduling. Replaying live traces from a large, real service (Hotmail) on a cluster shows power savings of 23%. Savings of 40–50% are possible with more complex optimizations.