When Idling is Ideal: Optimizing Tail-Latency for Highly-Dispersed Datacenter Workloads with Perséphone

SOSP 2021 |

This paper introduces Persephone, a kernel-bypass OS scheduler designed to minimize tail latency for applications executing at microsecond-scale and exhibiting wide service time distributions. Persephone integrates a new scheduling policy, Dynamic Application-aware Reserved Cores (DARC), that reserves cores for requests with short processing times. Unlike existing kernel-bypass schedulers, DARC is not work conserving. DARC profiles application requests and leaves a small number of cores idle when no short requests are in the queue, so when short requests do arrive, they are not blocked by longer-running ones. Counter-intuitively, leaving cores idle lets DARC maintain lower tail latencies at higher utilization, reducing the overall number of cores needed to serve the same workloads and consequently better utilizing the data center resources.