Abstract

We observe several anomalies in how modern cluster schedulers manage task queues. On one hand, centralized schedulers such as YARN do not queue tasks at the worker machines. Instead, the scheduler assigns tasks to machines whenever they report resource availability. Consequently, the inherent feedback delays reduce cluster utilization, particularly as cluster workloads are increasingly dominated by tasks of short duration. On the other hand, distributed schedulers do queue tasks at the worker machines but fail to match (i.e., place) tasks to the best possible machine, since they lack cluster-wide information and do not properly estimate the work pending in each queue. Finally, we are unaware of any scheduler that applies appropriate queue service disciplines (such as prioritization, reordering, and starvation freedom) at each machine. Our system, Yaq, offers principled solutions to all these problems: matching tasks to worker machines, bounding queue sizes, and choosing how best to serve the tasks that are in queues. By augmenting both existing centralized and distributed cluster schedulers, and experimenting with a wide variety of workloads on an O(100) server cluster, we show that Yaq offers 1.6x to 3x better job throughput and 1.7x to 9.3x better average job completion time.