Wide-Area Analytics with Multiple Resources

ACM EuroSys

Running data-parallel jobs across geo-distributed sites has emerged as a promising direction due to the growing need for geo-distributed cluster deployment. A key difference between geo-distributed and intra-cluster jobs is the heterogeneous (and often constrained) nature of compute and network resources across the sites. We propose Tetrium, a system for multi-resource allocation in geo-distributed clusters that jointly considers compute and network resources for task placement and job scheduling. Tetrium significantly reduces job response time while incorporating several other performance goals through simple control knobs. Our EC2 deployment and trace-driven simulations suggest that Tetrium improves the average job response time by up to 78% compared to existing data-locality-based solutions, and by up to 55% compared to Iridium, the recently proposed geo-distributed analytics system.