ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving
Haoran Qiu, Anish Biswas, Zihan Zhao, Jayashree Mohan, Alind Khare, Esha Choukse, Íñigo Goiri, Zeyu Zhang, Haiying Shen, Chetan Bansal, Ramachandran Ramjee, Rodrigo Fonseca
ACM Symposium on Cloud Computing (SoCC) 2025 | November 2025