Microsoft at ASPLOS 2024: Advancing hardware and software for high-scale, secure, and efficient modern applications
Publication Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale Kunal Jain, A. Parayil, Ankur Mallick, Rujia Wang, Renee St. Amant, Chetan Bansal, Victor Ruehle, Saravan Rajmohan, Shashwat Jaiswal, Yogesh Simmhan, Anoop Kulkarni, Steve Kofsky ACM Sigmetrics 2026 | June 2026 Project
Publication DroidSpeak: Efficient Context Sharing for Multiple-LLM Inference Yuhan Liu, Yuyang Huang, Jiayi Yao, Zhuohan Gu, Kuntai Du, Hanchen Li, Yihua Cheng, Junchen Jiang, Shan Lu, Madan Musuvathi, Esha Choukse NSDI | May 2026 Project
Publication Harvesting Spare CPU Resources in Container Systems Adam Hall, Anirudh Sarma, Esha Choukse, Kishore Ramachandran, Sameh Elnikety NSDI | May 2026
Publication Concord: Learning Network Configuration Contracts Ryan Beckett, Francis Y. Yan, Raghunadha Reddy Pocha, Vineesh V. Raj, Ayyub Shaik, Siva Kesava Reddy Kakarla 2026 European Conference on Computer Systems | April 2026
Publication Algorithm Generation via Creative Ideation Ruiying Ma, Chieh-Jan Mike Liang, Yanjie Gao, Francis Y. Yan ICLR (International Conference on Learning Representations) | April 2026
Publication VeriStruct: AI-assisted Automated Verification of Data-Structure Modules in Verus Chuyue Sun, Yican Sun, Ethan Zhang, Daneshvar Amrollahi, Shuvendu Lahiri, Shan Lu, David Dill, Clark Barrett International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS) | April 2026 Project
Publication Niyama: Breaking the Silos of LLM Inference Serving Kanishk Goel, Jayashree Mohan, Nipun Kwatra, Ravi Shreyas Anupindi, Ramachandran Ramjee Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2026 | March 2026 Project
Publication MSCCL++: Rethinking GPU Communication Abstractions for AI Inference Changho Hwang, Peng Cheng, Roshan Dathathri, Abhinav Jangda, Saeed Maleki, Madan Musuvathi, Olli Saarikivi, Aashaka Shah, Ziyue Yang, Binyang Li, Caio Rocha, Qinghua Zhou, Mahdieh Ghazimirsaeed, Sreevatsa Anantharamu, Jithin Jose ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) | March 2026
Publication OrbitalBrain: A Distributed Framework For Training ML Models in Space Om Chabra, Chenning Li, Kevin Hsieh, Santiago Segarra, Behnaz Arzani, Peder Olsen, Ranveer Chandra New Ideas in Networked Systems (NINeS) | February 2026
Publication Towards Fully-Controllable Packet Steering for AI Backend Networks with SRv6 Shaofeng Wu, Zhixiong Niu, Riff Jiang, Guohan Lu, Chen Tian, Hong Xu, Yongqiang Xiong MSR-TR-2026-6 | January 2026 Published by Microsoft