Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation
Xutong Liu, Baran Atalar, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, John C.S. Lui, Wei Chen, Carlee Joe-Wong
IEEE International Conference on Computer Communications (INFOCOM) | May 2026