VIDUR: LLM Simulator
Vidur is a high-fidelity and extensible LLM inference simulator. It can help you plan capacity and find the best deployment configuration for your LLM deployments; test new research ideas, such as new scheduling algorithms or optimizations like speculative decoding; and study the system performance of models under different workloads and configurations. All of this works without access to GPUs, except for a quick initial profiling phase. Please refer to our MLSys'24 paper for more details. We also have a live demo that showcases the capabilities of the system.