DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models 5x Faster Training Minimal Code Change DeepSpeed can train DL models with over a hundred billion parameters on current generation of GPU clusters, while achieving over 5x in system performance compared to the state-of-art. Early adopters of DeepSpeed have already produced a language model (LM) with over 17B parameters called Turing-NLG, establishing a new SOTA in the LM category.