ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters
The latest trend in AI is that larger natural language models provide better accuracy; however, larger models are difficult to train because of cost, time, and the difficulty of code integration. Microsoft is releasing an open-source library called DeepSpeed, which vastly advances large model training by…