Microsoft Research Blog
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model
| Ali Alvi and Paresh Kharya
We are excited to introduce the DeepSpee…
DeepSpeed powers 8x larger MoE model training with high performance
| DeepSpeed Team and Z-code Team
Today, we are proud to announce DeepSpee…
DeepSpeed: Accelerating large-scale model inference and training via system optimizations and compression
| DeepSpeed Team, Rangan Majumder, and Andrey Proskurin
Last month, the DeepSpeed Team announced…
DeepSpeed: Extreme-scale model training for everyone
| DeepSpeed Team, Rangan Majumder, and Junhua Wang
In February, we announced DeepSpeed, an …
Research Collection: Tools and Data to Advance the State of the Art
“This is a game changer for the big data…
ZeRO-2 & DeepSpeed: Shattering barriers of deep learning speed & scale
| DeepSpeed Team, Rangan Majumder, and Junhua Wang
In February, we announced DeepSpeed, an …
Turing-NLG: A 17-billion-parameter language model by Microsoft
| Corby Rosset
This figure was adapted from a similar i…
ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters
| DeepSpeed Team, Rangan Majumder, and Junhua Wang
The latest trend in AI is that larger na…