In the news | Microsoft AI Blog for Business & Tech

Microsoft to engage with customers to further develop Turing natural language models 

September 23, 2020

As part of the companywide AI at Scale initiative, Microsoft announced at its Ignite conference that it plans to begin working with select customers to further develop its Turing natural language representation (NLR) models. AI at Scale, which was announced…

In the news | The Batch

Toward 1 Trillion Parameters 

September 16, 2020

An open source library could spawn trillion-parameter neural networks and help small-time developers build big-league models. What’s new: Microsoft upgraded DeepSpeed, a library that accelerates the PyTorch deep learning framework. The revision makes it possible to train models five times…

In the news | Analytics India Magazine

Microsoft Releases Latest Version Of DeepSpeed, Its Python Library For Deep Learning Optimisation 

September 15, 2020

Recently, Microsoft announced the new advancements in the popular deep learning optimisation library known as DeepSpeed. This library is an important part of Microsoft’s new AI at Scale initiative to enable next-generation AI capabilities at scale.

Microsoft Research Blog

DeepSpeed: Extreme-scale model training for everyone 

September 10, 2020 | DeepSpeed Team, Rangan Majumder, and Junhua Wang

In February, we announced DeepSpeed, an open-source deep learning training optimization library, and ZeRO (Zero Redundancy Optimizer), a novel memory optimization technology in the library, which vastly advances large model training by improving scale, speed, cost, and usability. DeepSpeed has…

In the news | VentureBeat

Microsoft’s updated DeepSpeed can train trillion-parameter AI models with fewer GPUs 

September 10, 2020

Microsoft today released an updated version of its DeepSpeed library that introduces a new approach to training AI models containing trillions of parameters, the variables internal to the model that inform its predictions. The company claims the technique, dubbed 3D…

Microsoft Research Blog

XGLUE: Expanding cross-lingual understanding and generation with tasks from real-world scenarios 

June 11, 2020 | Nan Duan, Yaobo Liang, and Daniel Campos

What we can teach a model to do with natural language is dictated by the availability of data. Currently, we have a lot of labeled data for very few languages, making it difficult to train models to accomplish question answering,…

Microsoft Research Blog

ZeRO-2 & DeepSpeed: Shattering barriers of deep learning speed & scale 

May 19, 2020 | DeepSpeed Team, Rangan Majumder, and Junhua Wang

In February, we announced DeepSpeed, an open-source deep learning training optimization library, and ZeRO (Zero Redundancy Optimizer), a novel memory optimization technology in the library, which vastly advances large model training by improving scale, speed, cost, and usability. DeepSpeed has…

In the news | The AI Blog

Microsoft announces new supercomputer, lays out vision for future AI work 

May 19, 2020

Microsoft has built one of the top five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models, the company is announcing at its Build developers conference.

[Image: Oscar object semantics graphic]
Microsoft Research Blog

Objects are the secret key to revealing the world between vision and language 

May 15, 2020 | Chunyuan Li, Lei Zhang, and Jianfeng Gao

Humans perceive the world through many channels, such as images viewed by the eyes or voices heard by the ears. Though any individual channel might be incomplete or noisy, humans can naturally align and fuse the information collected from multiple…
