DeepSpeed MoE supports 8x bigger models with expert parallelism plus ZeRO-Offload than with expert parallelism alone (a graph shows supported model sizes on NVIDIA A100 GPUs). DeepSpeed MoE scales near-linearly with the number of GPUs, and Z-code MoE (10B) consistently outperforms other systems on BLEU scores on an in-house 50-language test dataset.
Microsoft Research Blog

DeepSpeed powers 8x larger MoE model training with high performance 

August 18, 2021 | DeepSpeed Team and Z-code Team

Today, we are proud to announce DeepSpeed MoE, a high-performance system that supports massive scale mixture of experts (MoE) models as part of the DeepSpeed optimization library. MoE models are an emerging class of sparsely activated…
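The key idea behind MoE is sparse activation: a lightweight gate routes each token to one (or a few) of many expert feed-forward networks, so total parameters can grow without growing per-token compute. The sketch below illustrates a top-1-gated MoE layer in plain PyTorch; it is a minimal illustration of the concept, not DeepSpeed's MoE API, and the names (TopOneMoE, n_experts, d_ff) are hypothetical.

```python
# Minimal sketch of a sparsely activated mixture-of-experts layer:
# a gate routes each token to a single expert, so per-token compute
# stays constant as the number of experts grows.
# Illustration only -- not DeepSpeed's MoE API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopOneMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Pick one expert per token (top-1 gating).
        scores = F.softmax(self.gate(x), dim=-1)   # (tokens, n_experts)
        top_prob, top_idx = scores.max(dim=-1)     # (tokens,)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                # Only the tokens routed to expert e flow through it.
                out[mask] = top_prob[mask, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 512)                     # 16 tokens, d_model=512
layer = TopOneMoE(d_model=512, d_ff=2048, n_experts=8)
print(layer(tokens).shape)                        # torch.Size([16, 512])
```

In DeepSpeed MoE itself, the experts are additionally sharded across GPUs (expert parallelism) and combined with ZeRO-Offload, which is what enables the 8x larger models shown in the figure above.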

Technical diagram of the MEB model. MEB is a sparse neural network composed of an input layer that takes binary features, a feature embedding layer that maps each binary feature to a 15-dimensional vector, a sum-pooling layer applied to each of 49 feature groups whose outputs are concatenated into a 735-dimensional vector, and two dense layers that produce a click probability. The features shown in this figure are generated from the example query “Microsoft Windows” and the document www.microsoft.com/en-us/windows.
Microsoft Research Blog

Make Every feature Binary: A 135B parameter sparse neural network for massively improved search relevance 

August 4, 2021 | Junyan Chen, Frédéric Dubut, Jason (Zengzhong) Li, and Rangan Majumder

Recently, Transformer-based deep learning models like GPT-3 have been getting a lot of attention in the machine learning world. These models excel at understanding semantic relationships, and they have contributed to large improvements in Microsoft Bing’s search experience…
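The figure caption above gives enough detail to sketch the shape of the MEB architecture. The code below is a rough, hedged reconstruction in PyTorch: embedding tables map active binary features to 15-dimensional vectors, sum pooling within each of 49 feature groups yields a concatenated 735-dimensional vector, and two dense layers produce a click probability. The vocabulary size, hidden width, and all names (MEBSketch, features_per_group, hidden) are illustrative assumptions, not values from the post.

```python
# Rough sketch of the MEB architecture described in the figure caption.
# Sizes other than 49 groups, 15-dim embeddings, and the 735-dim concat
# are illustrative assumptions.
import torch
import torch.nn as nn

class MEBSketch(nn.Module):
    def __init__(self, features_per_group=10_000, n_groups=49, emb_dim=15, hidden=256):
        super().__init__()
        # One sum-pooling embedding table per feature group.
        self.groups = nn.ModuleList(
            nn.EmbeddingBag(features_per_group, emb_dim, mode="sum")
            for _ in range(n_groups)
        )
        self.dense = nn.Sequential(
            nn.Linear(n_groups * emb_dim, hidden),  # 49 * 15 = 735 inputs
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, active_features):
        # active_features[g]: (batch, k) indices of the binary features
        # that are "on" in group g for each example.
        pooled = [emb(idx) for emb, idx in zip(self.groups, active_features)]
        return torch.sigmoid(self.dense(torch.cat(pooled, dim=-1))).squeeze(-1)

model = MEBSketch()
batch = [torch.randint(0, 10_000, (4, 3)) for _ in range(49)]  # 4 examples, 3 active features per group
print(model(batch).shape)  # torch.Size([4]) -- click probabilities
```

At production scale, nearly all of the 135 billion parameters would live in these embedding tables, since every binary feature carries its own 15-dimensional vector.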

DeepSpeed multi-GPU inference offers up to 6.9 times higher throughput for large deep learning model inference. Progressive Layer Dropping offers 2.8 times faster convergence for large-model training. 1-bit LAMB offers up to 4.6 times less communication overhead. Single-GPU inference speedups: 2.1 times on BERT-Base, 4.4 times on BERT-Large, 3.8 times on GPT-2, 3.5 times on GPT-2 XL, and 1.9 times on GPT-Neo. Multi-GPU inference speedups: 6.2 times for Turing-NLG and 3.7 times for a 175-billion-parameter language model.
Microsoft Research Blog

DeepSpeed: Accelerating large-scale model inference and training via system optimizations and compression 

May 24, 2021 | DeepSpeed Team, Rangan Majumder, and Andrey Proskurin

Last month, the DeepSpeed Team announced ZeRO-Infinity, a step forward in training models with tens of trillions of parameters. In addition to creating optimizations for scale, our team strives to introduce features that also improve speed, cost, and usability. As…

Microsoft Research Blog

The science behind semantic search: How AI from Bing is powering Azure Cognitive Search 

March 2, 2021 | Rangan Majumder, Alec Berntson, Daxin Jiang (姜大昕), Jianfeng Gao, Furu Wei, and Nan Duan

Azure Cognitive Search is a cloud search service that gives developers APIs and tools to build rich search experiences over private, heterogeneous content in web, mobile, and enterprise applications. It has multiple components, including an API for indexing and querying, seamless integration through Azure data ingestion, deep…

In the news | WindowsClub

Bing Search explains how it fixes Bad Spelling in 100 Languages 

February 11, 2021

One fantastic feature in search engines is that they can understand when you misspell a word and correct it. This seemingly simple feature saves a lot of time in an average internet user’s life, but we haven’t quite known how…

In the news | OnMSFT

Microsoft announces Speller100, a new AI-powered tool that checks spelling in 100+ languages 

February 10, 2021

Microsoft has launched a new language system that should help to improve the search experience in Bing. The tool is called Speller100, and it leverages several AI models to correct spelling in over 100 languages to make the search engine…

In the news | ZDNet

Microsoft: Here’s how we fix bad spelling in 100 languages to get you the right search results 

February 9, 2021

Microsoft has explained how it is using a variety of technologies and techniques to fix bad spellings that can mean queries addressed to its Bing search engine would otherwise deliver the wrong results. The software giant is getting back to…

In the news | NextWeb

Microsoft says it’s developed ‘the most comprehensive spelling correction system ever made’ 

February 9, 2021

Microsoft has unveiled an AI system called Speller100 that corrects spelling in over 100 languages used in search queries on Bing. “We believe Speller100 is the most comprehensive spelling correction system ever made in terms of language coverage and accuracy,”…

Microsoft Research Blog

Speller100: Zero-shot spelling correction at scale for 100-plus languages 

February 8, 2021 | Jingwen Lu, Jidong Long (龙继东), and Rangan Majumder

At Microsoft Bing, our mission is to delight users everywhere with the best search experience. We serve a diverse set of customers all over the planet who issue queries in over 100 languages. In search we’ve found about 15% of…
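The Speller100 post and news items above describe zero-shot spelling correction across 100-plus languages. A common way to train such a corrector without hand-labeled data is to corrupt clean text with synthetic character-level edits and learn to undo them; the sketch below shows one such noising function. The operation mix and probabilities are illustrative assumptions, not Speller100's published recipe.

```python
# Hedged illustration: synthesize (noisy, clean) training pairs for a
# spelling corrector by applying random character-level edits to clean text.
# Operation choices and probabilities are illustrative assumptions.
import random
import string

def corrupt(word, p=0.15):
    """Apply at most one random character-level edit with probability p."""
    if len(word) < 2 or random.random() > p:
        return word
    i = random.randrange(len(word))
    op = random.choice(["delete", "insert", "substitute", "transpose"])
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + random.choice(string.ascii_lowercase) + word[i:]
    if op == "substitute":
        return word[:i] + random.choice(string.ascii_lowercase) + word[i + 1:]
    # transpose: swap neighbouring characters
    j = min(i + 1, len(word) - 1)
    chars = list(word)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

clean = "microsoft windows update settings"
noisy = " ".join(corrupt(w) for w in clean.split())
print(noisy, "->", clean)   # (noisy, clean) pairs become training examples
```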
