Xia Song

Partner Group Science Manager

Project Turing

A deep learning initiative inside Microsoft to build best-in-class models for use by Microsoft and to power AI applications across the entire Microsoft product family (Word, PowerPoint, Office, Dynamics,…

AI at Scale

AI at Scale is an applied research initiative that works to evolve Microsoft products with the adoption of deep learning for both natural language text and image processing.…

Figure 1. Trend of sizes of state-of-the-art NLP models over time

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

We are excited to introduce the DeepSpeed- and Megatron-powered Megatron-Turing Natural Language Generation model (MT-NLG), the largest and the most powerful monolithic transformer language model trained to date,…

Turing-NLG: A 17-billion-parameter language model by Microsoft

This figure was adapted from a similar image published in DistilBERT. Turing Natural Language Generation (T-NLG) is a 17 billion parameter language model by Microsoft that outperforms the…

About

Xia Song is a Partner Group Science Manager in Microsoft Project Turing. Project Turing is an applied research group that works to evolve Microsoft products by building and adopting deep learning technology for both text and image processing (such as Turing NLG and Turing NLR). The work is actively being integrated into multiple Microsoft products, including Bing, Office, and Xbox. Xia Song has been with Microsoft since 2010. His focus is deep learning, especially neural NLP technology in production settings.

Featured content

Microsoft Turing Universal Language Representation model, T-ULRv5, tops XTREME leaderboard and trains 100x faster

Today, we are excited to announce that with our latest Turing universal language representation model (T-ULRv5), a Microsoft-created model is once again the state of the art and at the top of the Google XTREME public leaderboard. Resulting from a…

Microsoft Turing Universal Language Representation model, T-ULRv2, tops XTREME leaderboard

Today, we are happy to announce that the Turing multilingual language model (T-ULRv2) is the state of the art at the top of the Google XTREME public leaderboard. Created by the Microsoft Turing team in collaboration with Microsoft Research, the model…

Generic Intent Representation in Web Search

This paper presents the GEneric iNtent Encoder (GEN Encoder), which learns a distributed representation space for user intent in search. Leveraging large-scale user clicks from Bing search logs as weak supervision of user intent, GEN Encoder learns to map queries…

Transformer-XH: Multi-evidence Reasoning with Extra Hop Attention

Transformers have obtained significant success modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structures beyond a linear sequence, such as trees and graphs; many tasks require reasoning with evidence…

Leading Conversational Search by Suggesting Useful Questions

This paper studies a new scenario in conversational search, conversational question suggestion, which leads search engine users to more engaging experiences by suggesting interesting, informative, and useful follow-up questions. We first establish a novel evaluation metric, usefulness, which goes beyond…

Neural Ranking Models with Multiple Document Fields

Deep neural networks have recently shown promise in the ad-hoc retrieval task. However, such models have often been based on one field of the document, for example considering document title only or document body only. Since in practice documents typically…