Microsoft Research Blog

Artificial intelligence

  1. Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting 

    September 24, 2021 | Wangchunshu Zhou, Tao Ge, Ke Xu, and Furu Wei

    In this paper, we generalize text infilling (e.g., masked language models) by proposing Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective. SSR provides more fine-grained learning signals for text representations by supervising the model to rewrite imperfect spans to ground truth, and…

  2. Scalable and Efficient MoE Training for Multitask Multilingual Models 

    September 22, 2021

    Mixture of Experts (MoE) models are an emerging class of sparsely activated deep learning models whose compute costs grow sublinearly with their parameter count. In contrast with dense models, the sparse architecture of MoE offers opportunities for drastically growing model size with significant…

  3. TS2Vec: Towards Universal Representation of Time Series 

    September 18, 2021

    This paper presents TS2Vec, a universal framework for learning representations of time series at an arbitrary semantic level. Unlike existing methods, TS2Vec performs contrastive learning hierarchically over augmented context views, which enables a robust contextual representation for each timestamp. Furthermore, to obtain…

  4. Rapid Model Architecture Adaption for Meta-Learning 

    September 9, 2021

    Network Architecture Search (NAS) methods have recently gathered much attention. They design networks that perform better, and in much less search time, than traditional manual tuning. Despite their efficiency in model deployments, most NAS algorithms target a single task on a fixed hardware…

  5. Detecting Speaker Personas from Conversational Texts 

    September 4, 2021

    Personas are useful for dialogue response prediction. However, the personas used in current studies are pre-defined and hard to obtain before a conversation. To tackle this issue, we study a new task, named Speaker Persona Detection (SPD), which aims to detect speaker personas based on…

  6. Multi-objective Genetic Algorithm Based Deep Learning Model for Automated COVID-19 Detection Using Medical Image Data

    September 1, 2021 | Mukul Singh, Shrey Bansal, Rahul Kumar Dubey, and Bijaya Ketan Panigrahi

    In early 2020, the world found itself amid a significant pandemic caused by the novel coronavirus disease outbreak, commonly called COVID-19. COVID-19 is a respiratory disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). Because of its high transmission rate, it…

  7. MergeBERT: Program Merge Conflict Resolution via Neural Transformers 

    August 31, 2021

    Collaborative software development is an integral part of the modern software development life cycle, essential to the success of large-scale software projects. When multiple developers make concurrent changes around the same lines of code, a merge conflict may occur. Such conflicts stall pull requests and…

  8. How Interpretable and Trustworthy are GAMs?

    August 13, 2021

    Generalized additive models (GAMs) have become a leading model class for data bias discovery and model auditing. However, there are a variety of algorithms for training GAMs, and these do not always learn the same things. Statisticians originally used splines to train GAMs, but more…