Microsoft Research Blog

Artificial intelligence

  1. Analyzing the Nuances of Transformers’ Polynomial Simplification Abilities 

    May 7, 2021 | Vishesh Agarwal, Somak Aditya, and Navin Goyal

    Symbolic mathematical tasks such as integration often require multiple well-defined steps and an understanding of sub-tasks to reach a solution. To understand Transformers’ abilities on such tasks at a fine-grained level, we deviate from traditional end-to-end settings and explore a step-wise polynomial simplification task. Polynomials can…
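
    The step-wise setting lends itself to a concrete illustration. The snippet below is a minimal sketch of our own (not the authors’ code or data format; the factored form and the step order are assumptions) that uses sympy to generate a simplification trace one sub-step at a time, the kind of supervision a step-wise task provides.

    ```python
    # Hypothetical sketch of a step-wise polynomial simplification trace.
    # sympy stands in for whatever generator the paper actually uses.
    import sympy as sp

    x = sp.symbols("x")

    # A polynomial presented as a sum of factored terms.
    expr = 2 * (x + 1) * (x + 3) + (x + 2) ** 2

    # Step 1: expand only the first product.
    step1 = sp.expand(2 * (x + 1) * (x + 3)) + (x + 2) ** 2
    # Step 2: expand the remaining square and collect like terms.
    step2 = sp.expand(step1)

    # Each consecutive pair (expr -> step1 -> step2) becomes one example
    # for a model trained to predict a single simplification step.
    for step in (expr, step1, step2):
        print(step)
    ```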

  2. Training Structured Mechanical Models by Minimizing Discrete Euler-Lagrange Residual 

    May 4, 2021 | Kunal Menda, Jayesh K. Gupta, Zachary Manchester, and Mykel J. Kochenderfer

    Model-based paradigms for decision-making and control are becoming ubiquitous in robotics. They rely on the ability to efficiently learn a model of the system from data. Structured Mechanical Models (SMMs) are a data-efficient black-box parameterization of mechanical systems, typically fit to data by minimizing the…
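
    For context, the residual in the title comes from the discrete Euler–Lagrange equations of variational integration theory. The block below is a sketch of the standard form (the θ notation for the learned parameters is ours):

    ```latex
    % Discrete Euler--Lagrange residual (standard variational-integrator form).
    % L_d^\theta(q_k, q_{k+1}) is a learned discrete Lagrangian approximating the
    % action over one time step; D_i is the derivative w.r.t. the i-th argument.
    e_k(\theta) \;=\; D_2 L_d^{\theta}(q_{k-1}, q_k) \;+\; D_1 L_d^{\theta}(q_k, q_{k+1})
    % A true trajectory satisfies e_k = 0, so the model can be fit by minimizing
    % \sum_k \lVert e_k(\theta) \rVert^2 over observed triples (q_{k-1}, q_k, q_{k+1}).
    ```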

  3. What Makes Instance Discrimination Good for Transfer Learning 

    May 3, 2021 | Nanxuan Zhao, Zhirong Wu, Rynson W. H. Lau, and Stephen Lin

    Contrastive visual pretraining based on the instance discrimination pretext task has made significant progress. Notably, recent work has shown that unsupervised pretraining can surpass its supervised counterpart when fine-tuned for downstream applications such as object detection and segmentation. It comes as a surprise that image annotations…
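
    As background, the instance discrimination pretext task is commonly trained with an InfoNCE-style contrastive loss. The sketch below is a generic PyTorch formulation (not necessarily the exact variant analyzed in the paper): two augmented views of each image are positives, and the rest of the batch serves as negatives.

    ```python
    # Generic InfoNCE loss for instance discrimination (a common formulation;
    # hyperparameters and details may differ from the paper's setup).
    import torch
    import torch.nn.functional as F

    def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
        """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
        z1 = F.normalize(z1, dim=1)
        z2 = F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / tau                           # (N, N) scaled cosine similarities
        labels = torch.arange(z1.size(0), device=z1.device)  # positives lie on the diagonal
        return F.cross_entropy(logits, labels)
    ```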

  4. Taking Notes on the Fly Helps Language Pre-Training 

    May 3, 2021

    How to make unsupervised language pre-training more efficient and less resource-intensive is an important research direction in NLP. In this paper, we focus on improving the efficiency of language pre-training methods by providing better data utilization. It is well known that in a language corpus, words…

  5. A Unified Bayesian Framework for Discriminative and Generative Continual Learning 

    May 3, 2021 | Abhishek Kumar, Sunabha Chatterjee, and Piyush Rai

    Continual learning is a paradigm in which learning systems are trained on a sequence of tasks. The goal is to perform well on the current task without suffering a performance drop on previous tasks. Two notable directions among the recent advances in…
