Microsoft Research Blog

Artificial intelligence

  1. A Causal View on Robustness of Neural Networks 

    December 3, 2020 | Cheng Zhang, Kun Zhang, and Yingzhen Li

    We present a causal view on the robustness of neural networks against input manipulations, which applies not only to traditional classification tasks but also to general measurement data. Based on this view, we design a deep causal manipulation augmented model (deep CAMA) which explicitly models…

  2. Deep Evidential Regression 

    December 2, 2020 | Alexander Amini, Wilko Schwarting, Ava Soleimany, and Daniela Rus

    Deterministic neural networks (NNs) are increasingly being deployed in safety critical domains, where calibrated, robust, and efficient measures of uncertainty are crucial. In this paper, we propose a novel method for training non-Bayesian NNs to estimate a continuous target as well as its associated evidence…

  3. On Warm-Starting Neural Network Training 

    December 1, 2020 | Jordan Ash and Ryan P. Adams

    In many real-world deployments of machine learning systems, data arrive piecemeal. These learning scenarios may be passive, where data arrive incrementally due to structural properties of the problem (e.g., daily financial data) or active, where samples are selected according to a measure of their quality…

  4. Formality Style Transfer with Shared Latent Space 

    December 1, 2020

    Conventional approaches for formality style transfer borrow models from neural machine translation, which typically requires massive parallel data for training. However, the dataset for formality style transfer is considerably smaller than translation corpora. Moreover, we observe that informal and formal sentences closely resemble each other,…

  5. Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval 

    December 1, 2020 | Bhaskar Mitra

    Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking documents, or short passages, in response…

  6. AIDE: Accelerating Image-Based Ecological Surveys with Interactive Machine Learning 

    November 30, 2020 | Benjamin Kellenberger, Devis Tuia, and Dan Morris

    Ecological surveys increasingly rely on large-scale image datasets, typically terabytes of imagery for a single survey. The ability to collect this volume of data allows surveys of unprecedented scale, at the cost of expansive volumes of photo-interpretation labour. We present Annotation Interface for Data-driven Ecology…

  7. GLGE: A New General Language Generation Evaluation Benchmark 

    November 23, 2020

    Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress in pretraining and transfer learning in Natural Language Processing (NLP). These benchmarks mostly focus on a range of Natural Language Understanding (NLU) tasks, without considering Natural Language Generation (NLG) models. In this paper,…

  8. Assessment of the Relative Importance of different hyper-parameters of LSTM for an IDS 

    November 15, 2020 | Mohit Sewak, Sanjay K. Sahay, and Hemant Rathore

    Recurrent deep learning language models like the LSTM are often used to provide advanced cyber-defense for high-value assets. The underlying assumption for using LSTM networks for malware detection is that the op-code sequence of a malware sample could be treated as a (spoken) language representation. There are…

  9. Efficient Per-Example Gradient Computations in Convolutional Neural Networks 

    November 13, 2020 | Gaspar Rochette, Andre Manoel, and Eric W. Tramel

    Deep learning frameworks leverage GPUs to efficiently perform massively parallel computations over batches of many training examples. However, for certain tasks, one may be interested in performing per-example computations, for instance using per-example gradients to evaluate a quantity of interest unique to each example. One notable…

  10. A partition-based similarity for classification distributions 

    November 11, 2020

    Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular, we propose a novel similarity on classification distributions, dubbed task similarity, that quantifies…
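The evidential approach in item 2 (Amini et al.) trains a network head to output the parameters of a Normal-Inverse-Gamma distribution over the regression target, from which the prediction and both kinds of uncertainty follow in closed form. A minimal sketch with made-up parameter values, not the authors' implementation:

```python
# Illustrative Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta),
# as an evidential network head might emit for one input; values are made up.
gamma, nu, alpha, beta = 2.0, 1.5, 3.0, 4.0

pred_mean = gamma                          # E[mu]: the point prediction
aleatoric = beta / (alpha - 1.0)           # E[sigma^2]: inherent data noise
epistemic = beta / (nu * (alpha - 1.0))    # Var[mu]: uncertainty in the mean
```

A single forward pass thus yields calibrated uncertainty without sampling, which is what makes the method attractive in the safety-critical settings the abstract mentions.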
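The batch-versus-per-example distinction in item 9 is easy to see on a toy model. The sketch below uses plain NumPy and a logistic-regression loss as a stand-in (not the authors' CNN setting): the batch gradient that frameworks return is the mean of the per-example gradients, which standard backpropagation never materializes individually.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                    # batch of 8 examples, 3 features
y = rng.integers(0, 2, size=8).astype(float)   # binary labels
w = rng.normal(size=3)                         # model weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(X @ w)                             # predicted probabilities, shape (8,)

# Per-example gradients of the logistic loss w.r.t. w: one row per example.
per_example_grads = (p - y)[:, None] * X       # shape (8, 3)

# Standard backprop yields only their mean, the batch gradient.
batch_grad = per_example_grads.mean(axis=0)
```

Recovering the individual rows efficiently inside a convolutional network is the harder problem the post addresses; the toy case only shows what quantity is being asked for.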