Microsoft Research Blog

Artificial intelligence

  1. Estimating GPU Memory Consumption of Deep Learning Models 

    November 7, 2020

    Deep learning (DL) has been increasingly adopted by a variety of software-intensive systems. Developers mainly use GPUs to accelerate the training, testing, and deployment of DL models. However, the GPU memory consumed by a DL model is often unknown to them before a DL job…
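As a rough illustration of why this is a hard question, a back-of-envelope bound on training memory can be derived from parameter counts alone. The sketch below is my own heuristic, not the estimator the post describes; it assumes FP32 weights trained with Adam, and `batch_activations_mb` is a hypothetical placeholder for activation memory, which the real tool would have to model per-layer:

```python
def estimate_training_memory_mb(num_params, batch_activations_mb=0.0):
    """Rough lower bound on GPU memory for training with Adam.

    Per FP32 parameter we count 4 bytes for the weight, 4 for its
    gradient, and 4 + 4 for Adam's two moment buffers; activation
    memory must be added separately and usually dominates.
    """
    bytes_per_param = 4 + 4 + 4 + 4  # weight + grad + Adam m + Adam v
    param_mb = num_params * bytes_per_param / (1024 ** 2)
    return param_mb + batch_activations_mb

# Example: a 110M-parameter model already needs on the order of 1.7 GB
# for weights, gradients, and optimizer state, before any activations.
print(round(estimate_training_memory_mb(110_000_000)))
```

Even this crude bound shows why jobs that look small on disk can exhaust GPU memory once gradients and optimizer state are counted.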

  2. Identifying linked incidents in large-scale online service systems 

    November 7, 2020

In large-scale online service systems, incidents occur frequently for a variety of reasons, from software and hardware updates to changes in the operating environment. These incidents can significantly degrade system availability and customer satisfaction. Some incidents are linked because they are duplicates or inter-related.…

  3. A Simple Approach to Learning Unsupervised Multilingual Embeddings 

    November 1, 2020 | Pratik Jawanpuria, Mayank Meghwanshi, and Bamdev Mishra

    Recent progress on unsupervised cross-lingual embeddings in the bilingual setting has given the impetus to learning a shared embedding space for several languages. A popular framework to solve the latter problem is to solve the following two sub-problems jointly: 1) learning unsupervised word alignment between…

  4. Few-Shot Induction of Generalized Logical Concepts via Human Guidance 

    November 1, 2020 | Mayukh Das, Nandini Ramanan, Janardhan Rao Doppa, and Sriraam Natarajan

    We consider the problem of learning generalized first-order representations of concepts from a small number of examples. We augment an inductive logic programming learner with two novel contributions. First, we define a distance measure between candidate concept representations that improves the efficiency of search for…

  5. Routing Enforced Generative Model for Recipe Generation 

    November 1, 2020 | Zhiwei Yu, Hongyu Zang, and Xiaojun Wan

One of the most challenging parts of recipe generation is dealing with the complex restrictions among the input ingredients. Previous research simplifies the problem by treating the inputs independently and generating recipes containing as much information as possible. In this work, we propose a…

  6. Homophonic Pun Generation with Lexically Constrained Rewriting 

    November 1, 2020 | Zhiwei Yu, Hongyu Zang, and Xiaojun Wan

    Punning is a creative way to make conversation enjoyable and literary writing elegant. In this paper, we focus on the task of generating a pun sentence given a pair of homophones. We first find the constraint words supporting the semantic incongruity for a sentence. Then…

  7. Long Document Ranking with Query-Directed Sparse Transformer 

    October 31, 2020 | Jyun-Yu Jiang, Chenyan Xiong, Chia-Jung Lee, and Wei Wang

The computing cost of transformer self-attention often necessitates splitting long documents into pieces that fit in pretrained models for document ranking tasks. In this paper, we design Query-Directed Sparse attention that induces IR-axiomatic structures in transformer self-attention. Our model, QDS-Transformer, enforces the principal properties desired in ranking:…
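To make the sparsity idea concrete, here is a small sketch of such an attention mask, combining a local sliding window with global attention to query tokens. This is my own NumPy illustration of the general pattern, not the released QDS-Transformer code; `window` and `num_query_tokens` are hypothetical parameters:

```python
import numpy as np

def query_directed_sparse_mask(seq_len, num_query_tokens, window):
    """Boolean attention mask: True where attention is allowed.

    Every token attends within its local window (local context), and
    query tokens both attend to and are attended by all positions
    (query-directed matching), so the number of allowed pairs grows
    linearly with seq_len instead of quadratically.
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True          # sliding-window attention
    mask[:num_query_tokens, :] = True  # query tokens see everything
    mask[:, :num_query_tokens] = True  # everything sees query tokens
    return mask

m = query_directed_sparse_mask(seq_len=512, num_query_tokens=8, window=4)
print(m.sum())  # far fewer allowed pairs than full attention (512 * 512)
```

Masking self-attention this way keeps long-range query-to-document interactions while pruning the document-to-document pairs that dominate the quadratic cost.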

  8. Depth Completion Using a View-constrained Deep Prior 

    October 31, 2020

    Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images. This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting. We extend the…