Microsoft Research Blog

Artificial intelligence

  1. Unsupervised Context Rewriting for Open Domain Conversation 

    November 1, 2019

Context modeling plays a pivotal role in open domain conversation. Existing works either use heuristic methods or jointly learn context modeling and response generation with an encoder-decoder framework. This paper proposes an explicit context rewriting method, which rewrites the last utterance by considering context history.…

  2. Hierarchical Attention Prototypical Networks for Few-Shot Text Classification 

    November 1, 2019 | Shengli Sun, Kevin Zhou, and Tengchao Lv

Most of the current effective methods for text classification rely on large-scale labeled data and a great number of parameters, but when supervised training data are scarce and difficult to collect, these models are not applicable. In this work, we propose…
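For context, the prototypical-network baseline that few-shot text classifiers like this one build on can be sketched in a few lines: embed the handful of labeled support examples, average them into one prototype per class, and label each query by its nearest prototype. The embeddings and names below are illustrative, not taken from the paper.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Compute the mean embedding (prototype) for each class in the support set."""
    classes = np.unique(support_labels)
    protos = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_emb, classes, protos):
    """Assign each query embedding to the class of its nearest prototype (Euclidean)."""
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

In a real few-shot setup the embeddings would come from a learned encoder; the paper's contribution is a hierarchical attention mechanism over those representations, which this sketch omits.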

  4. Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer 

    November 1, 2019

Formality text style transfer plays an important role in various NLP applications, such as non-native speaker assistants and child education. Early studies normalized informal sentences with rules, before statistical and neural models became the prevailing methods in the field. While a rule-based system is still…

  5. Explicit Cross-lingual Pre-training for Unsupervised Machine Translation 

    November 1, 2019

    Pre-training has proven to be effective in unsupervised machine translation due to its ability to model deep context information in cross-lingual scenarios. However, the cross-lingual information obtained from shared BPE spaces is inexplicit and limited. In this paper, we propose a novel cross-lingual pre-training method…

  6. Deep Scalable Image Compression via Hierarchical Feature Decorrelation 

    November 1, 2019 | Zongyu Guo, Zhizheng Zhang, and Zhibo Chen

Scalable image compression allows reconstructing complete images through partial decoding. It plays an important role in image transmission and storage. In this paper, we study the problem of feature decorrelation for Deep Neural Network (DNN) based image codecs. Inspired by the self-attention mechanism [1], we design…

  7. Detection of Prevalent Malware Families with Deep Learning 

    October 30, 2019 | Jack W. Stokes, Christian Seifert, Jerry Li, and Nizar Hejazi

    Attackers evolve their malware over time in order to evade detection, and the rate of change varies from family to family depending on the amount of resources these groups devote to their “product”. This rapid change forces anti-malware companies to also direct much human and…

  8. Discourse-Aware Neural Extractive Model for Text Summarization 

    October 29, 2019 | Jiacheng Xu, Zhe Gan, Yu Cheng, and Jingjing Liu

    Recently BERT has been adopted for document encoding in state-of-the-art text summarization models. However, sentence-based extractive models often result in redundant or uninformative phrases in the extracted summaries. Also, long-range dependencies throughout a document are not well captured by BERT, which is pre-trained on sentence…

  9. The adverse effects of code duplication in machine learning models of code 

    October 23, 2019 | Miltos Allamanis

    The field of big code relies on mining large corpora of code to perform some learning task towards creating better tools for software engineers. A significant threat to this approach was recently identified by Lopes et al. (2017) who found a large amount of near-duplicate…
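Near-duplication in code corpora is typically measured with token-set similarity. A minimal sketch of one common check, Jaccard similarity over token sets, is below; the 0.6 threshold is illustrative, not the one used by Lopes et al. or this paper.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

def near_duplicates(files, threshold=0.6):
    """Return index pairs of files whose token sets overlap above the threshold."""
    toks = [f.split() for f in files]
    pairs = []
    for i in range(len(toks)):
        for j in range(i + 1, len(toks)):
            if jaccard(toks[i], toks[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Deduplicating at the file level with a check like this, before splitting into train and test sets, is what prevents the evaluation leakage the post describes.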