Microsoft Research Blog

Artificial intelligence

  1. Unsupervised Context Rewriting for Open Domain Conversation 

    November 1, 2019

Context modeling plays a pivotal role in open domain conversation. Existing works either use heuristic methods or jointly learn context modeling and response generation with an encoder-decoder framework. This paper proposes an explicit context rewriting method, which rewrites the last utterance by considering context history.…

  2. Hierarchical Attention Prototypical Networks for Few-Shot Text Classification 

    November 1, 2019 | Shengli Sun, Kevin Zhou, and Tengchao Lv

Most of the current effective methods for text classification rely on large-scale labeled data and a great number of parameters, but when supervised training data are scarce and difficult to collect, these models are not applicable. In this work, we propose…
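For context, the prototypical-network baseline that few-shot text classifiers like this one build on can be sketched in a few lines: embed the handful of labeled support examples, average them into one prototype per class, and label each query by its nearest prototype. The embeddings and names below are illustrative, not taken from the paper.

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Compute the mean embedding (prototype) for each class in the support set."""
    classes = np.unique(support_labels)
    protos = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_emb, classes, protos):
    """Assign each query embedding to the class of its nearest prototype (Euclidean)."""
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```

In a real few-shot setup the embeddings would come from a learned encoder; the paper's contribution is a hierarchical attention mechanism over those representations, which this sketch omits.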

  4. Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer 

    November 1, 2019

Formality text style transfer plays an important role in various NLP applications, such as non-native speaker assistants and child education. Early studies normalized informal sentences with rules, before statistical and neural models became the prevailing methods in the field. While a rule-based system is still…

  5. Explicit Cross-lingual Pre-training for Unsupervised Machine Translation 

    November 1, 2019

    Pre-training has proven to be effective in unsupervised machine translation due to its ability to model deep context information in cross-lingual scenarios. However, the cross-lingual information obtained from shared BPE spaces is inexplicit and limited. In this paper, we propose a novel cross-lingual pre-training method…

  6. Deep Scalable Image Compression via Hierarchical Feature Decorrelation 

    November 1, 2019 | Zongyu Guo, Zhizheng Zhang, and Zhibo Chen

Scalable image compression allows reconstructing complete images through partial decoding. It plays an important role in image transmission and storage. In this paper, we study the problem of feature decorrelation for Deep Neural Network (DNN) based image codecs. Inspired by the self-attention mechanism [1], we design…

  7. Detection of Prevalent Malware Families with Deep Learning 

    October 30, 2019 | Jack W. Stokes, Christian Seifert, Jerry Li, and Nizar Hejazi

    Attackers evolve their malware over time in order to evade detection, and the rate of change varies from family to family depending on the amount of resources these groups devote to their “product”. This rapid change forces anti-malware companies to also direct much human and…

  8. Discourse-Aware Neural Extractive Model for Text Summarization 

    October 29, 2019 | Jiacheng Xu, Zhe Gan, Yu Cheng, and Jingjing Liu

    Recently BERT has been adopted for document encoding in state-of-the-art text summarization models. However, sentence-based extractive models often result in redundant or uninformative phrases in the extracted summaries. Also, long-range dependencies throughout a document are not well captured by BERT, which is pre-trained on sentence…

  9. The adverse effects of code duplication in machine learning models of code 

    October 23, 2019 | Miltos Allamanis

    The field of big code relies on mining large corpora of code to perform some learning task towards creating better tools for software engineers. A significant threat to this approach was recently identified by Lopes et al. (2017) who found a large amount of near-duplicate…
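Near-duplication in code corpora is typically measured with token-set similarity. A minimal sketch of one common check, Jaccard similarity over token sets, is below; the 0.6 threshold is illustrative, not the one used by Lopes et al. or this paper.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

def near_duplicates(files, threshold=0.6):
    """Return index pairs of files whose token sets overlap above the threshold."""
    toks = [f.split() for f in files]
    pairs = []
    for i in range(len(toks)):
        for j in range(i + 1, len(toks)):
            if jaccard(toks[i], toks[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Deduplicating at the file level with a check like this, before splitting into train and test sets, is what prevents the evaluation leakage the post describes.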