Microsoft Research Blog

Artificial intelligence

  1. Deep learning similarities from different representations of source code 

    May 27, 2018

    Assessing the similarity between code components plays a pivotal role in a number of Software Engineering (SE) tasks, such as clone detection, impact analysis, and refactoring. Code similarity is generally measured by relying on manually defined or hand-crafted features, e.g., by analyzing the overlap among…

  2. Discovering Canonical Indian English Accents: A Crowdsourcing-based Approach 

    April 24, 2018

    Automatic Speech Recognition (ASR) systems typically degrade in performance when recognizing an accent different from the accents in the training data. One way to overcome this problem without training new models for every accent is adaptation. India has over a hundred major languages, which leads…

  3. Neural Sequential Malware Detection with Parameters 

    April 14, 2018 | Rakshit Agrawal, Jack W. Stokes, Mady Marinescu, and Karthik Selvaraj

    Sequential models which analyze system API calls have shown promise for detecting unknown malware. Athiwaratkun and Stokes recently proposed a two-stage model which uses a long short-term memory (LSTM) model for learning a set of features which are then input to a second classifier. Kolosnjaji…

  4. Learning Latent Semantic Annotations for Grounding Natural Language to Structured Data 

    January 1, 2018

    Previous work on grounded language learning did not fully capture the semantics underlying the correspondences between structured world state representations and texts, especially those between numerical values and lexical terms. In this paper, we attempt to learn explicit latent semantic annotations from paired structured tables…

  5. Sylvester Normalizing Flows for Variational Inference 

    January 1, 2018 | Rianne van den Berg, Leonard Hasenclever, Jakub M. Tomczak, and Max Welling

    Variational inference relies on flexible approximate posterior distributions. Normalizing flows provide a general recipe to construct flexible variational posteriors. We introduce Sylvester normalizing flows, which can be seen as a generalization of planar flows. Sylvester normalizing flows remove the well-known single-unit bottleneck from planar flows,…

  6. Mention and Entity Description Co-Attention for Entity Disambiguation 

    January 1, 2018

    For the task of entity disambiguation, mention contexts and entity descriptions both contain various kinds of information, but only a subset of it is helpful for disambiguation. In this paper, we propose a type-aware co-attention model for entity disambiguation, which tries to identify the…

  7. Analyzing the Training Processes of Deep Generative Models 

    December 31, 2017

    Among the many types of deep models, deep generative models (DGMs) provide a solution to the important problem of unsupervised and semi-supervised learning. However, training DGMs requires more skill, experience, and know-how because their training is more complex than other types of deep models such…

  8. A Statistical Framework for Product Description Generation 

    November 1, 2017

    In this paper, we present a statistical framework that generates accurate and fluent product descriptions from product attributes. Specifically, after extracting templates and learning writing knowledge from attribute-description parallel data, we use the learned knowledge to decide what to say and how to say it for…