Microsoft Research Blog

Artificial intelligence

  1. RANP: Resource Aware Neuron Pruning at Initialization for 3D CNNs 

    October 5, 2020 | Zhiwei Xu, Thalaiyasingam Ajanthan, Vibhav Vineet, and Richard I. Hartley

    Although 3D Convolutional Neural Networks (CNNs) are essential for most learning based applications involving dense 3D data, their applicability is limited due to excessive memory and computational requirements. Compressing such networks by pruning therefore becomes highly desirable. However, pruning 3D CNNs is largely unexplored possibly…

  2. MetaPhys: Unsupervised Few-Shot Adaptation for Non-Contact Physiological Measurement

    October 4, 2020

    There are large individual differences in physiological processes, making designing personalized health sensing algorithms challenging. Existing machine learning systems struggle to generalize well to unseen subjects or contexts, especially in video-based physiological measurement. Although fine-tuning for a user might address this issue, it is difficult…

  3. TaxiNLI: Taking a Ride up the NLU Hill 

    September 30, 2020 | Pratik Joshi, Somak Aditya, Aalok Sathe, and Monojit Choudhury

    Pre-trained Transformer-based neural architectures have consistently achieved state-of-the-art performance in the Natural Language Inference (NLI) task. Since NLI examples encompass a variety of linguistic, logical, and reasoning phenomena, it remains unclear as to which specific concepts are learnt by the trained systems and where they…

  4. Weakly Supervised Semantic Segmentation in the 2020 IEEE GRSS Data Fusion Contest 

    September 25, 2020

    We propose an iterative clustering-based label super-resolution approach and epitome-based approach to weakly supervised semantic segmentation, as well as a deep learning-based postprocessing step for land cover segmentation. An ensemble of the iterative clustering and epitome approaches with the proposed postprocessing step results in a…

  5. An Empirical Study on Neural Keyphrase Generation 

    September 21, 2020

    Recent years have seen a flourishing of neural keyphrase generation works, including the release of several large-scale datasets and a host of new models to tackle them. Model performance on keyphrase generation tasks has increased significantly with evolving deep learning research. However, there lacks a…

  6. Improving Handwritten OCR with Augmented Text Line Images Synthesized from Online Handwriting Samples by Style-Conditioned GAN 

    September 1, 2020 | Mingyang Guan, Haisong Ding, Kai Chen, and Qiang Huo

    By leveraging large amounts of training data and deep learning technologies, performances of modern handwritten optical character recognition (OCR) systems have been greatly improved. However, collecting and labeling massive handwriting images are both time-consuming and expensive. In this paper, we propose to augment handwritten OCR…

  7. Grasp-type Recognition Leveraging Object Affordance 

    August 25, 2020 | Naoki Wake, Kazuhiro Sasabuchi, and Katsushi Ikeuchi

    A key challenge in robot teaching is grasp-type recognition with a single RGB image and a target object name. Here, we propose a simple yet effective pipeline to enhance learning-based recognition by leveraging a prior distribution of grasp types for each object. In the pipeline,…

  8. How machine learning can help select capping layers to suppress perovskite degradation 

    August 19, 2020

    Environmental stability of perovskite solar cells (PSCs) has been improved by trial-and-error exploration of thin low-dimensional (LD) perovskite deposited on top of the perovskite absorber, called the capping layer. In this study, a machine-learning framework is presented to optimize this layer. We featurize 21 organic halide…

  9. REFORM: Recognizing F-formations for Social Robots 

    August 16, 2020 | Hooman Hedayati, Annika Muehlbradt, Daniel J. Szafir, and Sean Andrist

    Recognizing and understanding conversational groups, or F-formations, is a critical task for situated agents designed to interact with humans. F-formations contain complex structures and dynamics, yet are used intuitively by people in everyday face-to-face conversations. Prior research exploring ways of identifying F-formations has largely relied…

  10. High Resolution Zero-Shot Domain Adaptation of Synthetically Rendered Face Images 

    August 9, 2020 | Stephan J. Garbin, Marek Kowalski, Matthew Johnson, and Jamie Shotton

    Generating photorealistic images of human faces at scale remains a prohibitively difficult task using computer graphics approaches. This is because these require the simulation of light to be photorealistic, which in turn requires physically accurate modelling of geometry, materials, and light sources, for both the…

  11. A Spectral Energy Distance for Parallel Speech Synthesis 

    August 3, 2020

    Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems. A downside of such autoregressive models is that they require executing tens of thousands of sequential…