Microsoft Research Blog

Artificial intelligence

FedFR: Joint Optimization Federated Framework for Generic and Personalized Face Recognition

December 23, 2021 | Chih-Ting Liu, Chien-Yi Wang, Shao-Yi Chien, and Shang-Hong Lai

Current state-of-the-art deep learning-based face recognition (FR) models require a large number of face identities for central training. However, due to growing privacy awareness, accessing the face images on user devices to continually improve face recognition models is prohibited. Federated Learning (FL)…
ValueNet: A New Dataset for Human Value Driven Dialogue System

December 12, 2021

Building a socially intelligent agent involves many challenges, one of which is teaching the agent to speak as a human does, guided by its values. However, value-driven chatbots are still understudied in the area of dialogue systems. Most existing datasets focus on commonsense reasoning or…
RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling

December 10, 2021

Recent advances in large-scale pre-training such as GPT-3 allow seemingly high-quality text to be generated from a given prompt. However, such generation systems often suffer from problems of hallucinated facts and are not inherently designed to incorporate useful external information. Grounded generation models appear to…
From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension

December 9, 2021

Cross-lingual Machine Reading Comprehension (xMRC) is challenging due to the lack of training data in low-resource languages. Recent approaches use training data only in a resource-rich language like English to fine-tune large-scale cross-lingual pre-trained language models. Due to the large differences between languages, a…
Counter-Strike Deathmatch with Large-Scale Behavioural Cloning

December 9, 2021 | Tim Pearce and Jun Zhu

This paper describes an AI agent that plays the popular first-person-shooter (FPS) video game 'Counter-Strike: Global Offensive' (CSGO) from pixel input. The agent, a deep neural network, matches the performance of the medium-difficulty built-in AI on the deathmatch game mode, whilst adopting a humanlike…
Contrastive Learning of Global-Local Video Representations

December 6, 2021 | Shuang Ma, Zhaoyang Zeng, Daniel McDuff, and Yale Song

Contrastive learning has delivered impressive results for various tasks in the self-supervised regime. However, existing approaches optimize for learning representations specific to downstream scenarios, i.e., global representations suitable for tasks such as classification or local representations for tasks such as detection and localization. While they…
Aligning Pretraining for Detection via Object-Level Contrastive Learning

December 1, 2021

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a…
BEVT: BERT Pretraining of Video Transformers

December 1, 2021

This paper studies the BERT pretraining of video transformers. It is a straightforward but worthwhile extension given the recent success of BERT pretraining of image transformers. We introduce BEVT, which decouples video representation learning into spatial representation learning and temporal dynamics learning. In particular, BEVT…
Checklist for Evaluation of Image-Based Artificial Intelligence Reports in Dermatology: CLEAR Derm Consensus Guidelines From the International Skin Imaging Collaboration Artificial Intelligence Working Group

November 30, 2021

Importance: The use of artificial intelligence (AI) is accelerating in all aspects of medicine and has the potential to transform clinical care and dermatology workflows. However, to develop image-based algorithms for dermatology applications, comprehensive criteria establishing development and performance evaluation standards are required to ensure…
Effect of noise suppression losses on speech distortion and ASR performance

November 22, 2021 | Sebastian Braun and Hannes Gamper

Deep learning-based speech enhancement has progressed rapidly toward improving quality, while models are becoming more compact and usable for real-time on-the-edge inference. However, speech quality scales directly with model size, and small models are often still unable to achieve sufficient quality.…
Document AI: Benchmarks, Models and Applications

November 17, 2021 | Lei Cui, Yiheng Xu, Tengchao Lv, and Furu Wei

Document AI, or Document Intelligence, is a relatively new research topic that refers to the techniques for automatically reading, understanding, and analyzing business documents. It is an important research direction for natural language processing and computer vision. In recent years, the popularity of deep learning…
Deep Risk Model: A Deep Learning Solution for Mining Latent Risk Factors to Improve Covariance Matrix Estimation

November 3, 2021 | Hengxu Lin, Dong Zhou, Weiqing Liu, and Jiang Bian

Modeling and managing portfolio risk is perhaps the most important step in growing and preserving investment performance. Within the modern portfolio construction framework built on Markowitz's theory, the covariance matrix of stock returns is a required input to calculate portfolio risk. Traditional approaches…
