Image/Video Transformation
Images and videos have become the language people use to communicate on the Internet. Multimedia content connects people and appeals to the young. This project aims at deep image and video transformation to generate high-quality…
OCR and Document Understanding
We have been developing SOTA technologies and industry-leading product solutions for the following scenarios: (1) Universal OCR to detect and recognize any text in an image or PDF; (2) Universal math OCR to detect and recognize any math expression…
Rich Media Communications
Modern work increasingly relies on online collaboration with real-time communications (RTC). Our research aims to provide real-time, intelligent, and immersive media experiences, with a long-term vision of advancing multimedia technologies in a manner that shapes…
Human Centric Video Understanding
This project is part of the multi-sense efforts within Microsoft's people-centric strategy. It addresses a number of vertical domains for AI by developing effective human-centric spatial understanding technologies to extract insights from…
Microsoft Research Conversations in STEM: Research in STEM as a Career
Are you curious about career options after academia? Wondering what you might want to do post-postdoc? Hoping to use your skills and knowledge to make a difference in the…
Microsoft Vision
Microsoft Vision Model ResNet-50 is a state-of-the-art ResNet-50 model pretrained with web-scale data, multi-task training, and web-supervision.