Research Intern – Interactive Multimodal Futures Group (Situated & Affective Computing)
The Interactive Multimodal Futures (IMF) group at Microsoft Research seeks a PhD-level Research Intern to work on a project at the intersection of situated interaction, affective computing, and human-centered AI systems. The project will include…
Spatial Audio Rendering for Speech Live Translation
Language barriers in virtual meetings remain a persistent challenge to global collaboration. While real-time translation technologies offer a promising solution, their integration into conversational interfaces often neglects key perceptual cues. This study explores how spatial…
Distant conversational speech recognition: Challenges and Opportunities
State-of-the-art ASR systems excel on close-talk benchmarks but struggle with far-field conversational speech, where error rates remain above 20%. Current benchmark datasets inadequately assess generalization across domains and real-world conditions, often relying on oracle segmentation…