Audio and acoustics

Publication

VibeVoice: Expressive Podcast Generation with Next-Token Diffusion

Zhiliang Peng, Jianwei Yu, Wenhui Wang, Yaoyao Chang, Yutao Sun, Li Dong, Yi Zhu, Weijiang Xu, Hangbo Bao, Zehua Wang, Shaohan Huang, Yan Xia, Furu Wei

ICLR 2026 | February 2026

Publication

EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning

Dingdong Wang, Shujie Liu, Tianhua Zhang, Youjun Chen, Jinyu Li, Helen M. Meng

ICLR 2026 | January 2026

Publication

SALAD-VAE: Semantic Audio Compression with Language-Audio Distillation

Sebastian Braun, Hannes Gamper, Dimitra Emmanouilidou

2026 International Conference on Acoustics, Speech, and Signal Processing | January 2026

Publication

Towards Real-Time Generative Speech Restoration with Flow-Matching

Tsun-An Hsieh, Sebastian Braun

2026 International Conference on Acoustics, Speech, and Signal Processing | January 2026

Project

Publication

Sci-Phi: A Large Language Model Spatial Audio Descriptor

Xilin Jiang, Sebastian Braun, Hannes Gamper

IEEE Open Journal of Signal Processing | January 2026

Project

Position

Research Intern – Interactive Multimodal Futures Group (Situated & Affective Computing)

Posted: December 2, 2025

Locations: Cambridge, MA, US; Redmond, WA, US

Research areas: Artificial intelligence, Audio and acoustics, Computer vision, Data platforms and analytics, Graphics and multimedia, Human-computer interaction

The Interactive Multimodal Futures …

Video

Spatial Audio Rendering for Speech Live Translation

November 24, 2025 | Margarita Geleta

Language barriers in virtual meetin…

01:04:38

Publication

Train Short, Infer Long: Speech-LLM Enables Zero-Shot Streamable Joint ASR and Diarization on Long Audio

Mohan Shi, Xiong Xiao, Ruchao Fan, Shaoshi Ling, Jinyu Li

November 2025

Publication

RiTTA: Modeling Event Relations in Text-to-Audio Generation

Yuhang He, Yash Jain, Xubo Liu, Andrew Markham, Vibhav Vineet

2025 Empirical Methods in Natural Language Processing | November 2025

Video

Distant Conversational Speech Recognition: Challenges and Opportunities

October 17, 2025 | Dr. Samuele Cornell, Sunit Sivasankaran

State-of-the-art ASR systems excel …

01:28:41