Audio and acoustics

Audio and Acoustics Research Group

CoVoMix: Advancing Zero-shot Speech Generation for Human-like Multi-talker Conversation

What’s Your Story: Ivan Tashev

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

Real-time Speech Restoration using Data Prediction Mean Flows

Sebastian Braun

May 2026

A Comprehensive Ecosystem for Open-Domain Customized Video Generation

Jingxu Zhang, Yuqian Hong, Daneul Kim, Kai Qiu, Qi Dai, Jianmin Bao, Yifan Yang, Xiaoyan Sun, Chong Luo

ICASSP 2026 | May 2026

Virtual Speech Therapist: A Clinician-in-the-Loop AI Speech Therapy Agent for Personalized and Supervised Therapy

Shakeel A. Sheikh, Patrick Marmaroli, Md. Sahidullah, Slim Ouni, Fabrice Hirsch, Goncalo Leal, Bjorn W. Schuller

May 2026

Speech LLMs are Contextual Reasoning Transcribers

Keqi Deng, Ruchao Fan, Bo Ren, Yiming Wang, Jinyu Li

April 2026

RESPOND: Responsive Engagement Strategy for Predictive Orchestration and Dialogue

Meng-Chen Lee, Costas Panay, Javier Hernandez, Sean Andrist, Dan Bohus, Anatoly Churikov, Andrew D. Wilson

March 2026

Counting Without Numbers & Finding Without Words

B. N. Patro

March 2026

Sirens’ Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs

Zijian Ling, Pingyi Hu, Xiuyong Gao, Xiaojing Ma, Man Zhou, Jun Feng, Songfeng Lu, Dongmei Zhang, Bin Benjamin Zhu

March 2026

Aurelius: Relation Aware Text-to-Audio Generation At Scale

Yuhang He, He Liang, Yash Jain, Andrew Markham, Vibhav Vineet

ICLR | February 2026

VibeVoice: Expressive Podcast Generation with Next-Token Diffusion

Zhiliang Peng, Jianwei Yu, Wenhui Wang, Yaoyao Chang, Yutao Sun, Li Dong, Yi Zhu, Weijiang Xu, Hangbo Bao, Zehua Wang, Shaohan Huang, Yan Xia, Furu Wei

ICLR 2026 | February 2026

EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning

Dingdong Wang, Shujie Liu, Tianhua Zhang, Youjun Chen, Jinyu Li, Helen M. Meng

ICLR 2026 | January 2026
