Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models
Publication Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei Interspeech | September 2022
Publication Separating Long-Form Speech with Group-Wise Permutation Invariant Training Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei Interspeech 2022 | September 2022
Publication Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei Interspeech | September 2022
Publication Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei Interspeech | September 2022
Publication Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong Interspeech 2022 | September 2022
Publication VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition Naoyuki Kanda, Jian Wu, Xiaofei Wang, Zhuo Chen, Jinyu Li, Takuya Yoshioka arXiv:2209.04974 | September 2022
Publication Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil IEEE/ACM Transactions on Audio, Speech, and Language Processing | September 2022, Vol 30: pp. 3089-3097
Publication Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation Xiaofei Wang, Dongmei Wang, Naoyuki Kanda, Sefik Emre Eskimez, Takuya Yoshioka Interspeech 2022 | September 2022
Publication What is it like to program with artificial intelligence? Advait Sarkar, Andy Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn Proceedings of the 33rd Annual Conference of the Psychology of Programming Interest Group (PPIG 2022) | September 2022
Publication Adapting Task-Oriented Dialogue Models for Email Conversations Soham Deshmukh, Charles Lee arXiv preprint arXiv:2208.09439 | August 2022