Insights into the Challenges and Opportunities of Large Multi-Modal Models for Blind and Low Vision Users: CLIP
PARIKSHA: A Scalable, Democratic, Transparent Evaluation Platform for Assessing Indic Large Language Models
Publication LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei ACM Multimedia 2022 | October 2022
Publication SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Ziqiang Zhang, Long Zhou, Junyi Ao, Shujie Liu, Lirong Dai, Jinyu Li, Furu Wei EMNLP 2022 | October 2022
Publication CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang (姜大昕), Weizhu Chen, Nan Duan EMNLP 2022 | October 2022
Publication Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization Zonghan Yang, Xiaoyuan Yi, Peng Li, Yang Liu, Xing Xie October 2022 Preprint
Publication Large-Scale Streaming End-to-End Speech Translation with Neural Transducers Jian Xue, Peidong Wang, Jinyu Li, Matt Post, Yashesh Gaur Interspeech 2022 | September 2022
Publication Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka Interspeech 2022 | September 2022
Publication Streaming Multi-Talker ASR with Token-Level Serialized Output Training Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka Interspeech 2022 | September 2022
Publication Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei Interspeech 2022 | September 2022
Publication Separating Long-Form Speech with Group-Wise Permutation Invariant Training Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei Interspeech 2022 | September 2022
Publication Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong Interspeech 2022 | September 2022