Publication AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation Ziwei Zhou, Zeyuan Lai, Rui Wang, Yifan Yang, Zhening Xing, Yuqing Yang, Qi Dai, Lili Qiu, Chong Luo April 2026
Publication Faithful GRPO: Improving Visual Spatial Reasoning in Multimodal Language Models via Constrained Policy Optimization Sai Srinivas Kancheti, Aditya Kanade, Rohit Sinha, Vineeth N Balasubramanian, Tanuja Ganu April 2026
Publication From Gaze to Guidance: Interpreting and Adapting to Users’ Cognitive Needs with Multimodal Gaze-Aware AI Assistants Valdemar Danry, Javier Hernandez, Andrew D. Wilson, Pattie Maes, Judith Amores April 2026
Publication Training-free Spatially Grounded Geometric Shape Encoding (Technical Report) Yuhan He April 2026
Publication FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching Junchao Yi, Rui Zhao, Jiahao Tang, Weixian Lei, Linjie Li, Qi Su, Zhengyuan Yang, Lijuan Wang, Xiaofeng Zhu, Alex Jinpeng Wang April 2026
Publication Bridging Natural Language and Interactive What-If Interfaces via LLM-Generated Declarative Specification Sneha Gathani, Sirui Zeng, Diya Patel, Ryan A. Rossi, Dan Marshall, Çağatay Demiralp, Steven Drucker, Zhicheng Liu April 2026
Publication Does a Global Perspective Help Prune Sparse MoEs Elegantly? Zeliang Zhang, Nikhil Ghosh, Jiani Liu, Bin Yu, Xiaodong Liu April 2026
Publication When Equality Fails as a Rewrite Principle: Provenance and Definedness for Measurement-Bearing Expressions David B. Hulak, A. Ramos, Ruy J. G. B. de Queiroz April 2026
Publication LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals Lihao Sun, Hang Dong, Bo Qiao, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan April 2026