June 3, 2026

Computer Vision in the Wild at CVPR 2026

Towards Unified Multimodal Agents for Reasoning in the Wild

Mountain Daylight Time (UTC-6)

Location: Denver, Colorado, USA

Agenda (tentative)

Time	Description
13:00-13:30	Invited talk: Spatial Intelligence and Embodied AI (Manling Li, Northwestern University)
13:30-14:00	Invited talk: Robotic perception, planning, and reasoning (Chelsea Finn, Stanford University & Physical Intelligence)
14:00-14:30	Workshop paper presentations
14:30-15:00	Afternoon break + poster session
15:00-15:30	Invited talk: Test-time scaling and reinforcement learning (Xiaolong Wang, UCSD & NVIDIA)
15:30-16:00	Invited talk: Multimodal reasoning and reward-driven video understanding (Mohit Bansal, University of North Carolina at Chapel Hill)
16:00-16:30	Invited talk: SAM 3 promptable concept segmentation (Kate Saenko, Meta AGI Foundations)
16:30-17:30	Panel discussion and closing remarks (Moderators: Zhengyuan Yang, Jianfeng Gao)