From robots and cars to virtual assistants and voice-controlled drones, computing devices are increasingly expected to communicate naturally with people and to understand the visual context in which they operate. In this talk, I will present our latest work on generating and comprehending visually grounded language. First, we will discuss the challenging task of describing an image […]
Peter Anderson is a Research Scientist in the School of Interactive Computing at Georgia Tech. He completed his PhD in Computer Science at the Australian National University in 2018, where he was advised by Prof. Stephen Gould. His research is in computer vision, focusing on the intersection of vision and language, including tasks such as image captioning, visual question answering (VQA), and vision-and-language navigation (VLN). He was a member of the team that won the 2017 VQA Challenge. He has published at major computer vision (CVPR, ECCV), natural language processing (ACL, EMNLP), machine learning (NeurIPS), and robotics (ICRA) conferences, and is a co-organizer of workshops and tutorials at CVPR, ECCV, NeurIPS, RSS, and ACL.