Deep Multiagent Reinforcement Learning for Partially Observable Parameterized Environments


June 16, 2016


Matthew Hausknecht


UT Austin


As software and hardware agents begin to perform tasks of genuine interest, they will be faced with environments too complex for humans to predetermine the correct actions to take. Three characteristics shared by many complex domains are 1) high-dimensional state and action spaces, 2) partial observability, and 3) multiple learning agents. To tackle such problems, I will describe algorithms that combine deep neural network function approximation with reinforcement learning. First, I will describe using recurrent neural networks to handle partial observability in Atari games. Next, I will describe Half-Field-Offense (HFO), a multiagent soccer domain, and approaches for learning effective policies in its parameterized, continuous action space. I will conclude with ongoing work on multiagent learning in HFO.