This week, Microsoft Research threw down the gauntlet with the launch of a competition challenging researchers around the world to develop AI agents that can solve text-based games. Conceived by the Machine Reading Comprehension team at Microsoft Research Montreal, the competition—First TextWorld Problems: A Reinforcement and Language Learning Challenge—runs from December 8, 2018 through May 31, 2019.
First TextWorld Problems is built on the TextWorld framework. TextWorld was released to the public in July 2018 at aka.ms/textworld. TextWorld is an extensible, sandbox learning environment for reinforcement learning in text-based games. Beyond game simulation, it has the capacity to generate games stochastically from a user-specified distribution. Such a distribution of games opens new possibilities for the study of generalization and continual or meta-learning in a reinforcement learning setting, by enabling researchers to train and test agents on distinct but related games. TextWorld’s generator gives fine control over game parameters like the size of the game world, the branching factor and length of quests, the density of rewards, and the stochasticity of transitions. Game vocabulary can also be controlled; this directly affects the action and observation spaces. Researchers can also use TextWorld to handcraft games that test for specific knowledge and skills.
The theme for First TextWorld Problems is gathering ingredients to cook a recipe. Agents must determine the necessary ingredients from a recipe book, explore the house to gather ingredients, and return to the kitchen to cook up a delicious meal. Additionally, agents will need to use tools like knives and frying pans. Locked doors and other obstacles along the way must be overcome. The necessary ingredients and their locations change from game to game, as does the layout of the house itself; agents cannot simply memorize a procedure in order to succeed.
While a simple cooking task may seem quotidian by human standards, it is still very difficult for AI. Observations and actions are all text-based (see the example below), so a successful agent must learn to understand and manipulate its environment through language, as well as to ground its language in the environmental dynamics. It must also deal with classic, open reinforcement learning problems like partial observability and sparse rewards.
We hope this competition fosters research into generalization across tasks, meta-learning, zero-shot language understanding, common-sense reasoning, efficient exploration, and effective handling of combinatorial action spaces. The winning team will be awarded a prize of $2000 USD, plus an exclusive one-hour discussion session with a Microsoft Research researcher, as well as being featured in a Microsoft Research blog post and in an accompanying article in the Microsoft Research Newsletter (some restrictions apply, please check competition rules and regulations for details.)
Did we pique your interest? We encourage everyone to put their reinforcement learning prowess—and culinary talents—to the test in First TextWorld Problems. Go to aka.ms/textworld-challenge and sign up today!