Learning Dynamic Belief Graphs to Generalize on Text-Based Games

Ashutosh Adhikari; Xingdi Yuan; Marc-Alexandre Côté; Mikulas Zelinka; Marc-Antoine Rondeau; Romain Laroche; Pascal Poupart; Jian Tang; Adam Trischler; William L. Hamilton

NeurIPS 2020 | October 2020

Organized by ACM

Playing text-based games requires skill in processing natural language and in planning. Although a key goal for agents solving this task is to generalize across multiple games, most previous work has either focused on solving a single game or has tackled generalization with rule-based heuristics. In this work, we investigate how structured information in the form of a knowledge graph (KG) can facilitate effective planning and generalization. We introduce a novel transformer-based sequence-to-sequence model that constructs a “belief” KG from raw text observations of the environment, dynamically updating this belief graph at every game step as it receives new observations. To train this model to build useful graph representations, we introduce and analyze a set of graph-related pre-training tasks. We demonstrate empirically that KG-based representations from our model help agents to converge faster to better policies for multiple text-based games, and further, enable stronger zero-shot performance on unseen games. Experiments on unseen games show that our best agent outperforms text-based baselines by 21.6%.
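To make the idea of a belief graph that is updated at every game step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the model described above: the graph is stored as a set of (subject, relation, object) triples, and each step applies a list of hypothetical "add" and "delete" operations. In the paper's setting such updates would come from the learned transformer-based model; here they are written by hand, and the names `BeliefGraph` and `update` are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set, Tuple

# A triple is (subject, relation, object), e.g. ("apple", "on", "table").
Triple = Tuple[str, str, str]
# An operation is (op, subject, relation, object), where op is "add" or "delete".
# This operation format is an assumption made for illustration only.
Operation = Tuple[str, str, str, str]


@dataclass
class BeliefGraph:
    """Toy belief graph: the agent's current guess about the game state."""

    triples: Set[Triple] = field(default_factory=set)

    def update(self, ops: Iterable[Operation]) -> None:
        """Apply one game step's worth of graph updates in order."""
        for op, subj, rel, obj in ops:
            if op == "add":
                self.triples.add((subj, rel, obj))
            elif op == "delete":
                # discard() ignores triples that are not present
                self.triples.discard((subj, rel, obj))
            else:
                raise ValueError(f"unknown operation: {op!r}")


graph = BeliefGraph()

# Step 1: the observation says the player is in a kitchen with an apple on a table.
graph.update([
    ("add", "player", "at", "kitchen"),
    ("add", "apple", "on", "table"),
])

# Step 2: the player takes the apple, so the old fact is replaced by a new one.
graph.update([
    ("delete", "apple", "on", "table"),
    ("add", "apple", "in", "inventory"),
])

for triple in sorted(graph.triples):
    print(triple)
```

Keeping the belief as an explicit set of triples means a later observation can revise an earlier fact rather than only append to it, which is the property the abstract refers to as dynamic updating.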