A Bayesian Approach to Identifying Representational Errors

  • Ramya Ramakrishnan
  • Vaibhav Unhelkar
  • Julie Shah

Trained AI systems and expert decision makers can make errors that are often difficult to identify and understand. Determining the root cause of these errors can improve future decisions. This work presents the Generative Error Model (GEM), a generative model for inferring representational errors from observations of an actor’s behavior, where the actor may be a simulated agent, a robot, or a human. The model considers two sources of error: representational errors, or “blind spots,” which arise from limitations in the actor’s representation, and non-representational errors, such as those caused by noise in execution or by systematic errors in the actor’s policy. Disambiguating these two error types allows for targeted refinement of the actor’s policy: representational errors require perceptual augmentation, while other errors can be reduced through methods such as improved training or attention support. We present a Bayesian inference algorithm for GEM and evaluate its utility in recovering representational errors across multiple domains. Results show that our approach can recover the blind spots of both reinforcement learning agents and human users.
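
To illustrate the kind of inference involved, the following is a minimal sketch and not GEM itself. It assumes a much simpler setting than the one described above: for each state we only count how often the actor deviated from a reference action, and we compare two hypotheses with Bayes' rule, "blind spot" (errors are frequent) versus "execution noise" (errors are rare). The function name, the state labels, the prior, and both error rates are hypothetical values chosen purely for illustration.

```python
from math import comb


def blind_spot_posterior(n_obs, n_errors, prior=0.2,
                         p_err_blind=0.8, p_err_noise=0.1):
    """Posterior probability that a state is a blind spot.

    All default values are illustrative assumptions, not taken from the paper:
      prior        -- prior probability that a state is a blind spot
      p_err_blind  -- per-visit error rate if the state is a blind spot
      p_err_noise  -- per-visit error rate if errors are only execution noise
    """
    def binomial(k, n, p):
        # Probability of k errors in n independent visits with error rate p.
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    like_blind = binomial(n_errors, n_obs, p_err_blind)
    like_noise = binomial(n_errors, n_obs, p_err_noise)

    # Bayes' rule over the two competing explanations of the observed errors.
    evidence = prior * like_blind + (1 - prior) * like_noise
    return prior * like_blind / evidence


# Hypothetical observations: state -> (number of visits, number of errors).
observations = {"s0": (10, 1), "s1": (10, 8), "s2": (10, 0)}
posteriors = {s: blind_spot_posterior(n, k) for s, (n, k) in observations.items()}

for state, p in posteriors.items():
    print(f"{state}: P(blind spot | data) = {p:.6f}")
```

Under these assumed rates, a state with frequent errors (here the hypothetical "s1") receives a posterior close to 1, while states with occasional or no errors are attributed to noise. The separation of error sources that the abstract describes would call for a richer model than this two-hypothesis comparison.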