Bayesian Exploration with Heterogeneous Agents

It is common in recommendation systems that users both consume and produce information as they make strategic choices under uncertainty. While a social planner would balance “exploration” and “exploitation” using a multi-armed bandit algorithm, users’ incentives may tilt this balance in favor of exploitation. We consider Bayesian Exploration: a simple model in which the recommendation system (the “principal”) controls the information flow to the users (the “agents”) and strives to incentivize exploration via information asymmetry. A single round of this model is a version of a well-known “Bayesian Persuasion game” from [Kamenica and Gentzkow]. We allow heterogeneous users, relaxing a major assumption from prior work that users have the same preferences from one time step to another. The goal is now to learn the best personalized recommendations. One particular challenge is that it may be impossible to incentivize some of the user types to take some of the actions, no matter what the principal does or how much time she has. We consider several versions of the model, depending on whether and when the user types are reported to the principal, and design a near-optimal “recommendation policy” for each version. We also investigate how the model choice and the diversity of user types impact the set of actions that can possibly be “explored” by each type.