Short-Term Satisfaction and Long-Term Coverage: Understanding How Users Tolerate Algorithmic Exploration

Tobias Schnabel, Paul N. Bennett, Susan Dumais, Thorsten Joachims

Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM '18)

Published by ACM

Any learning algorithm for recommendation faces a fundamental trade-off between exploiting partial knowledge of a user’s interests to maximize satisfaction in the short term and discovering additional user interests to maximize satisfaction in the long term. To enable discovery, a machine learning algorithm typically elicits feedback on items it is uncertain about, a practice termed algorithmic exploration. This exploration comes at a cost to the user, since the items an algorithm chooses for exploration frequently turn out not to match the user’s interests. In this paper, we study how users tolerate such exploration and how presentation strategies can mitigate the exploration cost. To this end, we conduct a behavioral study with over 600 people, in which we vary how algorithmic exploration is mixed into the set of recommendations. We find that users respond non-linearly to the amount of exploration: mixing some exploration into the set of recommendations has little effect on short-term satisfaction and behavior. For long-term satisfaction, the overall goal is to learn about the presented items via exploration. We therefore also analyze the quantity and quality of implicit feedback signals, such as clicks and hovers, and how they vary with the amount of mix-in exploration. Our findings provide insights into how to design presentation strategies for algorithmic exploration in interactive recommender systems, mitigating the short-term costs of algorithmic exploration while aiming to elicit informative feedback data for learning.
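
To make the idea of mix-in exploration concrete, the following is a minimal sketch, not the procedure used in the study. It assumes a ranked list of exploitation candidates and a pool of exploration candidates, and builds a recommendation set containing a fixed number of exploration items. The function name `mix_in_exploration`, its parameters, the uniform sampling of exploration items, and the shuffling of positions are all illustrative assumptions; the abstract does not specify these details.

```python
import random


def mix_in_exploration(exploit_items, explore_items, slate_size, num_explore, seed=None):
    """Build a recommendation set with `num_explore` exploration items mixed in.

    Hypothetical helper: the sampling and placement choices below are
    assumptions for illustration, not the design used in the paper.
    """
    if not 0 <= num_explore <= slate_size:
        raise ValueError("num_explore must be between 0 and slate_size")
    num_exploit = slate_size - num_explore
    if num_exploit > len(exploit_items) or num_explore > len(explore_items):
        raise ValueError("not enough candidate items")

    rng = random.Random(seed)
    # Keep the top-ranked exploitation items (assumed to be sorted best-first).
    chosen_exploit = list(exploit_items[:num_exploit])
    # Assumption: exploration items are drawn uniformly at random without replacement.
    chosen_explore = rng.sample(list(explore_items), num_explore)

    slate = chosen_exploit + chosen_explore
    # Assumption: shuffle so exploration items are not tied to fixed positions.
    rng.shuffle(slate)
    return slate


# Example: a set of 4 recommendations, 1 of which is an exploration item.
slate = mix_in_exploration(["a", "b", "c", "d"], ["x", "y", "z"], 4, 1, seed=0)
```

Varying `num_explore` from 0 up to `slate_size` corresponds to varying the amount of exploration mixed into the set, which is the quantity the study reports users respond to non-linearly.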