Exploiting Correlation to Achieve Faster Learning Rates in Low-Rank Preference Bandits
We introduce the \emph{Correlated Preference Bandits} problem with random utility-based choice models (RUMs), where the goal is to identify the best item from a given pool of $n$ items through online subsetwise preference feedback. We investigate whether models with a simple correlation structure, e.g., low rank,…