Collaborative Pure Exploration in Kernel Bandit

  • Yihan Du,
  • ,
  • Yuko Kuroki,
  • Longbo Huang

Proceedings of the 11th International Conference on Learning Representations (ICLR)

In this paper, we propose a novel Collaborative Pure Exploration in Kernel Bandit model (CoPE-KB), where multiple agents collaborate to complete different but related tasks with limited communication. Our model generalizes the prior CoPE formulation, which covers the single-task and classic multi-armed bandit (MAB) setting, to allow multiple tasks and general reward structures. We propose a novel communication scheme with an efficient kernelized estimator, and design algorithms CoKernelFC and CoKernelFB for CoPE-KB with fixed-confidence and fixed-budget objectives, respectively. We provide sample and communication complexities to demonstrate the efficiency of our algorithms. Our theoretical results explicitly quantify how task similarities influence learning speedup, and depend only on the effective dimension of the feature space. Our novel techniques, such as an efficient kernelized estimator and a decomposition of task similarities and arm features, overcome the communication difficulty in high-dimensional feature space and reveal the impact of task similarities on sample complexity; they can be of independent interest.