Disability-first Dataset Creation: Lessons from Constructing a Dataset for Teachable Object Recognition with Blind and Low Vision Data Collectors

Artificial Intelligence (AI) for accessibility is a rapidly growing area that requires datasets inclusive of the disabled users whom assistive technology aims to serve. We offer insights from a multi-disciplinary project that constructed a dataset for teachable object recognition with people who are blind or low vision. Teachable object recognition enables users to teach a model to recognize objects that are of interest to them, e.g., their white cane or own sunglasses, by providing example images or videos of those objects. In this paper, we make the following contributions: 1) a disability-first procedure that supports blind and low vision data collectors in producing good-quality data, using video rather than images; 2) a validation and evolution of this procedure through a series of data collection phases; and 3) a set of questions to orient researchers involved in creating datasets toward reflecting on the needs of their participant community.