Many mobile machine learning applications require collecting and labeling data, and a traditional GUI on a mobile device may not be an appropriate or viable method for this task. This paper presents an alternative approach to mobile labeling of sensor data called VoiceLabel. VoiceLabel consists of two components: (1) a speech-based data collection tool for mobile devices, and (2) a desktop tool for offline segmentation of recorded data and recognition of spoken labels. The desktop tool automatically analyzes the audio stream to find and recognize spoken labels, and then presents a multimodal interface for reviewing and correcting data labels using a combination of the audio stream, the system’s analysis of that audio, and the corresponding mobile sensor data. A study with ten participants showed that VoiceLabel is a viable method for labeling mobile sensor data. VoiceLabel also illustrates several key features that inform the design of other data labeling tools.