A Smartphone as Your Third Ear


March 27, 2014


We humans are capable of remembering, recognizing, and acting upon hundreds of thousands of different types of acoustic events on a day-to-day basis. Decades of research on acoustic sensing have led to the creation of systems that now understand speech (e.g., a personal assistant like the iPhone's Siri, or Google's voice-activated search), recognize the speaker, and identify a song (e.g., Shazam). However, apart from speech, music, and some application-specific sounds, the problem of recognizing the wide variety of general-purpose sounds that a mobile device encounters all the time has remained unsolved. The goal of this research is to build a platform that automatically creates classifiers that recognize general-purpose acoustic events on mobile devices. Because these classifiers are meant to run on mobile devices, the technical goals include energy efficiency, meeting timing constraints, and leveraging user context, such as the location and position of the mobile device, to improve classification accuracy.

With this goal in mind, we have built a general-purpose, energy-efficient, and context-aware acoustic event detection platform for mobile devices called Auditeur. Auditeur enables mobile application developers to have their apps register for and be notified of a wide variety of acoustic events. Auditeur is backed by a cloud service that stores crowd-contributed sound clips and generates an energy-efficient and context-aware classification plan for the mobile device. When an acoustic event type has been registered, the mobile device instantiates the necessary acoustic processing modules and wires them together to dynamically form an acoustic processing pipeline in accordance with the classification plan. The mobile device then captures, processes, and classifies acoustic events locally and efficiently. Our analysis of user-contributed empirical data shows that Auditeur's energy-aware acoustic feature selection algorithm is capable of increasing device lifetime by 33.4% while sacrificing less than 2% of the maximum achievable accuracy. We implement seven apps with Auditeur and deploy them in real-world scenarios to demonstrate that Auditeur is versatile, 11.04% to 441.42% less power hungry, and 10.71% to 13.86% more accurate in detecting acoustic events, compared to state-of-the-art techniques. We perform a user study involving 15 participants to demonstrate that even a novice programmer can implement the core logic of an interesting app with Auditeur in less than 30 minutes, using only 15 to 20 lines of Java code.
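To make the register-and-notify idea concrete, here is a minimal Java sketch of how an app might register for an event type and have a small processing pipeline wired together from a plan. This abstract does not show Auditeur's actual API, so every name below (`AcousticEventSketch`, `EventService`, `Stage`, `EventListener`, the `"loud-sound"` event type, and the 0.5 notification threshold) is a hypothetical stand-in, and the two toy stages take the place of real acoustic features and classifiers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only: none of these names come from the Auditeur API.
public class AcousticEventSketch {

    /** Callback an app supplies so it can be notified of a detected event. */
    interface EventListener {
        void onEvent(String eventType, double confidence);
    }

    /** One pipeline stage: maps an array of values to a new array of values. */
    interface Stage extends Function<double[], double[]> {}

    /** Toy stand-in for the on-device service that runs the pipelines. */
    static class EventService {
        private final Map<String, List<Stage>> pipelines = new HashMap<>();
        private final Map<String, List<EventListener>> listeners = new HashMap<>();

        /** Registers a listener and wires the stages of the plan into a pipeline. */
        void register(String eventType, List<Stage> plan, EventListener listener) {
            pipelines.put(eventType, new ArrayList<>(plan));
            listeners.computeIfAbsent(eventType, k -> new ArrayList<>()).add(listener);
        }

        /** Runs every registered pipeline on one audio frame and notifies listeners. */
        void process(double[] frame) {
            for (Map.Entry<String, List<Stage>> entry : pipelines.entrySet()) {
                double[] data = frame;
                for (Stage stage : entry.getValue()) {
                    data = stage.apply(data);
                }
                double confidence = data[0]; // last stage emits a single score
                if (confidence >= 0.5) {     // assumed notification threshold
                    for (EventListener l : listeners.get(entry.getKey())) {
                        l.onEvent(entry.getKey(), confidence);
                    }
                }
            }
        }
    }

    /** Toy feature extractor: mean energy of the frame. */
    static double[] meanEnergy(double[] frame) {
        double sum = 0.0;
        for (double sample : frame) {
            sum += sample * sample;
        }
        return new double[] { sum / frame.length };
    }

    /** Toy classifier: maps energy to a score in [0, 1]. */
    static double[] energyToScore(double[] features) {
        return new double[] { Math.min(1.0, features[0] / 0.25) };
    }

    public static void main(String[] args) {
        EventService service = new EventService();
        Stage extract = AcousticEventSketch::meanEnergy;
        Stage classify = AcousticEventSketch::energyToScore;
        List<Stage> plan = new ArrayList<>();
        plan.add(extract);
        plan.add(classify);

        service.register("loud-sound", plan,
                (type, confidence) -> System.out.println(type + " detected, confidence " + confidence));

        service.process(new double[] { 0.01, -0.01, 0.01, -0.01 }); // quiet frame
        service.process(new double[] { 0.5, -0.5, 0.5, -0.5 });     // loud frame
    }
}
```

In this sketch the plan is simply an ordered list of stages supplied by the caller; in the platform described above, the plan would instead come from the cloud service, and the stages would be chosen with energy cost and user context in mind.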


Shahriar Nirjon

Shahriar Nirjon is a doctoral candidate in the Department of Computer Science at the University of Virginia. Shahriar received an MCS (2011) from UVA, and an M.Sc. (2008) and a B.Sc. (2006) from the Bangladesh University of Engineering and Technology (BUET). Shahriar was an intern at Microsoft Research (Redmond, WA, 2013), Microsoft (Redmond, WA, 2011), and Deutsche Telekom Lab (Los Altos, CA, 2010). Before grad school, he was a Lecturer in the Department of Computer Science and Engineering at BUET (2007-08). Shahriar is an experimental computer scientist who likes to build systems with sensors, mobile devices, and back-end cloud services. His primary research interest is in sensing and mobile computing, which overlaps with cyber-physical systems, applied machine learning, and human-centered computing (e.g., health and wellness monitoring applications). During his PhD, Shahriar published seven flagship papers in top conferences in sensing, mobile computing, real-time computing, and pervasive computing, patented two of his works, won a best paper award (RTAS 2012), and won an outstanding graduate student award for research from the CS department at UVA. Shahriar's future research interests include smart devices and the Internet of Things, and big sensor data mining.