Deep Neural Networks for Speech and Image Processing

Date

May 24, 2012

Speaker

Alex Acero

Affiliation

Microsoft Research

Overview

Neural networks are experiencing a renaissance, thanks to a new mathematical formulation known as restricted Boltzmann machines and to the availability of powerful GPUs and increased processing power. Unlike past neural networks, these new ones can have many layers and thus are called “deep neural networks”; because they are a machine-learning technique, the technology is also known as “deep learning.”

In this talk, I describe this new formulation and its signal-processing applications in fields such as speech recognition and image recognition. In these applications, deep neural networks have significantly reduced error rates. This success has sparked great interest from computer scientists, who are also eager to learn from neuroscientists how neurons in the brain work.

Speakers

Alex Acero

Alex Acero is a research area manager at Microsoft Research, directing an organization of 60 engineers working on audio, multimedia, communication, speech, and natural language. He is also an affiliate professor of Electrical Engineering at the University of Washington. He received an M.S. degree from the Polytechnic University of Madrid, Madrid, Spain, in 1985; an M.S. degree from Rice University, Houston, TX, in 1987; and a Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, in 1990, all in Electrical Engineering. Dr. Acero worked in Apple Computer’s Advanced Technology Group during 1990–1991. In 1992, he joined Telefonica I+D, Madrid, Spain, as manager of the speech technology group. Since 1994, he has been with Microsoft Research.

Dr. Acero is a Fellow of the IEEE. He has served the IEEE Signal Processing Society as Vice President Technical Directions (2007–2009), Director Industrial Relations (2009–2011), 2006 Distinguished Lecturer, member of the Board of Governors (2004–2005), associate editor for IEEE Signal Processing Letters (2003–2005) and IEEE Transactions on Audio, Speech, and Language Processing (2005–2007), and member of the editorial board of IEEE Journal of Selected Topics in Signal Processing (2006–2008) and IEEE Signal Processing Magazine (2008–2010). He also served as a member (1996–2000) and Chair (2000–2002) of the Speech Technical Committee of the IEEE Signal Processing Society. He was Publications Chair of ICASSP98, Sponsorship Chair of the 1999 IEEE Workshop on Automatic Speech Recognition and Understanding, and General Co-Chair of the 2001 IEEE Workshop on Automatic Speech Recognition and Understanding. Dr. Acero has also served as a member of the editorial board of Computer Speech and Language and as a member of the Carnegie Mellon University Dean’s Leadership Council for the College of Engineering.

Dr. Acero is the author of the books Acoustical and Environmental Robustness in Automatic Speech Recognition (Kluwer, 1993) and Spoken Language Processing (Prentice Hall, 2001), and he has written invited chapters in 4 edited books as well as 200 technical papers. He holds 78 U.S. patents. Since 2004, Dr. Acero, along with co-authors Drs. Huang and Hon, has been using proceeds from their textbook Spoken Language Processing to fund the “IEEE Spoken Language Processing Student Travel Grant” for the best ICASSP student papers in the speech area.
