A Harmonic-Model-Based Front End for Robust Speech Recognition

Mike Seltzer; Jasha Droppo; Alex Acero

Proc. Eurospeech Conference | September 2003

Published by International Speech Communication Association

Speech recognition accuracy degrades significantly when the speech has been corrupted by noise, especially when the system has been trained on clean speech. Many compensation algorithms have been developed that require reliable online noise estimates or a priori knowledge of the noise. In situations where such estimates or knowledge are difficult to obtain, these methods fail. We present a new robustness algorithm that avoids these problems by making no assumptions about the corrupting noise. Instead, we exploit properties inherent to the speech signal itself to denoise the recognition features. In this method, speech is decomposed into harmonic and noise-like components, which are then processed independently and recombined. By processing noise-corrupted speech in this manner, we achieve significant improvements in recognition accuracy on the Aurora 2 task.
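
The decompose, process, and recombine idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the decomposition is performed or in which domain the components are recombined. The sketch assumes a single voiced frame with a known pitch, fits a sum of sinusoids at multiples of that pitch by least squares, treats the remainder as the noise-like part, and recombines the two with fixed weights. The function names, the weights, and the test signal are all hypothetical.

```python
import numpy as np


def harmonic_decompose(frame, sample_rate, f0, num_harmonics):
    """Split one frame into a harmonic part and a noise-like residual.

    Assumption: the pitch f0 (in Hz) is already known for this frame.
    """
    t = np.arange(len(frame)) / sample_rate
    k = np.arange(1, num_harmonics + 1)
    # One cosine and one sine column per harmonic of f0.
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    basis = np.hstack([np.cos(phase), np.sin(phase)])
    # Least-squares fit of the harmonic model to the frame.
    coeffs, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    harmonic = basis @ coeffs
    residual = frame - harmonic
    return harmonic, residual


def recombine(harmonic, residual, harmonic_weight=1.0, residual_weight=0.1):
    """Weighted sum of the two parts; the weights here are illustrative only."""
    return harmonic_weight * harmonic + residual_weight * residual


# Synthetic example: two harmonics of a 200 Hz pitch plus white noise.
rng = np.random.default_rng(0)
sample_rate = 8000
t = np.arange(200) / sample_rate  # one 25 ms frame
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t + 0.3)
noisy = clean + 0.3 * rng.standard_normal(t.size)

harmonic, residual = harmonic_decompose(noisy, sample_rate, f0=200.0, num_harmonics=10)
denoised = recombine(harmonic, residual)
```

Because the synthetic clean signal lies entirely within the harmonic model, down-weighting the residual removes most of the added noise in this toy case. Real speech would also need pitch estimation and handling of unvoiced frames, which this sketch does not attempt.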

© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.