Abstract

In this paper, we present a novel two-stage framework of designing a noise-robust front-end for automatic speech recognition. In the first stage, a parametric model of acoustic distortion is used to estimate the clean speech and noise spectra in a principled way so that no heuristic parameters need to set manually. To reduce possible flaws caused by the simplifying assumptions in the parametric model, a second-stage Wiener filtering is applied to further reduce the noise while preserving speech spectra unharmed. This front-end is evaluated on the Aurora2 task. For the multi-condition training scenario, a relative error reduction of 28.4% is achieved.