Speech Recognition with Segmental Conditional Random Fields: A Summary of the JHU CLSP 2010 Summer Workshop
- Geoffrey Zweig
- Patrick Nguyen
- Dirk Van Compernolle
- Kris Demuynck
- Les Atlas
- Pascal Clark
- Greg Sell
- Meihong Wang
- Fei Sha
- Hynek Hermansky
- Damianos Karakos
- Aren Jansen
- Samuel Thomas
- Samuel Bowman
- Justine Kao
- G.S.V.S. Sivaram
ICASSP 2011
Published by IEEE
This paper summarizes the 2010 CLSP Summer Workshop on speech recognition at Johns Hopkins University. The key theme of the workshop was to improve on state-of-the-art speech recognition systems by using Segmental Conditional Random Fields (SCRFs) to integrate multiple types of information. This approach uses a state-of-the-art baseline as a springboard from which to add a suite of novel features, including ones derived from acoustic templates, deep neural net phoneme detections, duration models, modulation features, and whole-word point-process models. The SCRF framework weights these different information sources appropriately, producing significant gains on both the Broadcast News and Wall Street Journal tasks.
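To make the idea of weighting several information sources concrete, the following is a minimal sketch of log-linear scoring over segments, not the workshop's implementation. It assumes each hypothesized word segment carries a dictionary of feature values, and it normalizes over a small explicit list of candidate hypotheses; a full SCRF instead sums over all segmentations and conditions on adjacent labels. The feature names, weight values, and function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical feature weights (illustrative values, not taken from the paper).
WEIGHTS = {"baseline": 1.0, "template": 0.5, "duration": 0.3}


def segment_score(features, weights=WEIGHTS):
    """Weighted sum of one segment's feature values (a log-linear potential)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def hypothesis_score(segments, weights=WEIGHTS):
    """Sum the segment potentials over one full segmentation."""
    return sum(segment_score(features, weights) for features in segments)


def posteriors(hypotheses, weights=WEIGHTS):
    """Normalize exponentiated scores over a finite set of candidate hypotheses."""
    scores = [hypothesis_score(h, weights) for h in hypotheses]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two candidate hypotheses, each a list of per-segment feature dictionaries.
hyp_a = [{"baseline": 1.0, "template": 2.0}, {"baseline": 0.5, "duration": 1.0}]
hyp_b = [{"baseline": 1.0}, {"baseline": 0.5}]
probs = posteriors([hyp_a, hyp_b])
```

In this toy setup, the hypothesis supported by the extra template and duration features receives the higher posterior. In a trained model the weights would be learned from data rather than set by hand.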