Abstract

This paper summarizes the 2010 CLSP Summer Workshop on
speech recognition at Johns Hopkins University. The key theme of
the workshop was to improve on state-of-the-art speech recognition
systems by using Segmental Conditional Random Fields (SCRFs) to
integrate multiple types of information. This approach uses a
state-of-the-art baseline as a springboard from which to add a suite of
novel features including ones derived from acoustic templates, deep
neural net phoneme detections, duration models, modulation features,
and whole-word point-process models. The SCRF framework
weights these different information sources appropriately
to produce significant gains on both the Broadcast News and Wall
Street Journal tasks.
