Abstract

In modern automatic speech recognition systems, it is standard practice
to cluster several logical hidden Markov model states into one
physical, clustered state. Typically, the clustering is done such that
logical states from different phones, or from different state positions
within a phone, cannot share the same clustered state. In this paper, we present a collection of
experiments that lift this restriction. The results show that, for Aurora
2 and Aurora 3, much smaller models perform at least as well
as the standard baseline. On a TIMIT phone recognition task, we
analyze the tying structures introduced, and discuss the implications
for building better acoustic models.