Abstract

This paper presents the Position Specific
Posterior Lattice (PSPL), a novel lossy
representation of automatic speech recognition
lattices that naturally lends itself
to efficient indexing and subsequent relevance
ranking of spoken documents.
In experiments performed on a collection
of lecture recordings — MIT iCampus
data — the spoken document ranking
accuracy was improved by 20% relative
over the commonly used baseline of
indexing the 1-best output from an automatic
speech recognizer.
The inverted index built from PSPL lattices
is compact — about 20% of the size
of 3-gram ASR lattices and 3% of the size
of the uncompressed speech — and it allows
for extremely fast retrieval. Furthermore,
pruning PSPL lattices causes little degradation
in performance while yielding even
smaller indexes: 5% of the size of 3-gram
ASR lattices.