Abstract

This paper proposes three novel and effective procedures for jointly analyzing repeated utterances. First, we propose repetition-driven system switching, where repetition triggers the use of an independent backup system for decoding. Second, we propose a cache language model for use with the second utterance. Finally, we propose a method with which the acoustics from multiple utterances – not necessarily exact repetitions of each other – can be combined into a composite that increases accuracy. The combination of all methods produces a relative increase in sentence accuracy of 65.7% for repeated voice-search queries.
