Abstract

When the top ASR hypothesis is incorrect, often the correct hypothesis is listed as an alternative in the ASR N-Best list. Whereas traditional spoken dialog systems have struggled to exploit this information, this paper argues that a dialog model that tracks a distribution over multiple dialog states can improve dialog accuracy by making use of the entire N-Best list. The key element of the approach is a generative model of the N-Best list given the user’s true hidden action. An evaluation on real dialog data verifies that dialog accuracy rates are improved by making use of the entire N-Best list.