The MSR-NLP System at Dialog System Technology Challenges 6
Dialog System Technology Challenges (DSTC)
We present our work on the Dialog System Technology Challenges 6 (DSTC6). We participated in Track 2, which evaluates the generation of conversational responses in a fully data-driven manner. Our system follows the approach of Li et al. (2016), which uses sequence-to-sequence (seq2seq) models that exploit a Maximum Mutual Information (MMI) criterion shown to increase response adequacy and diversity. We find that, when trained on the DSTC6 corpus, MMI models improve over the task baseline on BLEU, CIDEr, and SkipThoughts, but not on METEOR or ROUGE-L. We also show gains in unigram and bigram lexical diversity. However, inspection of the datasets used in the DSTC6 Track 2 task suggests that the task may favor blander outputs. In particular, the high incidence of references to taking the conversation offline suggests that the datasets may be skewed to favor a single response type.
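As a rough illustration of what unigram and bigram lexical diversity can measure, the sketch below computes a distinct-n ratio: the number of unique n-grams divided by the total number of n-grams across a set of responses. This is one common formulation and not necessarily the exact normalization used in our evaluation. The function name `distinct_n`, the whitespace tokenization, and the toy responses are assumptions made for the example.

```python
from collections import Counter


def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over all responses.

    Illustrative definition only; other normalizations (for example,
    dividing by the total token count) are also in use.
    """
    ngrams = Counter()
    for response in responses:
        tokens = response.split()  # assumption: simple whitespace tokenization
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


# Hypothetical toy outputs: one system repeats a single bland reply,
# the other produces varied replies.
bland = ["please dm us", "please dm us", "please dm us"]
varied = ["please dm us", "sorry to hear that", "which model do you own"]

for name, outputs in (("bland", bland), ("varied", varied)):
    print(name, distinct_n(outputs, 1), distinct_n(outputs, 2))
```

Under this definition, a system that repeats one response type scores low on both ratios, while a system with varied outputs scores closer to 1.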