Improving End-of-Turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task

  • Z. Aldeneh,
  • D. Dimitriadis,
  • E. Mower-Provost

2018 IEEE International Conference on Acoustics, Speech and Signal Processing

Published by IEEE Signal Processing Society

This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues. Previous work has shown that speaker intentions (e.g., asking a question or uttering a backchannel) can influence turn-taking behavior and are good predictors of turn transitions in spoken dialogues. However, speaker intentions are not readily available to automated systems at run-time, which makes it difficult to use this information to anticipate a turn transition. To this end, we propose a multi-task neural approach that predicts turn transitions and speaker intentions simultaneously. Our results show that adding the auxiliary task of speaker intention prediction improves the performance of turn-transition prediction in spoken dialogues, without relying on additional input features at run-time.
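
To make the multi-task idea concrete, the following is a minimal sketch of how a joint training objective over the two tasks might be formed. It is not the authors' implementation: the abstract does not specify the network, the loss, or any task weighting, so the function names, the weighted-sum form, and the `aux_weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(turn_logits, turn_label, intent_logits, intent_label,
                   aux_weight=0.5):
    """Hypothetical joint objective: primary loss plus a weighted auxiliary loss.

    turn_logits / turn_label: outputs and target for turn-transition prediction.
    intent_logits / intent_label: outputs and target for speaker-intention
    prediction. aux_weight is an assumed hyperparameter, not taken from the paper.
    """
    primary = cross_entropy(turn_logits, turn_label)
    auxiliary = cross_entropy(intent_logits, intent_label)
    return primary + aux_weight * auxiliary


# Example: two turn classes (hold / switch) and three assumed intention classes.
loss = multitask_loss([0.2, 1.1], 1, [0.5, 0.1, -0.3], 0)
```

In a setup like this, both output heads would share the same acoustic input and hidden layers during training. At run-time only the turn-transition head needs to be evaluated, which is in keeping with the abstract's point that no additional input features are required.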