Context Memory Networks for Multi-Objective Semantic Parsing in Conversational Understanding
- Asli Celikyilmaz,
- Dilek Hakkani-Tur,
- Gokhan Tur,
- Yun-Nung Vivian Chen
The end-to-end multi-domain and multi-task learning of the
full semantic frame of user utterances (i.e., domain and intent
classes and slots in utterances) has recently emerged as
a new paradigm in spoken language understanding. An advantage
of the joint optimization of these semantic frames is
that the data and feature representations learnt by the model
are shared across different tasks (e.g., domain/intent classification
and slot filling tasks use the same feature sets). The model
must learn to attend to both global and local aspects of the
utterance while mapping the entire utterance to an intent class
and tagging each word with a slot tag. We introduce the
Context Memory Network (CMN),
a neural network architecture that focuses on learning better
representations, in the form of attention vectors over past
memory, for the end task of jointly learning the intent class
and slot tags. Each utterance triggers a dynamic memory network,
which learns an attention-based representation for each word
by conditioning on a list of related phrases stored in
memory. These representations are then provided to a new
multi-objective long short-term memory (LSTM) network to
infer the intent class and slot tags. Our empirical investigations
show that CMN achieves impressive gains over end-to-end
LSTM baselines on the ATIS dataset as well as two other
human-to-machine conversational datasets.
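The core memory-attention step described in the abstract, where a word's representation is a weighted sum over embeddings of related phrases held in memory, can be sketched as follows. This is a minimal numpy illustration under assumed names and dimensions, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_attention(query, memory):
    """Attend over memory slots (e.g., embeddings of related phrases)
    and return a context vector as the attention-weighted sum of rows.

    query:  (d,) embedding of the current word
    memory: (num_slots, d) embeddings of related phrases
    """
    scores = memory @ query            # (num_slots,) dot-product scores
    weights = softmax(scores)          # attention distribution over slots
    context = weights @ memory         # (d,) attention-based representation
    return context, weights

# Toy example: 3 memory slots, embedding dimension 4 (illustrative values).
rng = np.random.default_rng(0)
memory = rng.standard_normal((3, 4))
query = rng.standard_normal(4)
context, weights = memory_attention(query, memory)
assert context.shape == (4,)
assert np.isclose(weights.sum(), 1.0)
```

In the full model, such context vectors would feed the downstream LSTM that jointly predicts the utterance-level intent class and per-word slot tags.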