Deep Learning and Continuous Representations for Language Processing (Tutorial for SLT-2014)

Xiaodong He, Scott Wen-tau Yih

Deep learning techniques have demonstrated tremendous success in the speech and language processing community in recent years, establishing new state-of-the-art performance in speech recognition and language modeling, and showing great potential for many other natural language processing tasks. The focus of this tutorial is to provide an extensive overview of recent deep learning approaches to problems in language and text processing, with particular emphasis on important real-world applications, including spoken language understanding, semantic representation modeling, information retrieval, semantic parsing, and question answering.

In this tutorial, we will first survey the latest deep learning technology, presenting both theoretical and practical perspectives that are most relevant to our topic. We plan to cover common deep neural network methods as well as more advanced recurrent, recursive, stacking, and convolutional architectures. In addition, we will introduce recently proposed continuous-space representations for both semantic word embedding and knowledge base embedding, which are modeled by either matrix/tensor decomposition or neural networks.
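As a small, self-contained illustration of the continuous-space representations mentioned above (not drawn from the tutorial materials; the vocabulary, dimensions, and random initialization are purely hypothetical), the sketch below shows the basic operation such models rely on: mapping words to dense vectors and comparing them by cosine similarity. In a real system the embedding matrix would be learned, e.g., by matrix/tensor decomposition of co-occurrence statistics or by a neural network.

import numpy as np

# Illustrative vocabulary and embedding dimension (hypothetical values).
vocab = ["cat", "dog", "car"]
word_to_id = {w: i for i, w in enumerate(vocab)}
embedding_dim = 8

# In practice this matrix is learned from data; here it is randomly
# initialized purely for illustration.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), embedding_dim))

def cosine_similarity(u, v):
    """Cosine similarity between two word vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Compare two words in the continuous semantic space.
sim = cosine_similarity(embeddings[word_to_id["cat"]],
                        embeddings[word_to_id["dog"]])
print(f"similarity(cat, dog) = {sim:.3f}")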

Next, we will review general problems and tasks in text/language processing, and underline the distinct properties that differentiate language processing from other tasks such as speech and image object recognition. More importantly, we will highlight the general issues of language processing and elaborate on how new deep learning technologies have been proposed to fundamentally address these issues. We then place particular emphasis on several important applications: 1) spoken language understanding, 2) semantic information retrieval, and 3) semantic parsing and question answering. For each task, we discuss which deep learning architectures are suitable given the nature of the task, and how learning can be performed efficiently and effectively using end-to-end optimization strategies.
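As a rough illustration of this kind of end-to-end semantic matching (a minimal sketch with hypothetical dimensions, randomly initialized networks, and random input vectors, not the specific systems presented in the tutorial), the example below projects a query and candidate documents into a shared continuous space and scores them with a softmax over cosine similarities, the quantity an end-to-end ranking loss would optimize.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: bag-of-words input dimension and semantic space dimension.
input_dim, hidden_dim, semantic_dim = 100, 32, 16

def init_mlp():
    """A small feed-forward network that projects text into the semantic space."""
    return {
        "W1": rng.normal(scale=0.1, size=(input_dim, hidden_dim)),
        "W2": rng.normal(scale=0.1, size=(hidden_dim, semantic_dim)),
    }

def project(params, x):
    """Map a bag-of-words vector to a point in the shared semantic space."""
    h = np.tanh(x @ params["W1"])
    return np.tanh(h @ params["W2"])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Separate networks for queries and documents (an illustrative design choice).
query_net, doc_net = init_mlp(), init_mlp()

# One query, one relevant document, and a few sampled irrelevant documents.
query = rng.random(input_dim)
docs = rng.random((4, input_dim))  # docs[0] plays the role of the relevant one

q_vec = project(query_net, query)
d_vecs = np.array([project(doc_net, d) for d in docs])

# Softmax over cosine similarities: an end-to-end training loss would maximize
# the probability assigned to the relevant document, with gradients flowing
# through both networks.
scores = np.array([cosine(q_vec, d) for d in d_vecs])
probs = np.exp(scores) / np.exp(scores).sum()
loss = -np.log(probs[0])
print(f"P(relevant doc | query) = {probs[0]:.3f}, loss = {loss:.3f}")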

Besides providing a systematic tutorial on the general theory, we will also present hands-on experience in building state-of-the-art SLU/IR/QA systems. In the tutorial, we will share our practice with concrete examples drawn from our first-hand experience with major research benchmarks and several industrial-scale applications that we have worked on extensively in recent years.