Models for statistical spoken language understanding (SLU) systems are conventionally trained using supervised discriminative training methods. In many cases, however, the labeled data necessary for these supervised techniques is not readily available, necessitating a laborious data collection and annotation effort. This often results in data sets that are not expansive enough to adequately cover all patterns of natural language phrases that occur in the target applications. Word embedding features alleviate data and feature sparsity issues by learning mathematical representations of words and word associations in a continuous space. In this work, we present techniques to obtain task- and domain-specific word embeddings and show that they are more useful than those obtained from generic unsupervised data. We also show how we transfer these embeddings from one language to another, enabling training of a multilingual spoken language understanding system.