We present our recent and ongoing work on applying deep learning techniques to spoken language understanding (SLU) problems. The previously developed deep convex network (DCN) is extended to its kernel version (K-DCN) where the number of hidden units in each DCN layer approaches infinity using the kernel trick. We report experimental results demonstrating dramatic error reduction achieved by the K-DCN over both the Boosting-based baseline and the DCN on a domain classification task of SLU, especially when a highly correlated set of features extracted from search query click logs are used. Not only can DCN and K-DCN be used as a domain or intent classifier for SLU, they can also be used as local, discriminative feature extractors for the slot filling task of SLU. The interface of K-DCN to slot filling systems via the softmax function is presented. Finally, we outline an end-to-end learning strategy for training the softmax parameters (and potentially all DCN and K-DCN parameters) where the learning objective can take any performance measure (e.g. the F-measure) for the full SLU system.