Together we stand: Siamese Networks for Similar Question Retrieval
- Arpita Das
- Harish Yenala
- Manoj Kumar Chinnakotla
- Manish Shrivastava
Annual Meeting of the Association for Computational Linguistics
Community Question Answering (cQA) services such as Yahoo! Answers, Baidu Zhidao, Quora and StackOverflow provide a platform for interaction with experts and help users obtain precise and accurate answers to their questions. The time lag between a user posting a question and receiving its answer can be reduced by retrieving similar historic questions from the cQA archives. The main challenge in this task is the “lexico-syntactic” gap between the current and the previous questions. In this paper, we propose a novel approach called “Siamese Convolutional Neural Network for cQA (SCQA)” to find the semantic similarity between the current and the archived questions. SCQA consists of twin convolutional neural networks with shared parameters and a contrastive loss function joining them. SCQA learns the similarity metric for question-question pairs by leveraging the question-answer pairs available in cQA forum archives. The model projects semantically similar question pairs nearer to each other and dissimilar question pairs farther away from each other in the semantic space. Experiments on a large-scale real-life “Yahoo! Answers” dataset reveal that SCQA outperforms current state-of-the-art approaches based on translation models, topic models and deep neural networks.
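
The twin-network idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact encoder or loss, so the toy convolution-plus-max-pooling encoder, the cosine-based form of the contrastive loss, the margin value of 0.5, and all function names below are assumptions made purely for illustration. The point it shows is that one set of filter weights encodes both inputs, and the loss rewards high similarity for similar pairs and penalises similarity above a margin for dissimilar pairs.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def encode(tokens, embeddings, filters):
    """Toy encoder (assumed design): 1-D convolution over token vectors,
    tanh activation, then max pooling per filter.

    Each filter is a list of `width` weight vectors; `tokens` must contain
    at least `width` tokens.
    """
    vectors = [embeddings[t] for t in tokens]
    features = []
    for filt in filters:
        width = len(filt)
        activations = [
            math.tanh(sum(dot(w, x) for w, x in zip(filt, vectors[i:i + width])))
            for i in range(len(vectors) - width + 1)
        ]
        features.append(max(activations))
    return features


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def contrastive_loss(sim, is_similar, margin=0.5):
    """Assumed cosine-based contrastive loss: similar pairs are pushed
    towards similarity 1, dissimilar pairs below `margin`."""
    if is_similar:
        return 1.0 - sim
    return max(0.0, sim - margin)


def pair_loss(tokens_a, tokens_b, is_similar, embeddings, filters, margin=0.5):
    """Siamese forward pass: the SAME `filters` encode both inputs."""
    za = encode(tokens_a, embeddings, filters)
    zb = encode(tokens_b, embeddings, filters)
    return contrastive_loss(cosine(za, zb), is_similar, margin)


# Hypothetical toy data: 2-dimensional embeddings and two width-2 filters.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, -0.5], [1.0, 1.0]],
]
loss_same = pair_loss(["a", "b"], ["a", "b"], True, embeddings, filters)
```

Because both inputs pass through the same weights, identical inputs always map to the same point in the semantic space, so a similar pair made of two identical sequences incurs essentially zero loss. Training would adjust the shared filters by gradient descent on this loss, which is omitted here.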