Identifying Web Search Query Reformulation using Concept based Matching

Published by Empirical Methods in Natural Language Processing (EMNLP)

Web search users frequently modify their queries in hope of receiving better results. This process is referred to as “Query Reformulation”. Previous research has mainly focused on proposing query reformulations in the form of suggested queries for users. Some research has studied the problem of predicting whether the current query is a reformulation of the previous query or not. However, this work has been limited to bag-of-words models where the main signals being used are word overlap, character level edit distance and word level edit distance. In this work, we show that relying solely on surface level text similarity results in many false positives where queries with different intents yet similar topics are mistakenly predicted as query reformulations. We propose a new representation for Web search queries based on identifying the concepts in queries and show that we can significantly improve query reformulation performance using features of query concepts.