Toward Topic Search on the Web

  • Hongsong Li
  • Haixun Wang

MSR-TR-2011-28

Traditional web search engines treat queries as sequences of keywords and return web pages that contain those keywords. This mechanism is effective when the user knows exactly which words web pages use to describe the content they are looking for. However, it is unsatisfactory, or even hopeless, when the user asks for a concept or topic that has a broader and sometimes ambiguous meaning. This is because keyword-based search engines index web pages by keywords, not by concepts or topics; in fact, they do not understand the content of the web pages. In this paper, we present a framework that improves the web search experience through the use of a probabilistic knowledge base. The framework classifies web queries into different patterns according to the concepts and entities, in addition to keywords, contained in these queries. It then produces answers by interpreting the queries with the help of the knowledge base. Our preliminary results show that the new framework is capable of answering various types of topic-like queries with much higher user satisfaction, and is therefore a valuable addition to traditional web search.
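
As a rough illustration of the idea in the abstract, the sketch below labels each term of a query as a concept, an entity, or a plain keyword, derives a query pattern from those labels, and then rewrites concepts into their most likely entities. It is not the authors' implementation: the toy knowledge base, its probabilities, the greedy longest-match labeling, and the names `TOY_KB`, `label_terms`, `query_pattern`, and `interpret` are all assumptions made for this example.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: concept -> {entity: P(entity | concept)}.
# All concepts, entities, and probabilities are invented for illustration.
TOY_KB: Dict[str, Dict[str, float]] = {
    "database conference": {"vldb": 0.45, "sigmod": 0.35, "icde": 0.20},
    "asian city": {"beijing": 0.50, "tokyo": 0.30, "seoul": 0.20},
}

# Reverse index: entity -> the concept it belongs to.
ENTITY_TO_CONCEPT: Dict[str, str] = {
    entity: concept
    for concept, entities in TOY_KB.items()
    for entity in entities
}

# Longest phrase (in words) that can match a concept or an entity.
MAX_PHRASE_LEN = max(
    len(name.split()) for name in list(TOY_KB) + list(ENTITY_TO_CONCEPT)
)


def label_terms(query: str) -> List[Tuple[str, str]]:
    """Label query terms as 'concept', 'entity', or 'keyword' (greedy longest match)."""
    words = query.lower().split()
    labels: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in TOY_KB:
                labels.append((phrase, "concept"))
                break
            if phrase in ENTITY_TO_CONCEPT:
                labels.append((phrase, "entity"))
                break
        else:
            # Nothing in the knowledge base matched: treat one word as a keyword.
            n = 1
            labels.append((words[i], "keyword"))
        i += n
    return labels


def query_pattern(labels: List[Tuple[str, str]]) -> str:
    """Summarize a labeled query as a pattern such as 'concept + keyword'."""
    return " + ".join(kind for _, kind in labels)


def interpret(query: str, top_k: int = 2) -> List[str]:
    """Rewrite each concept into its top_k most probable entities."""
    options: List[List[str]] = []
    for phrase, kind in label_terms(query):
        if kind == "concept":
            ranked = sorted(TOY_KB[phrase].items(), key=lambda kv: kv[1], reverse=True)
            options.append([entity for entity, _ in ranked[:top_k]])
        else:
            options.append([phrase])
    # Every combination of choices becomes one keyword-style query.
    return [" ".join(choice) for choice in product(*options)]
```

Under these assumptions, the query "database conference in asian city" is labeled with the pattern "concept + keyword + concept", and `interpret` expands it into keyword queries such as "vldb in beijing" that a conventional search engine could answer. A real system would need a far larger knowledge base and a principled way to resolve ambiguous terms, neither of which this sketch attempts.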