A Two-Stage Model for Expert Search

Yunbo Cao, Jingjing Liu, Shenghua Bao, Hang Li, Nick Craswell

MSR-TR-2008-143

This paper is concerned with expert search, a search task in which the user types a query representing a topic and the search system returns a ranked list of people who are considered experts on the topic. The system does this using the evidence existing in a document collection. We proposed a model for performing the task in our TREC 2005 submission, referred to as the two-stage model. Since then, a number of groups have adopted the model in their systems and achieved good performance in their TREC submissions. This paper aims to give a comprehensive description of the two-stage model and to provide more experimental results on its use. The two-stage model is capable of employing many types of association relationships among query terms, documents, and people (experts). The model consists of two parts: a relevance model and a co-occurrence model. The relevance model characterizes the relevance of documents to queries. The co-occurrence model characterizes the various types of co-occurrence between people and terms (i.e., queries). In this paper, we report new experimental results on the two-stage model using the data from both the TREC 2005 and TREC 2006 expert search tasks. We also show that our approach is applicable beyond the TREC W3C corpus by presenting experimental results on expert finding at Microsoft Research. Our experimental results once again demonstrate the effectiveness of the two-stage model.
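
As a rough illustration of how a relevance model and a co-occurrence model might be combined, the sketch below scores each person by summing, over documents, the document's relevance to the query multiplied by the strength of the person's co-occurrence with the query in that document. The abstract does not state the combination rule, so the sum-of-products form is an assumption, and the function name `rank_experts`, its inputs, and the toy numbers are hypothetical.

```python
from collections import defaultdict


def rank_experts(doc_relevance, cooccurrence):
    """Rank people for one query (illustrative sketch, not the authors' code).

    doc_relevance: dict mapping doc_id -> relevance of the document to the
        query (stage 1, the relevance model).
    cooccurrence: dict mapping doc_id -> dict of person -> co-occurrence
        strength between that person and the query terms in the document
        (stage 2, the co-occurrence model).
    """
    scores = defaultdict(float)
    for doc_id, p_doc in doc_relevance.items():
        # Assumed combination: accumulate relevance times co-occurrence
        # over every document that mentions the person.
        for person, p_person in cooccurrence.get(doc_id, {}).items():
            scores[person] += p_doc * p_person
    # Highest-scoring people first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two documents and two candidate experts.
doc_relevance = {"d1": 0.5, "d2": 0.25}
cooccurrence = {
    "d1": {"alice": 0.5, "bob": 0.5},
    "d2": {"bob": 1.0},
}
ranking = rank_experts(doc_relevance, cooccurrence)
```

In this toy data, "bob" is associated with both documents and so accumulates more evidence than "alice", which is the kind of behavior the two-part decomposition is meant to capture.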