Abstract

We introduce a new approach to analyzing click logs that examines both the documents that are clicked and those that are bypassed, that is, documents ranked higher in the search results but skipped by the user. This approach complements the popular click-through rate analysis and helps to draw negative inferences from the click logs. We formulate a natural objective that favors sets of results that are unlikely to be collectively bypassed by a typical user. This is closely related to the problem of reducing query abandonment. We analyze a greedy approach to optimizing this objective and establish theoretical guarantees on its performance. We evaluate our approach on a large set of queries and demonstrate that it compares favorably to the maximal marginal relevance approach on a number of metrics, including mean average precision and mean reciprocal rank.