Abstract

We address the problem of learning large complex ranking functions. Most IR applications use evaluation metrics that depend only upon the ranks of documents. However, most ranking functions generate document scores, which are sorted to produce a ranking. Hence IR metrics are innately non-smooth with respect to the scores, due to the sort. Unfortunately, many machine learning algorithms require the gradient of a training objective in order to perform the optimization of the model parameters, and because IR metrics are non-smooth, we need to find a smooth proxy objective that can be used for training. We present a new family of training objectives that are derived from the rank distributions of documents, induced by smoothed scores. We call this approach SoftRank. We focus on a smoothed approximation to Normalized Discounted Cumulative Gain (NDCG), called SoftNDCG, and we compare it with three other training objectives in the recent literature. We present two main results.