The signals used for ranking in local search are very different from web search: in addition to (textual) relevance, measures of (geographic) distance between the user and the search result, as well as measures of popularity of the result are important for effective ranking. Depending on the query and search result, different ways to quantify these factors exist – for example, it is possible to use customer ratings to quantify the popularity of restaurants, whereas different measures are more appropriate for other types of businesses. Hence, our approach is to capture the different notions of distance/popularity relevant via a number of external data sources (e.g., logs of customer ratings, driving-direction requests, or site accesses).
In this paper we will describe the relevant signal contained in a number of such data sources in detail and present methods to integrate these external data sources into the feature generation for local search ranking. In particular, we propose novel backoff methods to alleviate the impact of skew, noise or incomplete data in these logs in a systematic manner. We evaluate our techniques on both human-judged relevance data as well as click-through data from a commercial local search engine.