Bayesian extension to the language model for ad hoc information retrieval

H. Zaragoza; D. Hiemstra; M. Tipping; Stephen Robertson

SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval | January 2003

Published by ACM Press

We propose a Bayesian extension to the ad-hoc Language Model. Many smoothed estimators used for the multinomial query model in ad-hoc Language Models (including Laplace and Bayes-smoothing) are approximations to the Bayesian predictive distribution. In this paper we derive the full predictive distribution in a form amenable to implementation by classical IR models, and then compare it to other currently used estimators. In our experiments the proposed model outperforms Bayes-smoothing, and its combination with linear interpolation smoothing outperforms all other estimators.
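For orientation, the three baseline estimators named in the abstract can be sketched in their common textbook forms. This is a minimal illustration, not the paper's predictive distribution, which is not reproduced here. The function names, the toy document, the collection probabilities, and the parameter values `mu` and `lam` are all assumptions made for the example.

```python
from collections import Counter

def laplace(term, doc_counts, vocab_size):
    """Add-one estimate: (count + 1) / (doc_length + vocab_size)."""
    doc_len = sum(doc_counts.values())
    return (doc_counts.get(term, 0) + 1) / (doc_len + vocab_size)

def bayes_smoothing(term, doc_counts, coll_prob, mu=1000.0):
    """Dirichlet-prior estimate: (count + mu * p(term|collection)) / (doc_length + mu)."""
    doc_len = sum(doc_counts.values())
    return (doc_counts.get(term, 0) + mu * coll_prob[term]) / (doc_len + mu)

def linear_interpolation(term, doc_counts, coll_prob, lam=0.5):
    """Mix the maximum-likelihood document estimate with the collection model."""
    doc_len = sum(doc_counts.values())
    ml = doc_counts.get(term, 0) / doc_len if doc_len else 0.0
    return lam * ml + (1 - lam) * coll_prob[term]

# Hypothetical toy data: one short document and a four-term collection model.
doc_counts = Counter("bayesian language model language".split())
coll_prob = {"bayesian": 0.1, "language": 0.4, "model": 0.3, "retrieval": 0.2}
vocab = list(coll_prob)

for term in vocab:
    print(
        term,
        round(laplace(term, doc_counts, len(vocab)), 4),
        round(bayes_smoothing(term, doc_counts, coll_prob, mu=10.0), 4),
        round(linear_interpolation(term, doc_counts, coll_prob, lam=0.5), 4),
    )
```

Each estimator assigns non-zero probability to "retrieval", which never occurs in the toy document; avoiding zero probabilities for unseen query terms is the purpose of smoothing in this setting.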