Abstract

In this paper we consider different strategies for constructing click-prediction models that can subsequently be used for audience segmentation and behavioural targeting. In particular, we address the question of whether one should build a separate model for each audience segment or instead build a single model that simultaneously predicts membership in multiple segments. We discuss the pros and cons of both strategies and then investigate empirically which yields better results. We use a recently developed Bayesian model that is capable of combining traditional feature-based modelling with collaborative-filtering techniques. We apply this model to a large set of web log data, harvested from a collection of linked, large commercial websites. In our experiments, multiple Bayesian logistic regression models, each built for a single segment topic, generally produce better results than a single model built against all topics simultaneously. However, there are indications that, at least for some segment topics, allowing the feature representation to depend on the topic can improve the performance of multi-topic models.