Interleaving is an online evaluation technique for comparing the relative quality of information retrieval functions by combining their result lists and tracking user clicks. A sequence of such algorithms has been proposed, each shown to address problems in earlier algorithms. In this paper, we formalize and generalize this process by introducing a formal model: we identify a set of desirable properties for interleaving, then show that an interleaving algorithm can be obtained as the solution to an optimization problem subject to those constraints. Our approach makes explicit the parameters of the algorithm, as well as the assumptions about user behavior. Further, using a novel log-based analysis of user search behavior, we show that our approach yields an unbiased interleaving algorithm that is more efficient than any previous approach.
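To make the general idea concrete, the following is a minimal sketch of a classic baseline interleaving scheme (team-draft style), not the optimized algorithm this paper derives; the document names, ranker labels, and helper functions are illustrative assumptions.

```python
import random

def team_draft_interleave(list_a, list_b, seed=0):
    """Sketch of team-draft interleaving (a baseline, not this paper's
    optimized method): rankers 'A' and 'B' alternately draft their
    highest-ranked not-yet-shown document into a combined list, and we
    record which ranker contributed each position."""
    rng = random.Random(seed)
    lists = {"A": list_a, "B": list_b}
    idx = {"A": 0, "B": 0}      # next candidate position in each list
    count = {"A": 0, "B": 0}    # documents contributed by each ranker
    result, team, seen = [], [], set()

    def has_next(side):
        # Advance past documents already placed in the combined list.
        while idx[side] < len(lists[side]) and lists[side][idx[side]] in seen:
            idx[side] += 1
        return idx[side] < len(lists[side])

    while has_next("A") or has_next("B"):
        # The ranker with fewer contributions drafts next; ties are
        # broken by a coin flip.
        if count["A"] < count["B"]:
            side = "A"
        elif count["B"] < count["A"]:
            side = "B"
        else:
            side = "A" if rng.random() < 0.5 else "B"
        if not has_next(side):
            side = "B" if side == "A" else "A"
        doc = lists[side][idx[side]]
        result.append(doc)
        team.append(side)
        seen.add(doc)
        count[side] += 1
    return result, team

def credit_clicks(team, clicked_ranks):
    """Credit each observed click to the ranker that contributed that
    position; the ranker with more credited clicks wins the comparison."""
    wins = {"A": 0, "B": 0}
    for rank in clicked_ranks:
        wins[team[rank]] += 1
    return wins
```

For example, interleaving `["d1", "d2", "d3"]` with `["d2", "d4", "d1"]` produces a single duplicate-free list covering both rankings, and clicks on the combined list are attributed back to whichever ranker supplied the clicked document.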