Learning to Rank for IR with Neural Nets (AFIRM 2020)

Learning to rank (LTR) for information retrieval (IR) applies machine learning (ML) models to rank artifacts, such as documents or entities, in response to a user’s information need, which may be expressed as a short text query. LTR models typically use training data, such as human relevance labels and click data, to discriminatively train towards an IR objective. This tutorial focuses on applying neural network models, with shallow or deep architectures, to LTR tasks in the context of text-based search systems.
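To make "discriminatively train towards an IR objective" concrete, here is a minimal sketch in plain Python. It is an illustration only and is not taken from the tutorial material. It assumes a linear scoring function and a RankNet-style pairwise logistic loss. The toy feature vectors, the document ids, and the relevance preferences are all hypothetical.

```python
import math

def score(weights, features):
    """Linear scoring function: dot product of weights and a feature vector."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_logistic_loss(s_pos, s_neg):
    """Pairwise logistic loss: small when the relevant item outscores the other."""
    return math.log(1.0 + math.exp(-(s_pos - s_neg)))

def train(pairs, num_features, lr=0.1, epochs=200):
    """Fit weights by gradient descent over (relevant, non_relevant) feature pairs."""
    weights = [0.0] * num_features
    for _ in range(epochs):
        for pos, neg in pairs:
            diff = score(weights, pos) - score(weights, neg)
            # Derivative of log(1 + exp(-diff)) with respect to diff
            grad_scale = -1.0 / (1.0 + math.exp(diff))
            for i in range(num_features):
                weights[i] -= lr * grad_scale * (pos[i] - neg[i])
    return weights

def rank(weights, docs):
    """Return document ids sorted by descending model score."""
    return sorted(docs, key=lambda d: score(weights, docs[d]), reverse=True)

# Hypothetical toy data: two features per query-document pair
docs = {"d1": [0.9, 0.2], "d2": [0.4, 0.1], "d3": [0.1, 0.8]}
# Hypothetical preferences from relevance labels: d1 > d2 > d3
pairs = [
    (docs["d1"], docs["d2"]),
    (docs["d2"], docs["d3"]),
    (docs["d1"], docs["d3"]),
]

weights = train(pairs, num_features=2)
ranking = rank(weights, docs)
print(ranking)  # ['d1', 'd2', 'd3']
```

A neural LTR model would replace the linear `score` function with a shallow or deep network, while the idea of learning from preference pairs stays the same.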

The tutorial will consist of two lectures, followed by a hands-on lab session. Some of the lecture topics will be based on content presented in the “Deep Learning for Search” tutorial at AFIRM 2019. However, this proposed tutorial will put a stronger emphasis on the fundamentals, and the course plan will therefore organize the topics differently from last year. We will begin the first lecture by refreshing fundamental concepts in IR and neural networks, and will then introduce the LTR framework. The second lecture will review deep neural architectures and their applications to search tasks. The hands-on lab session will give attendees an opportunity to implement, train, and evaluate their own LTR models on publicly available IR benchmarks. In particular, we will demonstrate traditional feature-based LTR models on the LETOR dataset. We will conclude the hands-on portion of the course with demonstrations of deep neural ranking models on the TREC 2019 deep learning task (using the MS MARCO dataset). All demonstrations will use Python and the popular neural network toolkit PyTorch.

Bhaskar Mitra, Nick Craswell, Emine Yilmaz, and Daniel Campos