About
I am a Senior Applied Scientist in the Microsoft Ads AI team.
I completed my PhD at the Information Retrieval Lab at the University of Amsterdam, supervised by Prof. Maarten de Rijke and Prof. Harrie Oosterhuis. My research focused on off-policy evaluation and learning for ranking and recommendation systems, developing methods to improve models using logged user interactions.
During my PhD, I worked at Meta AI on applications of these ideas. In New York, I developed reinforcement learning methods for fine-tuning text-to-image diffusion models, and in London, I worked on off-policy learning for large-scale recommendation systems and mixture-of-experts architectures.
My research interests include machine learning, information retrieval, contextual bandits, off-policy methods, and reinforcement learning for post-training of foundation models. I have also prepared a slide deck summarizing my work and perspective on RL for search and recommendation systems.
Before my PhD, I was a data scientist at Flipkart in India, working on search ranking and query understanding. I hold a research master’s degree from IIIT Hyderabad.
For more information about my publications, see my personal website: https://shashank-gupta.com/.