Tutorial Session B – Learning to Interact


January 6, 2014


John Langford




Machine learning does magical things when it starts interacting with and changing the world, yet most algorithms are not designed to do this. Systematically gathering the right data is the first-order problem for learning with interaction. One simplistic example is ad recommendation, where a high recommendation implies high placement, which implies high click-through, which implies high recommendation, and so on, creating a self-fulfilling prophecy. This talk is about how to systematically avoid these problems by effectively (re)using randomization to engage in controlled exploration for learning algorithms. With these techniques, we can exponentially reduce the amount of exploration required, test many policies offline, and repurpose our existing learning algorithms to directly solve for optimal policies.
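
As a rough illustration of testing policies offline from randomized data, the sketch below logs actions chosen uniformly at random together with the probability of each choice, then scores candidate policies with an inverse propensity score estimate. It is not taken from the talk: the click probabilities in CLICK_PROB, the function names, and the two example policies are all hypothetical, and uniform exploration is only one simple way to randomize.

```python
import random

# Hypothetical click probabilities, indexed as CLICK_PROB[context][action].
# These numbers are made up for illustration only.
CLICK_PROB = [[0.1, 0.5, 0.2],
              [0.6, 0.1, 0.3]]


def collect_logs(n, rng):
    """Simulate n interactions under a uniform-random logging policy.

    Each record keeps the probability with which the action was chosen,
    which is what makes later offline evaluation possible.
    """
    n_contexts = len(CLICK_PROB)
    n_actions = len(CLICK_PROB[0])
    logs = []
    for _ in range(n):
        context = rng.randrange(n_contexts)
        action = rng.randrange(n_actions)  # controlled exploration
        prob = 1.0 / n_actions             # logged propensity
        clicked = rng.random() < CLICK_PROB[context][action]
        logs.append((context, action, prob, 1.0 if clicked else 0.0))
    return logs


def ips_value(logs, policy):
    """Inverse propensity score estimate of a policy's average reward.

    Only records where the policy agrees with the logged action count,
    and each is reweighted by 1 / (probability of the logged action).
    """
    total = 0.0
    for context, action, prob, reward in logs:
        if policy(context) == action:
            total += reward / prob
    return total / len(logs)


def true_value(policy):
    """Exact expected reward, available here only because we simulate."""
    n_contexts = len(CLICK_PROB)
    return sum(CLICK_PROB[c][policy(c)] for c in range(n_contexts)) / n_contexts


def always_first(context):
    """Hypothetical policy that ignores the context."""
    return 0


def context_aware(context):
    """Hypothetical policy that picks the best action per context."""
    return 1 if context == 0 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    logs = collect_logs(50000, rng)
    for policy in (always_first, context_aware):
        print(policy.__name__, ips_value(logs, policy), true_value(policy))
```

The same log is reused to score both policies, which is the sense in which many policies can be tested offline; without the recorded probabilities the reweighting step would not be possible.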


John Langford

John Langford is a machine learning research scientist, a field which he says “is shifting from an academic discipline to an industrial tool”. He is the author of the weblog hunch.net and the principal developer of Vowpal Wabbit. John works at Microsoft Research New York, of which he was one of the founding members, and was previously affiliated with Yahoo! Research, Toyota Technological Institute, and IBM’s Watson Research Center. He studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor’s degree in 1997, and received his Ph.D. in Computer Science from Carnegie Mellon University in 2002. He was the program co-chair for the 2012 International Conference on Machine Learning.