Infer.NET: Machine Learning Tailor-Made
Infer.NET is an example of model-based machine learning, as explained by Tom Minka from the Cambridge lab during a morning talk.
“It’s about trying to get more people to try machine learning,” said Minka, a senior researcher. “The traditional approach to this is that experts build prepackaged learners that are very generic and apply in a robust way to different data sets. But the problem with that approach is that it doesn’t account for domain knowledge. In lots of areas where we want to use machine learning, such as vision or speech or ecology, there is very strong domain knowledge.
“What model-based machine learning lets you do is construct a model of your domain knowledge. It builds a learning algorithm directly from that model, so instead of having to map your problem into a pre-existing learning algorithm you’ve been given, it actually constructs a learning algorithm for you based on the model you’ve provided.”
Extending the morning discussion, John Guiver, a senior research software-design engineer from Microsoft Research Cambridge, delivered an early-afternoon talk called Model-Based Machine Learning Tutorial with Infer.NET.
That presentation took a look at how to design and build such a tailor-made model. Guiver talked about data visualization and analysis—and how that, combined with knowledge about how the data were collected, can deliver a deep understanding of the data that can help in the construction of early models. Those initial attempts enable new ideas about how the model can be improved.
Through an iterative process, though, the models can get quite complex. Infer.NET enables the models to be written in code, which can take much less time and space than a graphical representation of a complicated model.
“To write an inference algorithm for a model could be quite difficult,” Guiver said. “It could take quite a few weeks to write down the inference equations to solve a problem and to ask questions of a model. With a model-based approach, the inference is handled for you.”
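To make the split between model and inference concrete, here is a minimal sketch in plain Python. It is not Infer.NET's API, and it makes no claim about how Infer.NET works internally. The `infer` function, the `two_coins` model, and the variable names are all assumptions made up for illustration. The model only states how the variables relate, while a generic routine answers questions about it by brute-force enumeration, which is practical only for tiny discrete models like this one.

```python
from itertools import product


def infer(variables, joint, observed):
    """Exact inference by enumeration (illustrative, not Infer.NET's method).

    variables: dict mapping each variable name to its list of possible values
    joint:     function taking a full assignment dict, returning its probability
    observed:  dict of variable name -> observed value
    Returns a dict of posterior marginals for every unobserved variable.
    """
    hidden = [name for name in variables if name not in observed]
    marginals = {name: {v: 0.0 for v in variables[name]} for name in hidden}
    total = 0.0
    for values in product(*(variables[name] for name in hidden)):
        assignment = dict(observed)
        assignment.update(zip(hidden, values))
        p = joint(assignment)
        total += p
        for name in hidden:
            marginals[name][assignment[name]] += p
    if total == 0.0:
        raise ValueError("observations are impossible under this model")
    return {
        name: {v: p / total for v, p in dist.items()}
        for name, dist in marginals.items()
    }


# The model: two independent fair coins, plus a derived variable that is
# true only when both coins come up heads.
VARIABLES = {
    "first": [True, False],
    "second": [True, False],
    "both_heads": [True, False],
}


def two_coins(a):
    consistent = a["both_heads"] == (a["first"] and a["second"])
    return 0.5 * 0.5 if consistent else 0.0


# Ask a question of the model: given that the coins were NOT both heads,
# how likely is it that the first coin was heads?
posterior = infer(VARIABLES, two_coins, {"both_heads": False})
print(posterior["first"][True])
```

Changing the question, for example by observing `first` instead of `both_heads`, requires no new inference code. Only the `observed` argument changes.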
And then, there was John Bronskill, a partner architect who, like Minka and Guiver, works in the Cambridge lab’s Machine Learning and Perception group. During a mid-afternoon collection of machine-learning-related demos, Bronskill’s was particularly popular because of its Infer.NET movie-recommendation system, one already in use by subscribers to Xbox LIVE.
The demo consisted of a display with dozens of squares of various hues, each with the name of a film at its center. On each side of a touchscreen was a column, the one to the left in red and titled Movies I Don’t Like, and, on the right, a green companion called Movies I Like.
Initially, the movie-title squares were arrayed randomly in the center of the display. Once a user began dragging the squares of preferred movies to the right column and those of disliked movies to the left, other squares began to congregate along the edges. The inference was that the movies gravitating toward Movies I Like would appeal to the user, and those moving in the other direction would be ones to avoid.
“This system was trained offline on a large database of movies and users,” Bronskill explained, “so we have 20,000 movies and millions of users, and some users rated some movies. The idea is that we want to learn from that data to predict what someone encountering the system would like or dislike. It knows where you are in the space it knows about movie watchers. It figures out you must be like this person or that, and they like this so you must like it, too.”
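The "you must be like this person or that" idea can be sketched in a few lines of Python. This is a deliberately simple neighbor-based stand-in, not the probabilistic model behind the demo, whose details the talk did not spell out. The user names, film titles, ratings, and the `agreement` and `predict` helpers are all hypothetical.

```python
# Hypothetical ratings: +1 means "liked", -1 means "disliked".
RATINGS = {
    "user_a": {"Film A": 1, "Film B": 1, "Film C": -1, "Film D": 1, "Film E": -1},
    "user_b": {"Film A": -1, "Film B": -1, "Film C": 1, "Film D": -1, "Film E": 1},
    "user_c": {"Film A": 1, "Film C": -1, "Film D": 1},
}


def agreement(mine, theirs):
    """Average agreement on shared films: +1 identical taste, -1 opposite."""
    shared = [film for film in mine if film in theirs]
    if not shared:
        return 0.0
    return sum(mine[film] * theirs[film] for film in shared) / len(shared)


def predict(mine, film, ratings=RATINGS):
    """Score in [-1, 1]; positive suggests the user would like the film."""
    score = 0.0
    weight = 0.0
    for theirs in ratings.values():
        if film not in theirs:
            continue
        w = agreement(mine, theirs)
        # A user with opposite taste still helps: their dislike counts as a like.
        score += w * theirs[film]
        weight += abs(w)
    return score / weight if weight else 0.0


# A newcomer drags one film to "like" and one to "don't like".
newcomer = {"Film A": 1, "Film C": -1}
for film in ("Film D", "Film E"):
    print(film, predict(newcomer, film))
```

With these made-up ratings, Film D gets a positive score and Film E a negative one, mirroring how squares in the demo drifted toward one column or the other after only a couple of choices.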
The demo might be based on movies, but it could be extended to any number of domains.
“It’s a product-recommendation system,” Bronskill confirmed. “This could be music, it could be books, it could be jokes, it could be a dating service.”
It, too, is a manifestation of what is possible using Infer.NET.
“Infer.NET has building blocks to build models to build whatever phenomena you want to model,” Bronskill concluded. “It has probability distributions and operations that work on probabilistic variables that could recommend movies, classify a cell as cancerous, or model tree growth or lung function.
“It could model any sort of natural phenomenon. It gives you the building blocks to build the model, and once the model is built, you can infer unknown variables in that model.”