Microsoft Research Blog

The Microsoft Infer.NET machine learning framework goes open source

October 5, 2018 | By Yordan Zaykov, Principal Research Software Engineering Lead

It isn’t every day that one gets to announce that one of the top-tier cross-platform frameworks for model-based machine learning is open to one and all worldwide. We’re extremely excited today to open source Infer.NET on GitHub under the permissive MIT license for free use in commercial applications.

Open sourcing Infer.NET represents the culmination of a long and ambitious journey. Our team at Microsoft Research in Cambridge, UK embarked on developing the framework back in 2004. We’ve learned a lot along the way about making machine learning solutions that are scalable and interpretable. Infer.NET initially was envisioned as a research tool and we released it for academic use in 2008. As a result, there have been hundreds of papers published using the framework across a variety of fields, everything from information retrieval to healthcare. In 2012 Infer.NET even won a Patents for Humanity award for aiding research in epidemiology, genetic causes of disease, deforestation and asthma.

Over time, the framework has evolved from a research tool to being the machine learning engine in a number of Microsoft products in Office, Xbox and Azure. A recent example is TrueSkill 2 – a system that matches players in online video games. Implemented in Infer.NET, it is running live in the bestselling titles Halo 5 and Gears of War 4, processing millions of matches.

But in an age of abundance of machine learning libraries, what sets Infer.NET apart from the competition? Infer.NET enables a model-based approach to machine learning. This lets you incorporate domain knowledge into your model. The framework can then build a bespoke machine learning algorithm directly from that model. This means that instead of having to map your problem onto a pre-existing learning algorithm that you’ve been given, Infer.NET actually constructs a learning algorithm for you, based on the model you’ve provided.

An added advantage of model-based machine learning is interpretability. If you have designed the model yourself and the learning algorithm follows that model, then you can understand why the system behaves in a particular way or makes certain predictions. As machine learning applications gradually enter our lives, understanding and explaining their behavior becomes increasingly important.

Model-based machine learning also naturally applies to problems with certain data traits, such as real-time data, heterogeneous data, insufficient data, unlabelled data, data with missing parts and data collected with known biases. Indeed, if you’ve read this far then it’s a good bet you’re interested in learning more about model-based machine learning. It just so happens that the Infer.NET team has written an awesome online book on the subject and it’s absolutely free.

In Infer.NET, models are described using a probabilistic program. This may seem like an oxymoron but is actually a powerful concept used to describe real-world processes in a language that machines understand. Infer.NET compiles the probabilistic program into high-performance code for implementing something cryptically called deterministic approximate Bayesian inference. This approach allows substantial scalability – for example, we use it in a system that automatically extracts knowledge from billions of web pages, comprising petabytes of data.
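To make the idea of a probabilistic program concrete, here is a minimal sketch in Python. It is not Infer.NET code and does not use its API: the two-coin model, the function names and the constant `PRIOR_HEADS` are all hypothetical, chosen only for illustration. It also answers the query by exact enumeration, which only works for tiny discrete models, rather than by the compiled deterministic approximate inference described above. What it shares with the approach is the workflow: describe how the data is generated, observe part of it, and ask for the posterior over the rest.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: two fair coins and a variable derived from them.
PRIOR_HEADS = Fraction(1, 2)


def joint():
    """Enumerate every (first, second, both_heads) outcome with its probability."""
    for first, second in product([True, False], repeat=2):
        p_first = PRIOR_HEADS if first else 1 - PRIOR_HEADS
        p_second = PRIOR_HEADS if second else 1 - PRIOR_HEADS
        yield first, second, first and second, p_first * p_second


def posterior_first_heads(observed_both_heads):
    """P(first coin is heads | both_heads == observed value), by exact enumeration."""
    matching = [(f, p) for f, _, b, p in joint() if b == observed_both_heads]
    evidence = sum(p for _, p in matching)
    return sum(p for f, p in matching if f) / evidence


# Observing that the coins did NOT both land heads lowers our belief
# that the first coin was heads from 1/2 to 1/3.
print(posterior_first_heads(False))  # 1/3
```

The model (the `joint` function) is written once, and different questions are answered by changing only what is observed. Enumeration stops being feasible as soon as models grow, which is where a compiler that generates a tailored inference algorithm becomes useful.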

The use of deterministic inference algorithms is complementary to the predominantly sampling-based methods of most other probabilistic programming frameworks. A key capability of our approach is support for online Bayesian inference – the ability of the system to learn as new data arrives. We have observed that this is essential in business and consumer products that interact with users in real time. For example, in the aforementioned TrueSkill 2 system, in order to provide competitive matches, we need to update the skills of the players immediately following each round. And we do so in just a millisecond.
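Online Bayesian inference can be illustrated with a small Python sketch. This is an assumption-laden toy, not TrueSkill 2 and not Infer.NET's API: the `Beta` class and the sequence of outcomes are hypothetical, and a conjugate Beta-Bernoulli model stands in for the far richer skill models used in production. The point it shows is the pattern itself: the posterior after each observation becomes the prior for the next, so the system never needs to revisit old data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) belief over an unknown success probability."""

    alpha: float
    beta: float

    def update(self, success: bool) -> "Beta":
        # Conjugate update: the posterior after one observation is again a Beta,
        # so it can serve directly as the prior for the next observation.
        return Beta(self.alpha + success, self.beta + (not success))

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


belief = Beta(1.0, 1.0)  # uniform prior: nothing known yet
for outcome in [True, True, False, True]:
    belief = belief.update(outcome)  # learn as each data point arrives

print(belief)  # Beta(alpha=4.0, beta=2.0)
```

Each update touches only two numbers, which is why this style of learning is cheap enough to run after every event. The final belief is the same one a batch computation over all four outcomes would produce.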

To sum up, you’d want to use Infer.NET when you have extensive knowledge of the problem domain, when interpreting the behavior of the system matters to you, or when you have a production system that needs to learn as new data arrives.

The Infer.NET team is looking forward to engaging with the open-source community in developing and growing the framework further. Infer.NET will become a part of ML.NET – the machine learning framework for .NET developers. We have already taken several steps towards integration with ML.NET, like setting up the repository under the .NET Foundation and moving the package and namespaces to Microsoft.ML.Probabilistic. Infer.NET will extend ML.NET for statistical modelling and online learning.

Interested in Infer.NET? Download the framework here. Support for Windows, Linux and macOS is provided through .NET Core. Our Tutorials and Examples page gives a taste of what models can be implemented using Infer.NET, and the documentation also contains a detailed User Guide. You are warmly invited to join us on GitHub if you want to contribute!

The Infer.NET Team. Top row, left to right: Martin Kukla, John Guiver, Tom Minka, John Winn, Sam Webster, Dany Fabian. Bottom row, left to right: Pavel Myshkov, Yordan Zaykov, Alex Spengler.

