Microsoft Research Blog

AutoML Challenge: A leap forward for machine learning competitions

June 21, 2016 | By Microsoft blog editor

By Isabelle Guyon, Professor, University Paris-Saclay, and President, ChaLearn

If you are attending this year’s ICML conference in New York City, June 19–24, be sure to drop by the AutoML workshop and congratulate Team AAD Freiburg, the winners of the Automatic Machine Learning (AutoML) Challenge. Led by Frank Hutter, who co-developed SMAC and Auto-WEKA, the winning team delivered auto-sklearn, an open-source tool that provides a wrapper around the Python library scikit-learn. The Intel team, led by Eugene Tuv, ran head to head with them in most phases using a proprietary solution, a fast implementation of tree-based methods in C/C++.

In recent years, challenges have emerged as a means of crowdsourcing machine learning. Naturally, some organizers have started trying to automate the process of participation in competitions in order to save time and maximize profit.

CodaLab Competitions, an open-source challenge platform, has made it possible to easily organize machine learning challenges with code submission. Running on Microsoft Azure, the platform provides free compute time and enables unbiased evaluation by executing submitted code under the same conditions for all participants. This made it possible for the AutoML Challenge to test whether machine learning code could operate without any human intervention under strict execution time and memory usage constraints.

AutoML

The AutoML Challenge ran for 18 months, from 2014 to 2016. Participants worked to develop fully automatic “black-box” learning machines for feature-based classification and regression problems. Over the course of 5 consecutive rounds, they were exposed to 30 datasets from a wide variety of application domains. In each new round, the participants’ code underwent a blind test on 5 new datasets. Several teams succeeded in delivering real AutoML software that could be trained and tested without human intervention within 20 minutes on an 8-core machine, regardless of the type of dataset or its level of complexity.
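To make the idea of a time-limited “black-box” learner concrete, here is a minimal sketch of a search that tries candidate models until a wall-clock budget runs out and then returns the best one found. It is not auto-sklearn or any participant’s method. The toy dataset, the tiny nearest-neighbor model, and the names `budgeted_search`, `knn_predict` and `accuracy` are all assumptions made up for illustration.

```python
import random
import time


def knn_predict(train, k, x):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, valid, k):
    """Fraction of validation points that k-NN labels correctly."""
    hits = sum(knn_predict(train, k, x) == y for x, y in valid)
    return hits / len(valid)


def budgeted_search(train, valid, candidates, budget_seconds):
    """Try candidate settings in order until the time budget runs out.

    Hypothetical helper: at least one candidate is always evaluated, so a
    model is returned even when the budget is zero.
    """
    deadline = time.monotonic() + budget_seconds
    best_k, best_score = None, -1.0
    for k in candidates:
        if best_k is not None and time.monotonic() >= deadline:
            break  # out of time: keep the best model found so far
        score = accuracy(train, valid, k)
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Toy 1-D dataset (made up for this sketch): the label is 1 when x > 0.5.
rng = random.Random(0)
points = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(200))]
train, valid = points[:150], points[150:]

best_k, best_score = budgeted_search(train, valid, [1, 3, 5, 7], budget_seconds=2.0)
print(best_k, best_score)
```

The point of the sketch is the deadline check: no person decides when to stop or which model to keep. A real system would also have to enforce memory limits and handle many dataset types, which this sketch does not attempt.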

Participants could also enter the challenge without submitting code by running the learning machines on their own local computers and submitting only results: Following each AutoML phase, the newly introduced datasets were released (labeled training data and unlabeled validation and test data), and the participants were able to manually tune their models for over a month during “Tweakathon” phases. We have more details on the solutions and what we learned in a paper, which will be presented at the ICML AutoML workshop.

When we look closely at the results of the challenge, we can see that there is still considerable room for improvement. For one thing, there is a significant gap between Tweakathon and AutoML results, indicating that the “automatic” algorithms can be further optimized. Nonetheless, this challenge has resulted in a leap forward for the field in terms of automation.

Please join us in congratulating the AutoML Challenge winners. By making their solution publicly available, AAD Freiburg has set a great precedent. We are grateful for their contribution. Imagine what the impact on the data science industry would be if all the successful software were shared.

If you missed the challenge, or just want to know more about the details, the winners’ code and the presentation material from several satellite events (hackathons and workshops) are available at ChaLearn’s website. By the way, if you think you can beat the winners, the CodaLab platform remains open for post-challenge submissions!
