Automating the construction and tuning of machine learning (ML) models has long been a goal of the ML community. Several factors drive this, most notably a sharp increase in demand for tailored AI solutions, a relative scarcity of trained ML scientists, and the rise of deep learning models with complex architectures that require careful design and fine-tuning. Existing automated machine learning (AutoML) techniques have been remarkably successful at identifying good hyperparameters for a given model, sometimes even outperforming humans. However, these techniques either take too long to run or handle only a handful of hyperparameters. Our research centers on creating AutoML techniques that outperform existing approaches.
Discussing an algorithm and different ways to choose the next machine learning pipeline. (L-R) Seated at the table, Microsoft’s Sharon Gillett and Paul Oka; standing, Evan Green, Nicolo Fusi, Gilbert Hendry, Paolo Casale and Rishit Sheth. Photo by Dana J. Quigley for Microsoft.
How it works
Our solution to this problem leverages thousands of experiments performed on hundreds of different datasets. Now running in preview in Azure Machine Learning services, our approach runs a few models, with hyperparameters tuned in various ways, on a user’s new dataset to learn how accurately each pipeline predicts. That information informs the next set of recommendations, over hundreds of iterations.
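The recommend-evaluate-update loop can be sketched as follows. This is a minimal illustration, not the production system: the candidate pipelines, their simulated accuracies, and the greedy selection rule are all assumptions made for the example.

```python
# Hypothetical candidate pipelines (model family, hyperparameter setting)
# and their simulated accuracies on the user's new dataset. In the real
# system each evaluation trains a full ML pipeline; here it is a lookup.
TRUE_ACCURACY = {
    ("logreg", 0.1): 0.81,
    ("logreg", 1.0): 0.84,
    ("forest", 50): 0.88,
    ("forest", 200): 0.90,
}

# Predicted accuracies learned from runs on previous datasets
# (illustrative numbers).
PRIOR_PREDICTION = {
    ("logreg", 0.1): 0.80,
    ("logreg", 1.0): 0.83,
    ("forest", 50): 0.87,
    ("forest", 200): 0.89,
}

def evaluate(pipeline):
    """Stand-in for training a pipeline and scoring its predictions."""
    return TRUE_ACCURACY[pipeline]

def recommend_next(observed, predicted):
    """Greedily pick the unevaluated pipeline with the best predicted score."""
    remaining = {p: s for p, s in predicted.items() if p not in observed}
    return max(remaining, key=remaining.get)

def automl_loop(predicted, budget):
    """Iteratively recommend, evaluate, and record candidate pipelines."""
    observed = {}
    for _ in range(budget):
        pipeline = recommend_next(observed, predicted)
        observed[pipeline] = evaluate(pipeline)
        # The real system also updates `predicted` for the remaining
        # candidates from each new observation.
    return max(observed, key=observed.get)
```

Running `automl_loop(PRIOR_PREDICTION, budget=2)` evaluates the two most promising candidates and returns `("forest", 200)` as the best pipeline found.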
Compared with other automated approaches, our method outperforms them in classification accuracy by a factor of 2 to 200, depending on the task. It even holds up against human data scientists: in Kaggle competitions, our approach often beats 95 percent of the competitors.
The heart of our current research is a probabilistic latent variable model with additional structure over the latent space, designed to work with deep neural networks (DNNs) without fully training them. The aim is to replace the training of hundreds of different architectures for hundreds or thousands of cycles each on the training set.
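The matrix-factorization intuition behind such a latent variable model can be sketched in a few lines of numpy: performance of pipelines across datasets forms a partially observed matrix, latent factors are fit to the observed entries, and the missing entries (pipelines never run on a dataset) are predicted from the latent space. All sizes, the simulated data, and the plain gradient-descent fit are illustrative assumptions, not the actual probabilistic model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated performance matrix: rows = datasets, cols = pipelines,
# generated from a true low-rank structure, then partially observed.
n_datasets, n_pipelines, rank = 20, 15, 3
U_true = rng.normal(size=(n_datasets, rank))
V_true = rng.normal(size=(n_pipelines, rank))
Y = U_true @ V_true.T
mask = rng.random(Y.shape) < 0.5          # which pipelines were actually run

# Fit latent factors by gradient descent on the observed entries only.
U = rng.normal(scale=0.1, size=(n_datasets, rank))
V = rng.normal(scale=0.1, size=(n_pipelines, rank))
lr, lam = 0.01, 0.01
for _ in range(3000):
    R = mask * (U @ V.T - Y)              # residual on observed entries
    U -= lr * (R @ V + lam * U)
    V -= lr * (R.T @ U + lam * V)

# Predict performance of pipelines never evaluated on a dataset.
pred = U @ V.T
held_out_error = np.abs(pred - Y)[~mask].mean()
```

The held-out error is far smaller than the scale of the matrix entries, which is the point: the latent space predicts how a pipeline would perform without running it.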
We are also developing several techniques to reason about the latent space identified by our method. This will allow us to produce stacked ensembles of models that further boost accuracy.
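One way to picture a stacked ensemble is below, sketched with numpy on simulated validation-set predictions. The base models, noise levels, and least-squares meta-learner are stand-ins chosen for the example, not the pipelines our method actually produces.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated validation-set labels and three base pipelines' predicted
# probabilities, with varying noise levels (all numbers illustrative).
y = rng.integers(0, 2, size=500).astype(float)
base_preds = np.stack([
    np.clip(y + rng.normal(scale=s, size=y.size), 0, 1)
    for s in (0.45, 0.5, 0.6)
])

# Stacking: fit a meta-learner (here a linear model with intercept,
# via least squares) on the base models' held-out predictions.
X = np.column_stack([np.ones_like(y)] + list(base_preds))
w, *_ = np.linalg.lstsq(X, y, rcond=None)
stacked = X @ w

def accuracy(p):
    """Threshold probabilities at 0.5 and score against the labels."""
    return ((p > 0.5) == y).mean()
```

On this simulated data the stacked combination scores higher than any single base model, since the meta-learner downweights the noisier predictors.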
Finally, by reasoning about the latent space, we will also be able to ideate and synthesize novel DNN architectures and novel ML pipelines that lie outside the set of solutions usually explored by ML researchers.
For those interested in applications of this research in Microsoft products, check out Azure Machine Learning and Power BI.