All about automated machine learning with Dr. Nicolo Fusi
Episode 43, September 26, 2018
You may have heard the phrase, necessity is the mother of invention, but for Dr. Nicolo Fusi, a researcher at the Microsoft Research lab in Cambridge, Massachusetts, the mother of his invention wasn’t so much necessity as it was boredom: the special machine learning boredom of manually fine-tuning models and hyper-parameters that can eat up tons of human and computational resources, but bring no guarantee of a good result. His solution? Automate machine learning with a meta-model that figures out what other models are doing, and then predicts how they’ll work on a given dataset.
On today’s podcast, Dr. Fusi gives us an inside look at Automated Machine Learning – Microsoft’s version of the industry’s AutoML technology – and shares the story of how an idea he had while working on a gene editing problem with CRISPR/Cas9 turned into a bit of a machine learning side quest and, ultimately, a surprisingly useful instantiation of Automated Machine Learning – now a feature of Azure Machine Learning – that reduces dependence on intuition and takes some of the tedium out of data science at the same time.
Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.
Host: You may have heard the phrase, necessity is the mother of invention, but for Dr. Nicolo Fusi, a researcher at the Microsoft Research lab in Cambridge, Massachusetts, the mother of his invention wasn’t so much necessity as it was boredom: the special machine learning boredom of manually fine-tuning models and hyper-parameters that can eat up tons of human and computational resources, but bring no guarantee of a good result. His solution? Automate machine learning with a meta-model that figures out what other models are doing, and then predicts how they’ll work on a given dataset.
On today’s podcast, Dr. Fusi gives us an inside look at Automated Machine Learning – Microsoft’s version of the industry’s AutoML technology – and shares the story of how an idea he had while working on a gene editing problem with CRISPR-Cas9, turned into a bit of a machine learning side quest, and, ultimately, a surprisingly useful instantiation of Automated Machine Learning – now a feature of Azure Machine Learning – that reduces dependence on intuition and takes some of the tedium out of data science at the same time. That and much more on this episode of the Microsoft Research Podcast.
Host: Nicolo Fusi, welcome to the podcast.
Nicolo Fusi: Thank you, it’s great to be here.
Host: So, you lead the Automated Machine Learning efforts at Microsoft Research in Cambridge, Massachusetts. I really want to wade into the technical weeds with you in a bit, but for right now, in broad strokes, tell us about your work. What gets you up in the morning?
Nicolo Fusi: Yeah, it’s interesting, because my background is in machine learning and I got very excited about problems in computational biology. And so I did a lot of work in computational biology and then during that work, I kind of figured, oh, there are so many machine learning problems that you can solve that are interesting and apply to a wide range of things. And so, I kind of went back a little bit in machine learning. So, most recently, as you said, I’m working on automated machine learning which is a field where the goal is to kind of automate as much of the machine learning process as possible. That goes from data preparation to model criticism, for instance, once we come up with a model.
Host: Ok. So, drill in a little bit on this idea of computational biology.
Nicolo Fusi: So, computational biology is an enormous field. There are many kind of different people doing different things from proteomics to genomics. Some of them are using mathematical tools. Some of them are more probabilistic, or statistical, in nature. So, my slice of this world was using machine learning and statistics to kind of investigate molecular mechanism. And in particular I was working on genetics, mostly. I also worked on functional genomics, but genetics was the most formative part of my training.
Host: So, we talked a little bit about your interest in machine learning, computational biology and medicine, in fact. So those are three sort of divergent paths – well, there’s some overlap on the Venn diagram – but how did those all come together for you?
Nicolo Fusi: I started in machine learning, and you know machine learning you can do either applied work – uh, you pick a problem, you apply machine learning to it, it always comes with its own set of challenges – or you can pick something more theoretical and maybe advance the way people do modeling or infer the parameters of a model, for instance. And when it came to starting my PhD, I had the choice of problems from both fields and, due to personal circumstances, I felt like I needed to do something that had an effect on human health. But I also thought that human health was too far from machine learning. And in some sense, to this day, I still think that if you want to do medicine with machine learning, I think you need to stop somewhere in between first. Like kind of break up your journey. And I think you probably should break up your journey at the molecular level, which is where computational biology comes in. So, I started working on solving questions in computational biology using machine learning with the goal of later, kind of going from computational biology to medicine, again using machine learning.
Host: That’s fascinating. Before we launch into your specific work in the field, let’s talk a little more generally about automated machine learning. Forbes Magazine had an article where the author claimed that it was set to become “the future of AI.” Is that overstatement?
Nicolo Fusi: Well, in general in AI right now, there is a lot of “This is the future of AI! That is the future of AI!” And it would be great, as somebody who does a lot of AutoML research, if AutoML will single-handedly be the future of all of AI. But I think it’s going to be a huge component. And I think, more than AutoML, it’s probably going to be meta-learning, if one has to put names on fields.
Host: Interesting, yeah.
Nicolo Fusi: Because meta-learning is learning about learning. So, in some sense, I think we have developed, over time, a good set of kind of base models or base algorithms. And now we are starting to move up the hierarchy and kind of combine classes and families of models into meta-models, and that kind of incorporates all that is going on underneath them. So, in some sense I agree with the Forbes article in the sense that we need to move one level up the hierarchy. Ummm… But there is a lot more work to be done at the base.
Host: Let’s drill in a little bit on why automated machine learning is such a big deal.
Nicolo Fusi: Yes.
Host: Ok? And perhaps by way of comparison. So, you alluded earlier to the traditional machine learning workflow? Tell us what that looks like and what an automated machine learning workflow looks like, and why it’s different and why it matters.
Nicolo Fusi: In my mind, the traditional machine learning workflow, which is also the data science workflow – people use different names – you start with some question and you define what kind of data do I need to answer that question, what kind of metrics measure my success and what’s the closest numerically computable metric that I can pair and I can measure to see whether my model is doing well? And then there is a lot of data cleaning, and then, eventually, you start the modeling phase. And the modeling phase involves transforming features, changing different models, tuning different parameters. And every time you go down one path, you pick a way to transform your features, you pick one model, you pick a set of parameters and then you test it and then you go back. And you maybe try different hypotheses, you gather more data. You keep doing this loop, over and over again, and then, at the end, you basically produce one model that maybe you deploy, maybe you inspect to see whether the predictions are correct or fair or stuff like that. The goal of automated machine learning is to automate as much of this as possible. I don’t think we’ll ever be able to automate the “phrasing the question” or “deciding the metric,” because that’s what the human should be doing, really. But the goal is to kind of remove as much of the high-dimensional thinking with many options that are not always clear, that really kind of slows down the process for humans.
Host: I’ve heard it described as the drudge work of data science, the fine-tuning of the models and the parameters. Explain that a little bit more about how… is it basically a trial and error?
Nicolo Fusi: It is a lot of trial and error, because it’s a really high-dimensional space. So, depending on which value you set one hyper-parameter to, all the values for another hyper-parameter completely change meaning or change the scale at which they are relevant. And so, it becomes a really difficult problem because… I’ve done it, right? It’s extremely boring. Not a good use of time. If you do kind of parameter sweeps, which is a lot of what the industry is doing, you kind of waste a ton of computational resources, and you have no guarantee of finding anything good.
Host: (laughs) That’s sad…
Nicolo Fusi: So, it’s bleak.
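A minimal sketch of why the parameter sweeps mentioned above get expensive, assuming a hypothetical four-parameter grid. The parameter names and values are illustrative only and do not come from the episode; the point is just that the number of configurations is the product of the options per parameter.

```python
from itertools import product

# Hypothetical search space; names and values are illustrative only.
GRID = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "max_depth": [2, 4, 8, 16],
    "n_estimators": [50, 100, 200, 400],
    "subsample": [0.5, 0.75, 1.0],
}


def sweep_size(space):
    """Number of configurations a full grid sweep would have to train."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


def iter_configs(space):
    """Yield every configuration in the grid as a dict."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


print(sweep_size(GRID))  # 4 * 4 * 4 * 3 = 192 training runs
```

Adding one more hyper-parameter with five candidate values would multiply the count by five again, which is why exhaustive sweeps stop being practical long before the "thousands" of hyper-parameters discussed later in the conversation.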
Host: Yeah. Uh, yeah… So, how are you tackling that?
Nicolo Fusi: So there are different techniques. And AutoML is a very exciting research area. There have been, like, AutoML competitions, AutoML workshops, symposiums at NIPS. It’s really very exciting. The broad idea is, let’s try to use a model to kind of figure out what other models are doing. So in that sense it’s kind of meta-learning, because you’re trying to predict how different models react when you change their parameters. And you use that model to guide you through a series of experiments.
Host: Alright, so you would have to have a base of model experiments for this uber model to judge, right?
Nicolo Fusi: Exactly, to learn from, yes.
Host: To learn from. Better word.
Nicolo Fusi: Yes.
Host: According to legend – I use that term loosely – you were using machine learning and getting mired in the drudgework of data science and, basically, thinking there’s got to be an app for this, or something.
Nicolo Fusi: Yeah.
Host: And you set out to fix the problem for yourself. It wasn’t like, “I’m going to go create this AutoML thing for the world.” It’s like, I got to solve my own problem. And you kind of did it covertly. Tell us that story.
Nicolo Fusi: Yeah, so I was working on CRISPR gene editing.
Nicolo Fusi: It was a joint collaboration between Microsoft Research and the Broad Institute. At the Broad Institute, the lead investigator was John Doench, and Microsoft Research was Jennifer Listgarten, who’s now at Berkeley, and me. And basically, we got this data. We figured out the question. We did all the deciding which metric we want to optimize, and then we spent six months, maybe, of our own time, almost full-time, trying different ways to slice and dice the model space, the parameter space. It was just exhausting.
Host: What question were you trying to answer?
Nicolo Fusi: In this work, we were trying to investigate and build a predictive model of the off-target activity in CRISPR-Cas9. So, CRISPR is a gene editing system. It allows you to mute a gene that you don’t want to be expressed, for instance. This gene is causing a disease. I want to shut it down. So, you can do that. In previous work, we basically figured out, again, a machine-learning model to predict, given the many ways you can edit the gene, because you can do it in different ways, what’s the most successful edit? And in this follow-up work, we were investigating the issue of, given that I want to perform this edit, what’s the likelihood that I mess up something else in the genome that I didn’t want to touch? You can imagine if I want to remove a gene, or silence a gene that was causing a disease, I don’t want to suddenly give you a different disease because an unintended edit happened somewhere else.
Nicolo Fusi: And so, we cast it, again, as a machine-learning problem. But we had multiple models interacting, and tuning each model separately was a complete nightmare. And during that process, I decided, surely somebody must have thought about something. And you know, they had. But the problem is that a lot of the state-of-the-art was working only for tuning a few hyper-parameters at a time. What we were trying to do was really tune thousands.
Nicolo Fusi: And so, it would’ve taken ages. And so, I kind of had an idea, while working on CRISPR, so it was kind of like a side project. It kind of worked, and I was suspicious because it was a weird approach that was not supposed to work. It was really a hack. And so, I kind of kept it quiet. I kind of used it to inform my own experiments but I didn’t…
Nicolo Fusi: … advertise it.
Host: Ok. So, the best and brightest minds all over the world are trying to tackle this problem. Like you say, there’s other companies working on it. They’ve got symposiums, they’ve got competitions. And here you are, in your lab, working on a CRISPR-Cas9 gene problem, and you come up with this. Tell us what it is.
Nicolo Fusi: So, it’s something that’s actually used already by, you know, Netflix, Amazon; we probably use it somewhere in the company to recommend things.
Nicolo Fusi: So, the idea was ultimately deciding which algorithm, which set of hyper-parameters to use for a given problem, you’re kind of trying to recommend a series of things and then I evaluate them and I tell you how well they work. And then you kind of update your beliefs about what’s going to work and what’s not going to work. And that is similar to movies, right? You watch a movie, you rate it and then they learn more about you, your tastes and so on. The good news for us is that we don’t actually rely on the human watching the movie. We can force the execution of a machine learning pipeline. We can just tell, execute this thing. And they perform, let’s say, well or not so well depending on which data set they’re exposed to. So, you can now gather a corpus of experiments that help you guide the selection of what to do in the new data set. So that’s the meta-learning aspect. You have this meta-model that knows, and can predict, how individual base models will perform when shown some given data.
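The recommender analogy above can be sketched in a few lines. This is not the method used in the research or in the product, which the conversation describes only by analogy; it is a nearest-neighbour stand-in, assuming a hypothetical table of past pipeline scores. The pipeline names, dataset names, scores and the kernel bandwidth are all made up for illustration.

```python
import math

# Hypothetical history: score of each pipeline on previously seen datasets.
# Columns follow the order of PIPELINES. All numbers are made up.
PIPELINES = ["logreg", "random_forest", "gbm", "knn"]
HISTORY = {
    "dataset_a": [0.70, 0.85, 0.88, 0.60],
    "dataset_b": [0.72, 0.83, 0.90, 0.62],
    "dataset_c": [0.91, 0.75, 0.78, 0.89],
}
BANDWIDTH = 0.05  # assumed kernel width; smaller means "trust only close matches"


def similarity(observed, past_scores):
    """Gaussian similarity computed only on pipelines already run on the new dataset."""
    dist = math.sqrt(sum((s - past_scores[i]) ** 2 for i, s in observed.items()))
    return math.exp(-((dist / BANDWIDTH) ** 2))


def predict_scores(observed):
    """Predict a score for every pipeline on the new dataset.

    `observed` maps pipeline index -> score actually measured on the new dataset.
    """
    weights = {name: similarity(observed, scores) for name, scores in HISTORY.items()}
    total = sum(weights.values())
    predictions = []
    for i in range(len(PIPELINES)):
        if i in observed:
            predictions.append(observed[i])
        else:
            predictions.append(sum(weights[n] * HISTORY[n][i] for n in HISTORY) / total)
    return predictions


def recommend_next(observed):
    """Return the untried pipeline with the highest predicted score."""
    predictions = predict_scores(observed)
    untried = [i for i in range(len(PIPELINES)) if i not in observed]
    return PIPELINES[max(untried, key=lambda i: predictions[i])]


# One pipeline has been run on the new dataset; it scored much like dataset_c did.
print(recommend_next({0: 0.90}))  # knn
```

Each newly executed pipeline would be added to `observed`, which changes the similarities and therefore the next recommendation. That loop is the "update your beliefs" step; unlike a movie service, the system can run any pipeline it wants rated.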
Host: Okay. So, implementing this…
Nicolo Fusi: Yeah… So, because I was the first user, I didn’t want just something that you could, you know, academically show, “Oh, you know, we beat random,” because random is a strong baseline for AutoML, surprisingly. Like picking a model at random. I wanted to encapsulate this work into something I could use, into a library.
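For reference, the "random" baseline mentioned above is simple to write down. A minimal sketch, assuming a hypothetical search space and a made-up scoring function that stands in for training and validating a pipeline; none of the names or numbers come from the episode.

```python
import math
import random

# Hypothetical search space; names and values are illustrative only.
SPACE = {
    "model": ["logreg", "random_forest", "gbm"],
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "max_depth": [2, 4, 8, 16],
}


def toy_score(config):
    """Made-up objective standing in for 'train and validate this pipeline'."""
    bonus = {"logreg": 0.0, "random_forest": 0.05, "gbm": 0.1}[config["model"]]
    lr_penalty = 0.1 * abs(math.log10(config["learning_rate"]) + 1)
    depth_penalty = 0.02 * abs(config["max_depth"] - 8)
    return 0.8 + bonus - lr_penalty - depth_penalty


def random_search(evaluate, space, n_trials, seed=0):
    """Sample configurations uniformly at random and keep the best one seen."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(toy_score, SPACE, n_trials=20)
print(best_config, round(best_score, 3))
```

Any guided search has to beat this loop at the same budget of trials to justify its extra machinery.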
Nicolo Fusi: So, we worked on the first version of a toolkit you could just deploy in your data. And we started using it for our own stuff. And then we were working on this, and in the summer, at Microsoft, you have the hackathon, which is this huge initiative, everybody kind of takes part. And a lot of teams were looking for data scientists. And it was crazy, because on the machine learning mailing list was, “Oh I’m working on, you know, this accessibility problem. Is there anybody who knows machine learning who can help me out with this data analysis question?” And so, we thought, ok, so if nobody’s responding to that email, maybe we should just blast out an email to the entire mailing list saying, “We have fifty spots. So, we can give you an API key that we designed in a bad way. And if you want to use something that kind of finds a model automatically, you phrase the question and we kind of search your model space for you and give you a pipeline you can use at the end. Just give you a Python object. You can ask for predictions.”
Host: OK. What was the response?
Nicolo Fusi: The response was crazy. So, we were a research team of two people, specifically, so, we didn’t have a lot of systems “oomph” behind us. So, it was a single machine serving this meta-brain model that kind of figures out what to do. And by our calculations, we could only accommodate fifty teams. So, we got a hundred and fifty requests within, you know, the first week.
Nicolo Fusi: And so, we stretched our resources a bit thin to let people use it.
Host: So, you did accommodate the hundred and fifty?
Nicolo Fusi: We did accommodate the hundred and fifty in the end.
Nicolo Fusi: It was a lot of CPU time…
Nicolo Fusi: And a lot of like, last minute, “Oh, the service is down. Can you reboot?” Something that we had never experienced.
Host: Well, ok, so this is the hackathon. And it sounds to me like the scenario you’re painting is something that would help validate your research and actually help the people that are doing projects within the hackathon itself.
Nicolo Fusi: Yes. It was kind of a win/win, because we saw some real-world usage of our tool. And it was very limited in the beginning. We could only do classification, not regression problems, just because we started with that. And people came and they started saying “Oh, could you add this base learner or this processing method?” and it was very useful to us.
Host: What did you say, “No I can’t? Not yet?”
Nicolo Fusi: You know, our answer was always, “Oh, it’s just, you know, it’s just us, but one day….”
Host: It’s just research.
Nicolo Fusi: It’s just research, yes.
Host: Well, let’s go there for a minute. There’s been a big announcement at Ignite.
Nicolo Fusi: Very exciting.
Host: Can you talk about that?
Nicolo Fusi: So, yes, it’s a very exciting announcement. It took a ton of work from a lot of very smart people. So, it’s a joint collaboration at this point between MSR, you know we did the original kind of proof-of-concept, and a huge amount of work went in from people within Azure. So, it’s being released as kind of like an Automated ML library, or SDK, that you can use on your data. And it’s in public preview.
Host: That’s super exciting.
Nicolo Fusi: You know, we have a very good collaboration, and a very good ability to now transfer what are technically complex ideas. You know this technology transfer was not like a small, simple model that you could just write quickly. And we had to think about, is the probability distribution calibrated? Are the choices that we make based on that information correct in most cases for most datasets? What’s the cost on runtime on our servers to satisfy the demand that this thing will likely have? So, it was a lot of engineering work. And I think we figured a lot of that out and so, we were able to transfer a lot of research into product much more quickly.
Host: Well, who’s the customer for this, right now?
Nicolo Fusi: Uhhh, we struggled a little bit, because in the early validation phase, let’s call it, the research prototype, “let’s give it away” phase, we got a lot of different kind of people approaching us. So, data scientists, they are interested in it because they want to save time. They don’t care, in a lot of cases, what the final model is, they just want a good model, and they don’t want to spend ages just running parameter sweeps. So, data scientists are one set of individuals who could be interested. Developers. Sometimes developers now are tasked with including intelligence in their applications. And if you don’t know what to do, this kind of solution gives you a good model in a short amount of time, relative to the size of your data so you can just use it. And then there is a lot of kind of business analysts, buyers from companies. They have to make data-driven decisions and they would benefit from good predictions and this tool would give them good predictions.
Host: So, going back to your comment about data scientists being a customer, why wouldn’t a data scientist be a little bit worried that this AutoML might be taking over their job?
Nicolo Fusi: Yeah, I get asked that question a lot. I created it for myself, not intending to replace myself, but just kind of as a tool for me to use. The metaphor I use for it is, it’s kind of like using a word editor. It doesn’t replace the role…
Host: Of a writer…
Nicolo Fusi: …of a writer. It just makes the writer that much more effective, because you don’t have to cancel, with a pencil, your old text and just rewrite it from scratch. You can just erase one letter, for instance. And that’s what AutoML does. Like, if you change, let’s say, the way you featurize your data, you don’t have to start from scratch with the tuning, you just set an AutoML run going and you just move on.
Host: Would it democratize it to the point where non-data scientists could become data scientists?
Nicolo Fusi: I think it’s a possibility. Because in some sense you observe this meta-model kind of reasoning about your data, and you see the thinking process, if you can call it thinking. You see which models it starts out with. And then you see how it evolves. And so, you can actually learn things about your data by observing the process, observing how different metrics are, you know, sometimes you want to maximize accuracy. But maybe looking at something like an area under the ROC curve is informative because you can now see how the probabilities are changing. So, I think you can learn from AutoML, and I think it can become kind of like some training wheels if you are starting out.
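A small sketch of the point about accuracy versus area under the ROC curve, using made-up labels and predicted probabilities for two hypothetical models. Both models make the same thresholded decisions, so accuracy cannot tell them apart, while the AUC shows that one ranks positives above negatives far more reliably.

```python
def accuracy(labels, probs, threshold=0.5):
    """Fraction of correct decisions after thresholding the probabilities."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, probs):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: six examples, two hypothetical models.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.90, 0.80, 0.40, 0.60, 0.20, 0.10]
model_b = [0.55, 0.52, 0.45, 0.58, 0.48, 0.40]

for name, probs in [("model_a", model_a), ("model_b", model_b)]:
    print(name, round(accuracy(labels, probs), 3), round(roc_auc(labels, probs), 3))
# model_a 0.667 0.889
# model_b 0.667 0.556
```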
Host: Yeah, yeah. Let’s go back to CRISPR for a minute, since this all started with how to make it easier to decide where to edit a gene.
Nicolo Fusi: Yeah.
Host: You made a website called CRISPR.ML that provides bioscientists with free tools they can use to make CRISPR gene edits?
Nicolo Fusi: Basically, yes.
Host: Tell us about the site. Why did you start it and how’s it going?
Nicolo Fusi: That’s a great question. So, it’s CRISPR.ML, that’s the URL. We had to really, really resist the crazy hype to not call it CRISPR.AI, which was available. And we just said no. It’s ML. It’s not AI. We decided not to feed the hype around AI.
Host: That was self-restraint on steroids.
Nicolo Fusi: I know, I know. It took a lot out of us. Um… So, we had done work on the on-target problem, which was how do you find the optimal, for some notion of optimality, how do you find the best edit to perform to make sure you disable a gene you want to disable? And that was giving you predictions. It was a tool that you could use in Python. You could just download it from GitHub. We just give it away, liberally licensed and so on. And a lot of startups incorporated that in their tools, selling it. A lot of institutions were using it. The off-target stuff, which was, how do you make sure that you don’t have unintended edits somewhere else in the genome? That was much more computationally intensive to run as something you could download. So, if you were interested in running it for your own problem of interest, you would have to wait a long time. And so, we decided, why don’t we just pre-compute everything for the human genome which took an exorbitant amount of CPU time on Azure, but we could just pre-populate a giant database table and then search it almost instantly. And that’s what the website does. You can put in the gene you want to edit, and you get a list of possible guides with a score that tells you how likely the edit is to be successful and how likely each off-target is…
Host: To happen?
Nicolo Fusi: …to happen. And a global score that tells you, broadly speaking, “Bad, bad, bad off-targets here. Don’t touch it.”
Host: Don’t do it.
Nicolo Fusi: Yeah.
Host: So, it both tells you what’s a good place to go and tells you, avoid these places because lots of bad stuff could happen?
Nicolo Fusi: In some sense. It tells you which place to edit and if you choose to edit this, these other spots on the genome might be edited. And maybe you don’t care. So maybe your experiment is very narrowly focused on a given gene. So maybe you don’t care, but in therapeutic applications, you want zero off-targets, pretty much, all the time.
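The pre-compute-then-search design described for CRISPR.ML can be sketched with an in-memory SQLite table. This is only an illustration of the pattern, not the site's implementation: the table layout, the placeholder gene and guide names, and the `slow_score` function (a stand-in that returns made-up numbers instead of running a real model) are all assumptions.

```python
import sqlite3
import zlib


def slow_score(gene, guide):
    """Placeholder for an expensive model call; returns a made-up score in [0, 1)."""
    return (zlib.crc32(f"{gene}:{guide}".encode()) % 1000) / 1000


def build_table(conn, candidates):
    """Pay the scoring cost once, up front, and store every result."""
    conn.execute("CREATE TABLE guides (gene TEXT, guide TEXT, score REAL)")
    conn.execute("CREATE INDEX idx_gene ON guides (gene)")
    rows = [
        (gene, guide, slow_score(gene, guide))
        for gene, guides in candidates.items()
        for guide in guides
    ]
    conn.executemany("INSERT INTO guides VALUES (?, ?, ?)", rows)
    conn.commit()


def lookup(conn, gene):
    """Answer a query from the stored table without calling the model again."""
    cursor = conn.execute(
        "SELECT guide, score FROM guides WHERE gene = ? ORDER BY score DESC",
        (gene,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
build_table(conn, {"GENE_X": ["guide_1", "guide_2", "guide_3"], "GENE_Y": ["guide_4"]})
results = lookup(conn, "GENE_X")
print(results)
```

The trade-off is the one described in the conversation: a large one-time compute bill in exchange for queries that are a simple indexed lookup.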
Host: Yeah. That leads me into a question that I ask all of the guests on the podcast. Is there anything that keeps you up at night about what you’re doing?
Nicolo Fusi: Uh… yeah, gene editing, AutoML, and AI… what can go wrong?
Nicolo Fusi: No, I’m not easily kept awake at night. I can sleep anywhere. But you know, there are things that concern me. And I’ve tried to move my work towards addressing them as a first priority. Maybe the main thing on my mind right now is, I know that AutoML, or what we call AutoML, is a very good way to predict. So, it’s a very strong, supervised machine learning method. And it can be applied to all kinds of data. And I want to make sure that, as we build a capability to generate better and better predictors, we are also thinking of ways to make sure that the predictions are well-explained, that the biases are auditable and visible to the person who’s deploying these systems. So, we are spending a lot of time now thinking how fairness and all these themes that are mentioned a lot in this podcast are addressed.
Nicolo Fusi: Because it’s a very powerful tool and if you apply it in the wrong way, you’re going to have exorbitant amounts of bias.
Host: Tell us a bit about yourself, Nicolo. What’s your background? How did you get interested in what you’re doing, and how did you end up at Microsoft Research?
Nicolo Fusi: It’s a good story. So I… I will not start from when I was a baby. I will start directly from university. I did my university in Milan close to home… well, at home, basically. And then I attended some advanced courses in statistics and I thought it was fascinating. But I was always kind of like more of a computer science person. And so, I figured computer science plus statistics? At the time, the answer was machine learning. I think to this day, probably, there is a lot…
Host: Same answer.
Nicolo Fusi: Same answer for most people. And so, I decided to kind of do a summer internship somewhere. Anybody who would take me. You know, these kind of “hashtag rejection” stories? I must have sent thirty or forty emails to everybody in Europe to say can you please, like, I will come for free, like, for a summer. It was kind of like my summer vacation that year. And I think the response I got was from Neil Lawrence, who’s now also a podcast host as part of other things. Talking Machines. And he wrote this email in broken Italian because he speaks a little bit of Italian, but he can speak it, but he cannot write it. If he hears this I hope he’s OK with that. And he says sure, come over. And I went there for a summer. The goal was to kind of build up my CV to do grad school somewhere. So, I wanted to do some research, you know, get a feeling. And after that summer, I basically decided no, I’m staying here. I want to do a PhD right here. I’m coming back in four months. And I basically kind of closed up all my stuff. Like finished my exams at home. Just like my thesis and just packed up and went to the UK. So, Neil had a choice of projects. And because I wanted to have an impact on health, I kind of chose the molecular biology-inspired projects and I started working on that. And it was one of the best, you know, three years of my life. It was a lot of fun. I learned a lot.
Host: Yeah. And you got your PhD…
Nicolo Fusi: I got my PhD… Well, in the last year, I traveled a lot during my PhD, because I spent some time at Max Planck Institute in Tübingen. I spent some time at UCLA at the Institute for Pure and Applied Mathematics. There was a program where the idea, I think, to represent it correctly, was to kind of combine some mathematically-minded people and some biology/medicine-minded people to see what kind of collaborations arise. It was an incredible program for like, I think three or four months, during which I met this group of Microsoftees in LA who were doing statistical genomics. And that included my long-term collaborator Jennifer Listgarten. And I started an internship there the year after. And then at the end of the internship they said, “Hey, do you want to join?” So again, once more I went back. I packed everything up, and I said sure! And I joined Microsoft Research in LA, which was this remote site. It was five of us, all kind of working in health.
Host: So, there’s not actually a lab in LA?
Nicolo Fusi: It’s not a lab. It was a rented office that used to be a steakhouse on the UCLA campus. Very, very unofficial. If you’ve ever been to a Microsoft building, you see all these you know machines that include beverages, like sodas and so on. We had to have a standing order from supermarkets, delivering us the sodas.
Host: Because you didn’t have a place to put your machine?
Nicolo Fusi: We didn’t really have facilities. And it was just us. And our network was ethernet cables running everywhere.
Host: That’s hilarious, did you at least have some signage?
Nicolo Fusi: I don’t think so. It was probably, you know those plastic signs that you can kind of get at any office store?
Host: Yeah, yeah.
Nicolo Fusi: We had those. But we didn’t have a sign that said Microsoft. But we were in the address book so sometimes Microsoft sales people would come to our office intending to access the corporate network, but they didn’t understand that we were not on the corporate network even.
Host: Oh, you were that off-target.
Nicolo Fusi: Off, off the grid.
Host: Listen, so then how did you come to be at Cambridge? Did you go straight from LA to the Cambridge lab?
Nicolo Fusi: Yes, so… I think after a couple of years in LA, different people moved in different places, and Jen Listgarten came to Boston and you know, I heard this Boston lab is incredible. And so, at NIPS I met with Jennifer Chayes, who was the lab director, and I was like oh I need to visit there and check it out and I joined, I think, three years ago now.
Host: So, you moved up all your stuff again?
Nicolo Fusi: I moved up all my stuff again. I said I love it here. I’m just moving.
Host: As we close, I like to ask all my guests to look to the future a little. And it’s not like predictions of the future, but more just sort of as you look at the landscape, what are the exciting challenges, hard problems, that are still out there for young researchers who might be, yourself, a few years ago, trying to decide where do I want to land? Where do I want to pack all my stuff up and move to?
Nicolo Fusi: Ah! That’s a great question, and one I spend a lot of time thinking about because we have a lot of interns. The quality of the students right now is exceptional. So even maybe a first-year PhD student has an incredible amount of experience, very often. And they’re asking you where should I direct my career? And it’s hard to give advice. But I think the area where I expect the most improvement, and interesting work to be done, is probably the area of making decisions, given predictions. So, I think a lot of machine learning is focused, correctly so, on giving good predictions. We have now kind of topped out performance on a lot of tasks that were considered very hard. Image recognition… in 2012 I think it was… or 2013 it was a really hard task. And now we are kind of like, we can achieve great performance. Get top one, top five percent performance. But I think the gap now is, ok, we got good predictions, how do we make decisions with those predictions? And I think with that, you need to have a notion of uncertainty, and well-calibrated uncertainty. So, you need to be certain the correct percentage of the time, and then uncertain the rest. And, you know, self-driving cars and all these things will need the notion of raising the red flag and saying, “I don’t know what’s going on. I need to not make a decision right now. Please intervene.” You need a notion of a low confidence prediction, “Please do something else. Don’t use me for your decision.” And beyond. But in general, there is this notion that you need uncertainty. And we are in a decent spot for quantifying uncertainty, but there is a lot more work that needs to be done to have safe, robust, machine learning systems.
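One way to make "certain the correct percentage of the time" concrete is a reliability check: group predictions by stated confidence and compare each group's average confidence with how often the event actually happened. The sketch below assumes binary labels and made-up probabilities. The bin count and the `low`/`high` thresholds in `decide` are arbitrary illustrative choices, not values from the episode.

```python
def calibration_table(probs, labels, n_bins=5):
    """Per confidence bin: (mean predicted probability, observed frequency, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))
    table = []
    for members in bins:
        if members:
            mean_conf = sum(p for p, _ in members) / len(members)
            frequency = sum(y for _, y in members) / len(members)
            table.append((mean_conf, frequency, len(members)))
    return table


def calibration_error(probs, labels, n_bins=5):
    """Count-weighted average gap between stated confidence and observed frequency."""
    total = len(probs)
    return sum(
        n / total * abs(conf - freq)
        for conf, freq, n in calibration_table(probs, labels, n_bins)
    )


def decide(prob, low=0.2, high=0.8):
    """Act only on confident predictions; otherwise raise the red flag."""
    if prob >= high:
        return "act"
    if prob <= low:
        return "do_not_act"
    return "ask_human"


# Made-up example: a model that says "90%" ten times.
probs = [0.9] * 10
well_calibrated = [1] * 9 + [0]      # right 9 times out of 10
overconfident = [1] * 5 + [0] * 5    # right only 5 times out of 10

print(round(calibration_error(probs, well_calibrated), 3))  # 0.0
print(round(calibration_error(probs, overconfident), 3))    # 0.4
print(decide(0.95), decide(0.5), decide(0.05))              # act ask_human do_not_act
```

An abstain rule like `decide` is only as trustworthy as the probabilities fed into it, which is why calibration comes before decision-making in the argument above.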
Host: So that’s a fruitful line of inquiry for somebody who’s interested in this?
Nicolo Fusi: Yes. I think, you know, as a student you need to kind of imagine like skeet shooting. You need to shoot ahead of your target. If you’re entering the game now, and you’re trying to just maximize predictive accuracy, where predictive accuracy is basically like a root mean square. Minimizing a root mean square and maximizing accuracy. I think it’s a bad time to do machine learning if that’s your objective. But I think, thinking more end-to-end, “what is the end goal of this machine learning system” is going to be a much more interesting area in the future.
Host: Nicolo Fusi, thank you so much for joining us today on the podcast. I’m enlightened, and it was so much fun.
Nicolo Fusi: Thanks for having me. It was fun.
To learn more about Dr. Nicolo Fusi and the latest research in Automated Machine Learning, visit Microsoft.com/research.