Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling

Episode 129 | July 20, 2021

Unlocking the challenge of molecular simulation has the potential to yield significant breakthroughs in how we tackle such societal issues as climate change, drug discovery, and the treatment of disease, and Microsoft is ramping up its efforts in the space.

In this episode, Chris Bishop, Lab Director of Microsoft Research Cambridge, welcomes renowned machine learning researcher Max Welling to the Microsoft Research team as head of the new Amsterdam lab. Connecting over their shared physics background and vision for molecular simulation, Bishop and Welling explore several fascinating topics, including a future in which machine learning and quantum computing will be used in tandem to model molecules, the power of machine learning to provide “on demand” data in this space, and goals for the first year and beyond at the Amsterdam lab.

Transcript

MAX WELLING (TEASER): It’s kind of strange. Molecules are basically everything around us, except for light and a few other forces that we can’t really see, but everything else is made of molecules. And yet we don’t really understand them. We can’t really predict their properties. So, if we start to understand molecules better, then a number of applications come within reach. We can start to design better catalysts, for instance, to help the hydrogen economy. If you want to split water into hydrogen and oxygen, it actually costs a lot of energy. If you find a catalyst to reduce the amount of energy that you need to use, you will make that process a lot more efficient and boost the possibility of using water as a battery in some sense, right? You can store energy in hydrogen. In general, we can make better materials.

CHRIS BISHOP: Welcome to the Microsoft Research Podcast. My name is Chris Bishop, and I’m the lab director of Microsoft Research in Cambridge, UK. One of the most exciting areas of machine learning right now is its application to the field of molecular simulation. Today, I’m joined by Max Welling, one of the world’s leading researchers in machine learning and someone for whom molecular simulation has become something of a passion. I’m delighted to announce that Max will be joining Microsoft Research in September and that we’ll be opening a new lab in Amsterdam, which Max will be leading. Max, a very warm welcome to the podcast.

MAX WELLING: Hi, Chris. Thank you very much for inviting me. I’m very excited to chat with you today.

BISHOP: It’s great to have you. We’ll talk a lot more about molecular simulation in a moment, but first, Max, can you tell us a bit about your background?

WELLING: Yes, absolutely. So I was educated in physics, uh, like yourself, actually, and as I understand, we have somewhat similar backgrounds. So I did my bachelor’s, my master’s, and my PhD in physics. I did my PhD actually in two-dimensional quantum gravity. And then I decided I wanted to move to something with more sort of societal impact. In the beginning, I was working in computer vision at Caltech with Pietro Perona, and then I migrated slowly to, uh, machine learning, working with Geoff Hinton at UCL in London, and then, uh, when he moved to the University of Toronto, I moved with him. And after that, um, I became a professor at the University of California, Irvine, and then, uh, in 2013, I moved to the University of Amsterdam to become a research chair there. So, uh, I’m very excited, actually, to join Microsoft in, uh, September.

BISHOP: Yeah, we’re really excited to have you join us. I thought it might be fun just to share the story of how you and I came to be partnering in this, because, um, it was clear to me that all of your very impressive research is directly relevant to the problem of molecular simulation, and that’s an area that we’ve been very interested in for a while in Microsoft Research in Cambridge. And, uh, so I guess it was a few months ago now, wasn’t it, when I called you up, and I seem to recall there was just an amazing meeting of minds. We suddenly realized we were both thinking along exactly the same lines, and it was quite an energizing call.

WELLING: No, absolutely. It was strange, in a way, that I was already working on pivoting my research in that direction, and, uh, and seeing the opportunities there, and then getting that call right at that moment, um, it was indeed a meeting of the minds, and it was immediately clear that I—I wanted to do this, also, because, for me, it was very important that, um, I wanted to spend sort of the second half, let’s say, of my career on climate change. I see this as a—as a major challenge for the world, and I also see a huge opportunity here to actually make a dent in that problem through computational chemistry, and of course, you know, in order to achieve anything, you need colleagues who are smart and are willing to engage with you on this particular journey, but also, a lot of compute, and so, I know Microsoft has an amazing amount of, uh, compute infrastructure that we can leverage, uh, to try to solve this problem.

BISHOP: Absolutely. I mean, there are so many exciting application areas for this, but you’re right. Climate change is such a pressing need. I recently read Bill Gates’ book How to Avoid a Climate Disaster. I found it very inspirational. He doesn’t pull any punches. I mean, he highlights very clearly the depth of the challenge and the extent of the challenge but also, uh, talks very systematically through all the different opportunities we have to use technology to help us address this, and in many cases, there are clear opportunities for molecular simulation to play a role, whether it’s sort of developing catalysts or new electrodes for, uh, driving the hydrogen economy and, uh, so on.

WELLING: I read that book, too, actually, before you called me, I think, and, uh, I also found it very inspirational. I really liked the way he approaches this in a super practical, economical way, um, and it actually gave me hope that this thing is solvable. So, in fact, I do think that we can solve it together.

BISHOP: Yeah, and I guess we should say a little bit about what molecular simulation is and why we think it’s so exciting right now and what—what it’s got to do with machine learning because this is, uh, possibly a moment when machine learning is about to disrupt the field of molecular simulation in much the same way that it’s already transformed fields like, uh, computer vision or speech recognition or natural language understanding. And for me, the real excitement here is the fact that there is, uh, almost a new frontier opening up of intellectually very deep research that combines machine learning with, uh, quantum physics and chemistry and molecular biology and so on but also has this tremendous potential to have very important real-world impact, not just in—in climate change, but also in domains like healthcare and drug discovery and medicine and understanding biology and understanding disease and helping us to treat disease.

WELLING: Yeah, there’s a lot of things coming together. That’s the beauty of this, uh, research. And it’s kind of strange. Molecules are basically everything around us, except for light and a few other forces that we can’t really see, but everything else is made of molecules. And yet we don’t really understand them. We can’t really predict their properties. So, if we start to understand molecules better, then a number of applications come within reach. We can start to design better catalysts, for instance, to help the hydrogen economy. If you want to split water into hydrogen and oxygen, it actually costs a lot of energy. If you find a catalyst to reduce the amount of energy that you need to use, you will make that process a lot more efficient and boost the possibility of using water as a battery in some sense, right? You can store energy in hydrogen. In general, we can make better materials. We can maybe make plastic which doesn’t, uh, sort of pollute the environment all that much. And, maybe you can tell me something about this, Chris, we can, uh, you know, design new drugs. I understand that at Microsoft Research, there is already quite a big effort in that direction.

BISHOP: That’s right. We’ve been very interested in drug design, collaborating with some major pharma companies and, uh, looking at how machine learning can disrupt that process of drug discovery because that space of potential molecules is so vast, and surely, within that space are some really interesting and powerful drugs that don’t have side effects and are not toxic and so on, but discovering them is tremendously challenging, and we really see machine learning as having a big impact on that. A lot of the work we’ve been doing has been using machine learning driven by, uh, experimental data, but we’re very excited to augment that and amplify that by using molecular simulation, where we’re creating data more from first-principle simulation of the—the quantum physics of molecules of proteins folding and interacting with other proteins and so on. And so, we see this as an area with tremendous potential, and not just, uh, drug discovery, but just more broadly in the life sciences. It’s an area of great interest to us. Uh, for example, we have a—a collaboration with Adaptive Biotech that, uh, recently we pivoted to look at, uh, COVID-19, and, uh, in fact, just a—a few months ago, the so-called T-Detect COVID, developed by Adaptive Biotech using some of our, uh, machine learning technology, was granted, uh, emergency FDA approval and it’s become the world’s first test for past COVID infections based not on antibodies but on T cells, which are a—a key part of the adaptive immune system and which actually may be much longer lived than antibodies. So, again, that’s a lovely example of very interesting research, but which also leads to very important real-world application.

It is such an exciting field, isn’t it? It also feels that, uh, it’s almost like an ideal application of machine learning. I mean, molecular modeling’s been around for a—a good few years now, and it’s this—it offers this amazing computational microscope where you can see deep inside the—the workings of living organisms, for example. But it’s just so mind-bogglingly expensive from a computational point of view. I think one of the things that we’re all very excited about is the idea that machine learning could really speed this up by many orders of magnitude, and it just feels like a great application for machine learning because the training data, uh, in large part, could even be synthetic. We can simulate the—the systems using more conventional, uh, techniques of solving the equations of, uh, quantum physics, generate synthetic data to train, uh, machine learning, and then use those trained models to run other simulations but very much more efficiently, and, uh, of course, in machine learning, we’re—we’re very dependent on data. We like lots of data. Uh, but it can be expensive to obtain. It can be hard to find. It can be time-consuming to label, and yet, somehow, in this field, we can generate more data on demand, as much perfectly labeled data as we wish, just by spending more computation, as it were. So, it’s—it really feels like just a great application domain, but as you say, one which has so much potential for real-world impact.
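[Editor's illustration] The synthetic-data idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used by either speaker: the Lennard-Jones pair energy stands in for an expensive quantum-mechanical calculation, and the names `reference_energy`, `generate_dataset`, and `fit_surrogate` are hypothetical. The point is only that labels come from computation, so the dataset can be made as large as the compute budget allows, and a cheap fitted model can then replace the reference calculation.

```python
import random

def reference_energy(r, epsilon=1.0, sigma=1.0):
    """Stand-in for an expensive first-principles calculation (an assumption):
    the Lennard-Jones pair energy at separation r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)

def generate_dataset(n_samples, seed=0):
    """Create perfectly labeled data on demand by calling the reference method."""
    rng = random.Random(seed)
    distances = [rng.uniform(0.9, 2.5) for _ in range(n_samples)]
    return [(r, reference_energy(r)) for r in distances]

def fit_surrogate(dataset):
    """Least-squares fit of E(r) ~ w12 * r**-12 + w6 * r**-6.

    Solves the 2x2 normal equations with Cramer's rule and returns
    a cheap function that approximates the reference energy.
    """
    saa = sab = sbb = say = sby = 0.0
    for r, energy in dataset:
        a, b = r ** -12, r ** -6
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * energy
        sby += b * energy
    det = saa * sbb - sab * sab
    w12 = (say * sbb - sab * sby) / det
    w6 = (saa * sby - sab * say) / det
    return lambda r: w12 * r ** -12 + w6 * r ** -6

# Train on synthetic data, then query the cheap surrogate instead of the reference.
surrogate = fit_surrogate(generate_dataset(200))
print(surrogate(1.3), reference_energy(1.3))
```

In a realistic setting the surrogate would be a neural network and the reference method a quantum chemistry code; here the surrogate's functional form is chosen to match the toy reference, which is why the fit is essentially exact.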

So, you have this background in theoretical physics, in—in much the same way that I do, and I—I noticed that shines through in a lot of your research. I mean, you’ve done some pretty amazing things over the years, but—but one thing that I think you’re quite well-known for is work on invariances and equivariances in machine learning, and maybe you can just say a few words about what that means and why that might be relevant to this challenge of molecular simulation.

WELLING: Yeah, certainly. So, in physics, we think about symmetries. Basically, almost all of our physical theories are built around symmetries. We like to write down all our equations in such a way that they look the same whichever observer is used to describe that system and that has led to revolutions, that has led to the realization that electricity and magnetism are actually two sides of the same coin that are transformed into each other by changing the observer from one that is standing still to one that is flying by, and Einstein went, actually, one step further and said, “Well, actually, one observer, instead of experiencing, you know, acceleration, might explain that as gravity.” And so he equated, you know, acceleration and gravity together, which led to the general theory of relativity, and in fact, the whole Standard Model is built out of, you know, particles, which are organized according to the symmetry transformations. That particular principle, you know, in 2016, when I started to do research with Taco Cohen, actually, that principle, we wanted to also implement in neural network research, and, uh, convolutional neural networks already implement it to some degree. They basically have this idea that if you move an object from one place to the next place, which is a translation, the output of the neural network should either be invariant—you know, a cat is a cat, you know, if you see it on the left or the right of your image—or, you know, if—if you want to segment a cat out, the segmentation mask should move with the cat as you move the input. So, that’s called equivariance, and, uh, we were thinking about how to extend these groups, basically saying, “Well, if I rotate the object, you know, my prediction should still be invariant under those rotated objects.” A cat upside down is still a cat, after all, right? 
And that’s particularly important for molecular simulation because a molecule, if you rotate it, you would still think has the same properties as if you see it in some other orientation, and building that particular inductive bias (this particular prior knowledge) into your model is what we have recently been doing. We’ve built it into what’s called graph neural networks, um, where you can think of the atoms as the nodes in a graph and the pairs of atoms that interact with each other as edges, and these atoms are sending messages to each other, which is similar to doing a convolution. So, in that graph neural net, we made them sort of symmetric under these rotations, and then applied it to describe molecules, and that’s amazingly successful. So, the interesting thing is that you can train on datasets to predict the properties of these molecules, and the predictions are amazingly accurate, and, you know, we are now starting as a community to realize the enormous impact that can have in the future.
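[Editor's illustration] The distinction between invariance and equivariance described above can be checked numerically. The following is a minimal sketch, not the architecture from the research discussed: atoms are nodes, pairs within a cutoff distance are edges, and messages depend only on interatomic distances. The function names, the coordinates, the feature values, and the cutoff are all hypothetical. A scalar readout built this way is unchanged by a rotation (invariance), while per-atom vectors built from relative positions rotate together with the molecule (equivariance).

```python
import math

def rotate_z(points, angle):
    """Rotate a list of 3D points (or vectors) about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def invariant_readout(positions, features, cutoff=2.0, steps=2):
    """Toy message passing whose messages depend only on distances.

    Because distances do not change under rotation, the summed
    node features (a scalar, energy-like output) are rotation invariant.
    """
    h = list(features)
    n = len(positions)
    for _ in range(steps):
        updated = []
        for i in range(n):
            message = 0.0
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(positions[i], positions[j])
                if d < cutoff:  # an edge exists only between nearby atoms
                    message += h[j] * math.exp(-d)
            updated.append(math.tanh(h[i] + message))
        h = updated
    return sum(h)

def equivariant_vectors(positions):
    """Toy per-atom vectors (force-like) built from relative positions.

    Each vector is a distance-weighted sum of displacement vectors,
    so rotating the input rotates the output in the same way.
    """
    vectors = []
    for i, p in enumerate(positions):
        v = [0.0, 0.0, 0.0]
        for j, q in enumerate(positions):
            if i == j:
                continue
            d = math.dist(p, q)
            weight = math.exp(-d) / d
            for k in range(3):
                v[k] += weight * (p[k] - q[k])
        vectors.append(tuple(v))
    return vectors

# Hypothetical four-atom geometry and scalar node features.
atoms = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.1), (-0.24, 0.93, -0.2), (0.3, -0.5, 0.8)]
feats = [0.8, 0.1, 0.1, 0.6]
rotated = rotate_z(atoms, 1.234)

print(invariant_readout(atoms, feats), invariant_readout(rotated, feats))
print(rotate_z(equivariant_vectors(atoms), 1.234)[0], equivariant_vectors(rotated)[0])
```

Both printed pairs should agree up to floating-point rounding. Only rotation about one axis is exercised here; the same argument applies to any rotation, since the construction uses nothing but distances and displacement vectors.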

BISHOP: Yeah, symmetries are so powerful and fundamental in physics. It’s really interesting to see them play such a fundamental role in machine learning, as well. We’re seeing right now, a lot of interest in machine learning in this space and a lot of activities, some really interesting research projects spinning up, but if we look further out, uh, what about the field of quantum computing? What impact do you see quantum computing having in—in the domain of molecular simulation?

WELLING: Yeah, I think it’s a very interesting question because, uh, molecules are inherently quantum systems. Um, the electrons, in particular, are well-described by quantum mechanics, and, um, a quantum computer is, in some sense, a natural sort of quantum simulation. So, you know, Feynman, sort of, you know, first noticed that, uh, you can think of a quantum computer as doing some kind of quantum experiment, some kind of quantum simulation, and, uh, people think that the first actual applications of quantum computing are going to happen in this field of simulating quantum mechanics itself. Now, quantum computing is still in its infancy, and, um, you know, we really need fault-tolerant quantum computing for it to be truly useful, and that might be more than 10 years away, but hopefully, already, somewhat earlier (it’s hard to predict precisely how much earlier), we can use, uh, sort of more noisy quantum devices to start, uh, modeling molecules using quantum computing, and I think the most exciting part of this is that I believe that quantum computing and machine learning are going to work together in order to make this really useful. So you could imagine the machine learning algorithm doing a bit of computation, and then asking a quantum computer to do a particular computation, say, “Why don’t you use this, uh, you know, energy for me, or this Hamiltonian for me; do a computation, right; give the result back to me; and I’ll process that information to then try and figure out what the next thing is that you should compute.” And I think it’s in this interaction that we’re gonna see a lot of action in the future, but it might be still, you know, more than five years out.

[MUSIC BREAK]

BISHOP: So, really, we have two pieces of news today. As well as the news that you’ll be joining Microsoft Research, we’re also announcing the opening of a new research lab in Amsterdam, uh, which you’ll be leading, and we’re gonna be hiring extensively in molecular simulation, both in Cambridge and in Amsterdam over the coming years. Tell us a little bit about your early thoughts as to what the Amsterdam lab might be focusing on.

WELLING: So first, a bit personally, um, you know, I’m super grateful that Microsoft has decided to open a new lab in Amsterdam. I mean, this is somewhat of a dream come true for me. You know, I love Amsterdam. It’s a great place to live and a great place to set up an office, actually, I think, because there’s a lot of, uh, talent running around here. Yeah, so what are we going to do? Well, the first, uh, piece of business is, of course, to fill the lab with excellent researchers and to build a very diverse team in Amsterdam. That will be a significant effort. But also, I really want to start the actual science, you know, collaborating with the Cambridge teams and other teams around the globe to start thinking about precisely, you know, what we want to achieve over the next, uh, couple of years. I hope that we can make a start with, uh, building a system that can predict properties of molecules, can generate molecules with certain properties, and can search through this enormous space of these molecules. There are so many possible molecules, more than there are atoms in the universe, and you have to somehow search through that space if you’re looking for molecules with certain properties, and it’s only through machine learning that we can now begin to search fast through this space, and, um, I want to make a beginning with that sort of program to build that software stack that can achieve that.

BISHOP: That—that sounds really exciting. Actually, we’ve already made our first hire into the Amsterdam lab, of course. Rianne van den Berg. She spent about three months with us in Cambridge as a visiting researcher a—a few years ago while she was a postdoc with you at the University of Amsterdam. So, obviously, you know Rianne very well.

WELLING: Yeah, I know her very well, and I’m, uh, extremely impressed, uh, with her, uh, technical achievements. She was a physicist like—like ourselves, then she moved to work with us on, uh, interesting graph neural network problems. She’s made a major impact in the field, and of course, given the fact that she is both, uh, sort of a physicist and, uh, a sort of machine learner, she’s the perfect hire for the Amsterdam team to work with us on these types of problems.

BISHOP: That sounds great. As well as joining Microsoft Research, you’re also going to have a joint appointment with the University of Amsterdam. Uh, Microsoft Research has worked with you for several years, uh, in that capacity, funding students and postdocs, and of course, we’ve hired some great graduates from your lab over the years, and we’ve collaborated with your lab on various research projects. Can you tell us a bit more about the work of your university group?

WELLING: Yeah, so, I’m—I’m very grateful, actually, that Microsoft allows me to have that joint position. I—I personally believe it is very important, actually, to have this, uh, influx from academia and to have one leg in one and one leg in the other. The research lab is called AMLab. It’s about 50 researchers right now with five or so faculty. We’ve been working a lot on, uh, sort of deep learning problems, but also, graphical models, which was sort of the hype maybe 10 years ago. Uh, you know, you’ve written a book about it; we all teach from your book, actually, at—at the university. And, uh, connected to that, also, generative models, normalizing flows, VAEs, and also causality research. And I really hope that Microsoft can, uh, fruitfully collaborate with the University of Amsterdam, but also with the other knowledge institutes in the Amsterdam ecosystem, um, and in particular, I’m thinking about some of these ICAI labs, which is Innovation Center for AI, where we could possibly start one of these labs and work on the more academic problems with students in that lab and, hopefully, inspire them to join Microsoft at some point.

BISHOP: It’s a very impressive group. I mean, some amazing researchers have really come out of—out of your team in Amsterdam, and so it’s very exciting to have that close partnership. So, finally, Max, you’ll be joining us in September. If we look ahead, say, 10 years, what do you hope we will have accomplished?

WELLING: So, that’s an excellent question, and, uh, it’s a bit of a crystal ball to be looking into, but I certainly know my dreams. So, my dreams are that in 10 years, we will have cracked the problem of understanding molecules. We will be able to design new materials on the fly. We will have designed new catalysts to feed the green, uh, economy. Uh, we will be able to design new drugs for all sorts of diseases that we cannot treat right now. I hope, also, that by that time, we have grown and, uh, inspired a whole lot of people around the world to join us in this very important effort.

BISHOP: Well, I have no doubt at all, Max, that that combination of, uh, hiring you, as well as opening a new research lab in continental Europe and also defining this major new mission that I think really combines intellectually very deep science together with the opportunity for major social impact—I think that whole package will be very inspiring for many scientists and engineers who will want to come and join us in this exciting endeavor.

Thank you for your time, Max, and, uh, thanks also to our listeners. For those of you interested in joining Microsoft Research to work with us on molecular simulation, we’ve just made several job postings for both the Cambridge and the Amsterdam labs, and you can learn more about Microsoft Research and the great work coming out of its labs at microsoft.com/research and be sure to subscribe for new episodes of the Microsoft Research Podcast. Until next time.