Microsoft Research Lab – India

Podcast: Making cryptography accessible, efficient and scalable. With Dr. Divya Gupta and Dr. Rahul Sharma

September 8, 2020


Episode 005 | September 08, 2020

Ensuring security and privacy of data, both personal and institutional, is of paramount importance in today’s world where data itself is a highly precious commodity. Cryptography is a complex and specialized subject that not many people are familiar with, and developing and implementing cryptographic and security protocols such as Secure Multi-party Computation can be difficult and also add a lot of overhead to computational processes. But researchers at Microsoft Research have now been able to develop cryptographic protocols that are developer-friendly, efficient and that work at scale with acceptable impact on performance. Join us as we talk to Dr. Divya Gupta and Dr. Rahul Sharma about their work in making cryptography easy to use and deploy.

Dr. Divya Gupta is a senior researcher at Microsoft Research Lab. Her primary research interests are cryptography and security. Currently, she is working on secure machine learning, using secure multi-party computation (MPC), and lightweight blockchains. Earlier she received her B.Tech and M.Tech in Computer Science from IIT Delhi and PhD in Computer Science from University of California at Los Angeles where she worked on secure computation, coding theory and program obfuscation.

Dr. Rahul Sharma has been a senior researcher at Microsoft Research Lab India since 2016. His research lies at the intersection of Machine Learning (ML) and Programming Languages (PL), and can be classified into the two broad themes of “ML for PL” and “PL for ML”. In the former, he has used ML to improve the reliability and efficiency of software. In the latter, he has built compilers to run ML on exotic hardware like tiny IoT devices and on cryptographic protocols. Rahul holds a B.Tech in Computer Science from IIT Delhi and a PhD in Computer Science from Stanford University.

Click here for more information on Microsoft Research’s work in Secure Multi-party Computation and here to go to the GitHub page for the project.


Transcript

Divya Gupta: We not only make existing Crypto out there more programmable and developer friendly, but we have developed super-duper efficient cryptographic protocols which are tailored to ML, like the secure machine learning inference task, and work for large machine learning benchmarks. So before our work, the prior work had three shortcomings I would say. They were slow. They only did small machine learning benchmarks and the accuracy of the secure implementations was lower than the original models. And we solved all three challenges. So our new protocols are at least 10 times faster than what existed out there.

[Music]

Sridhar: Welcome to the Microsoft Research India podcast, where we explore cutting-edge research that’s impacting technology and society. I’m your host, Sridhar Vedantham.

[Music]

Ensuring security and privacy of data, both personal and institutional, is of paramount importance in today’s world where data itself is a highly precious commodity. Cryptography is a complex and specialized subject that not many people are familiar with, and developing and implementing cryptographic and security protocols such as Secure Multi-party Computation can be difficult and also add a lot of overhead to computational processes. But researchers at Microsoft Research have now been able to develop cryptographic protocols that are developer-friendly, efficient and that work at scale without serious impact on performance. Join us as we talk to Dr. Divya Gupta and Dr. Rahul Sharma about their work in making cryptography easy to use and deploy.

Sridhar Vedantham: Alright, so Divya and Rahul, welcome to the podcast. It’s great to have you guys on the show and thank you so much. I know this is really late in the night so thank you so much for taking the time to do this.

Divya Gupta: Thanks Sridhar for having us. Late is what works for everyone right now. So yeah, that’s what it is.

Rahul Sharma: Thanks Sridhar.

Sridhar Vedantham: Alright, so this podcast, I think, is going to be interesting for a couple of reasons. One is that the topic is something I know next to nothing about, but it seems to me from everything I’ve heard that it’s quite critical to computing today, and the second reason is that the two of you come from very different backgrounds in terms of your academics, in terms of your research interests and specialities, but you’re working together on this particular project or on this particular field of research. So let me jump into this. We’re going to be talking today about something called Secure Multi-party Computation or MPC.  What exactly is that and why is it important?

Divya Gupta: Right, so Secure Multi-party Computation, popularly known as MPC as you said, is a cryptographic primitive which at first seems completely magical. So let me just explain with an example. Let’s say you, Sridhar, and Rahul are two millionaires and you want to know who has more money, or who’s richer. And you want to do this without revealing your net worth to each other, because this is private information. At first this seems almost impossible. As in, how can you compute a function without revealing the inputs of the function? But MPC makes this possible. What MPC gives you is an interactive protocol in which you and Rahul will talk to each other back and forth, exchanging some random-looking messages. And at the end of this interaction you will learn the output, which is who is richer, and you will learn the output alone. MPC comes with strong mathematical guarantees which say that at the end of this interaction only the output is revealed, along with anything which can be deduced from the output, but nothing else about the inputs. So in this example, Sridhar, you and Rahul will both learn who is richer. And let’s say you turn out to be richer. Then of course from this output you would know that your net worth is more than Rahul’s, and that’s it. You will learn nothing else about Rahul’s net worth. So this is what MPC is. This example is called the Millionaire’s Problem, where the function is very simple. You’re just trying to compare two values, which are the net worths. But MPC is much more general. Just going into a bit of history, I would say that MPC can compute any function of your choice on secret inputs. This result in fact was shown as early as the 1980s, and this power of MPC, of being able to compute any function securely, got many people interested in this problem. So a lot of work happened and people kept coming up with better and better protocols which were more efficient.
So when I say efficient, some of the parameters of interest are the amount of data being sent in the messages back and forth, the number of messages you need to exchange, and also the end-to-end latency of the protocol, that is, how much time it takes to compute the function itself. And people kept coming up with better and better protocols. Finally, the first implementations came out in 2008, and since then people have evaluated a few real-world examples using MPC. One example which I found particularly interesting is a social study which was done in a privacy-preserving manner using MPC in Estonia in 2015.

So, the situation was as follows.

Along with the boom in information and communication technology, it was observed that more and more students were dropping out of college without finishing their degree. And the hypothesis going around was that the students, when they are studying in the University, get employed in IT jobs and they start to value their salaries more than their University degree and hence drop out. But a counter hypothesis was that it is because IT courses are gaining popularity, more and more students are enrolling into it and find it hard and drop out. So the question was, is working during studies in IT jobs correlated with high dropout rate?

And to answer this question, a study was proposed to understand the correlation between early employment of students in IT jobs while being enrolled in University and high dropout rate. Now this study can be done by taking the employment records in the tax department and the enrollment records in the education department and just cross-referencing this data. But even though all of this data is there with the government, it could not be shared in the clear between the two departments because of legal regulations. The way they solved this problem was by doing a Secure Multi-party Computation between the Ministry of Education and the tax board.

So this, I feel, is an excellent example which shows that MPC can help solve real problems where data sharing is important but cannot be done in the clear.
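As a rough illustration of how "random looking messages" can still add up to a useful answer, here is a minimal sketch of additive secret sharing used to compute a sum. This is a swapped-in toy technique, not the protocol used in the Estonian study or in the speakers' own work, and it computes a sum rather than the comparison in the Millionaire's Problem, which needs more sophisticated protocols. The function names, the modulus and the input values are all assumptions made for the example, and the sketch assumes every party follows the protocol honestly.

```python
import secrets

MODULUS = 2**64  # illustrative choice: all arithmetic is done modulo this value


def share(value, n_parties):
    """Split value into n_parties shares that sum to value modulo MODULUS.

    Any subset of fewer than n_parties shares is uniformly random,
    so on its own it reveals nothing about value.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_inputs):
    """Simulate all parties in one process (a real run would use a network)."""
    n = len(private_inputs)
    # Step 1: each party splits its input and sends share j to party j.
    all_shares = [share(x, n) for x in private_inputs]
    # Step 2: party j adds up the shares it received.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Step 3: the partial sums are published; only their total is meaningful.
    return sum(partial_sums) % MODULUS


net_worths = [5_000_000, 7_250_000, 1_100_000]  # made-up private inputs
total = secure_sum(net_worths)
print(total)
```

With three or more parties, each participant sees only random-looking shares and the final total, which matches the guarantee described above: the output is revealed, plus whatever can be deduced from it, and nothing else.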

Sridhar Vedantham: OK. Rahul was there something you wanted to add to that?

Rahul Sharma: Yes, Sridhar. So if you look at what is happening today, data is being digitized. Financial documents, medical records, everything is being digitized, so we are getting, you can say, a flood of data becoming available. The other thing which has happened in computer science is that we now have very, very powerful machine learning algorithms and very powerful hardware which can run these machine learning algorithms on this huge amount of data. And so machine learning people have created, for example, machine learning models which can beat human accuracy on tasks in computer vision. Computer vision is basically that you have an image and you want to find some pattern in that image. For example, is the image of a cat or a dog? And now we have classifiers which will beat humans on such tasks. The way these machine learning classifiers work is they use something called supervised machine learning, which has two phases. One is a training phase and one is an inference phase. In the training phase, machine learning researchers curate the data, they collect the data and they throw a lot of hardware at it to generate a powerful machine learning model. And then there is an inference phase in which new data points come in and the model labels or makes predictions on these new input data points. Now, after you have gone through this expensive training phase, the companies or the organizations who do this want to monetize the model which they have obtained. If you have to monetize this model, then you have two options. One is that you can just release the model to the clients, who can download the model and run it on their private data.

Now, if they do this, then first of all, the company is not able to monetize the model because the model has just been given away, and second, the privacy of the training data which was used to generate this model is lost, because now someone can look at the model and try to reverse engineer what the training data was. So this is not a good option. Another option is that the organization can host the model as a web service and then the clients can send their data to the company for predictions. Now, this is also not a good option because, first of all, clients will have to reveal their sensitive data to the organization holding the model, and moreover the organization itself would not like to have this client data because it is just red hot, right? If they hold client data and there is a data breach, then there are legal liabilities. So here we have a situation where there is an organization which has its own proprietary model, and we have clients who have their sensitive data, and these two parties don’t want to reveal their inputs to each other. But still, the organization wants to provide a service in which the client can give the data, receive predictions, and in exchange for the predictions, the client can give some money to the organization. And MPC will help achieve this task.

So what I think is that MPC will enable machine learning to reach its full potential, because machine learning is always hampered by the issues of data privacy, and with MPC combined with machine learning, the data privacy issues can be mitigated.

Sridhar Vedantham: Interesting, that’s really interesting. Now obviously this sounds like a great thing to be able to do in this day and age of the Internet and machine learning. But it sounds to me that, given that you have so many people from the research community working on it, there have got to be certain challenges that you need to first overcome to make this practical and usable, right? Why don’t you walk me through the issues that currently exist with implementing MPC at scale?

Rahul Sharma: What you said, Sridhar, is exactly correct. So there are three issues which come up. They are summarized as efficiency, scalability, and programmability. So what is efficiency? The thing is that if you have a secure solution, it is going to be slower than an insecure solution, because the insecure solution is not doing anything about security. When implementing a secure solution, you are doing something more to ensure the privacy of data, and so there is going to be a performance overhead. That’s the first issue: we want the MPC protocols to have a bearable overhead, which is what Divya said, that people have been working for decades to bring that overhead down.

The second is that machine learning models are becoming bigger and bigger and more and more complicated. So what we want to do is take these MPC protocols and scale them to the level of machine learning which exists today. And the third challenge, which I believe is the most pressing challenge, is that of programmability. So when we think of these MPC protocols, who is going to implement them at the end of the day? If it is a normal developer, then we have a problem because normal developers don’t understand security that much. There was a case in which there was a web forum post in which a person said that, “Oh, I need to ship a product. I’m going to miss the deadline. I’m getting all these security warnings. What should I do?”. And a Good Samaritan came in and said, “Oh you are calling this function with value one. Just call it with the value 0 and the error should go away.” And then the developer replied, “Great. Now I’m able to ship my product. All the warnings have gone away. You saved my life,” and so on. Now in switching that one to zero, what happened was the developer switched off all security checks, all certificate checking, all encryption, everything got switched off, so MPC protocols can be good in math, but when given to normal developers, it’s not clear whether normal developers will be able to implement these MPC protocols.
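The forum anecdote does not name the library or function involved, so the following is only an analogous sketch using Python's standard `ssl` module as a stand-in. It shows how a two-line change that silences certificate errors also switches off the checks that made the connection trustworthy in the first place. The variable names are assumptions made for the example.

```python
import ssl

# Default client context: certificate and hostname verification are on.
safe = ssl.create_default_context()
print(safe.verify_mode == ssl.CERT_REQUIRED, safe.check_hostname)

# The "make the warnings go away" change: both checks are switched off.
unsafe = ssl.create_default_context()
unsafe.check_hostname = False       # must be disabled before CERT_NONE is allowed
unsafe.verify_mode = ssl.CERT_NONE  # the peer's certificate is no longer checked
print(unsafe.verify_mode == ssl.CERT_NONE, unsafe.check_hostname)
```

A connection made with the second context still encrypts traffic, but it will accept any certificate from any server, which is the kind of silent downgrade the story warns about.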

Divya Gupta: Actually, I would like to chime in here. So what Rahul said is a great story and rather an extreme one. But I, as a cryptographer, can vouch for the fact that cryptography as a whole field is mathematically challenging and quite subtle. Many times even we experts come up with protocols which on the face of it look secure and seem to have no issues at all. But as soon as we start to dive deeper and try to write security proofs for the protocol and so on, we see that there are big security vulnerabilities which cannot be fixed. So I cannot stress enough that when it comes to Crypto, it is very, very important to have rigorous proofs of correctness and security. Even small, tiny tweaks here and there which look completely harmless can break the whole system. So it is completely unreasonable to expect people or developers who have had no formal training in Crypto or security to be able to implement these crypto protocols correctly and securely. And this in fact, we feel, is one of the biggest technical challenges in deploying MPC for real-world applications.

Sridhar Vedantham: Interesting, so I’ve got a follow-up question to something that you both just spoke about. Obviously, the cryptographer brings in the whole thing about the security and how to make secure protocols, and so on and so forth. What does the programming languages guy or the ML person bring to the table in this scenario?

Rahul Sharma: Yeah, so I think that’s a question for me since I work in the intersection of compilers and machine learning. So if I put my developer hat on and someone tells me to implement the MPC protocols written in these papers, I will be scared to death. I’m pretty sure I will break some security thing here or there. So I think the only way to get secure systems is to not let programmers implement those secure systems. What we want to do is build compilers, which are automatic tools that translate programs from one language to another, so that programmers write their normal code without any security, like they are used to writing, and then the compiler does all the cryptography and generates MPC protocols. So this is where a compiler person comes in, to make the system programmable by normal programmers.

[Music]

Sridhar Vedantham: OK, so let’s be a little more specific about the work that both of you have actually been doing over the past few years, I guess. Could you talk a bit about that?

Rahul Sharma: So, continuing on the compiler part of the story. First we built compilers in which developers can write C-like code, and we could automatically generate secure protocols from it. This gives a lot of flexibility because C is a very expressive language and you can do all sorts of different computations. But then we realized that the machine learning people don’t want to write in C. They want to write in their favorite machine learning frameworks like TensorFlow, PyTorch and ONNX. So what we did is build compilers which take machine learning models written in TensorFlow, PyTorch or ONNX and compile them directly to MPC protocols, and the compilers which we built have some good properties. First of all, they’re accuracy preserving, which means that if you run the insecure computation and get some accuracy, then when you run the secure computation you get the same accuracy. Now, this was extremely important because these machine learning people care for every ounce of accuracy. They can live with some computational overhead because of security, but if they lose accuracy, that means the user experience gets degraded and they lose revenue. That is just a no go. So our compiler ensures that no accuracy is lost in doing the secure execution. Moreover, the compiler also has some formal guarantees, which means that even if the developer inadvertently does something wrong which can create a security leak, the compiler will just reject the program. So developers can be confident when they use our framework that if they have written something and it compiles, then it is secure.

Divya Gupta: So as I think Sridhar already pointed out, this is a project which is a great collaboration between cryptographers and programming languages folks. We not only make advances on the programming languages front, but also on the cryptography side. So we make progress on all three challenges which Rahul mentioned before, which are efficiency, scalability and programmability. We not only make existing Crypto out there more programmable and developer friendly, but we have developed super-duper efficient cryptographic protocols which are tailored to ML, like the secure machine learning inference task, and work for large machine learning benchmarks. So before our work, the prior work had three shortcomings, I would say. They were slow. They only did small machine learning benchmarks and the accuracy of the secure implementations was lower than the original models. And we solved all three challenges. So our new protocols are at least 10 times faster than what existed out there. We run large ImageNet-scale benchmarks using our protocols. The ImageNet data set is a standard machine learning classification task where an image needs to be classified into one of a thousand classes, which is hard even for a human to do. For this task we take the state-of-the-art machine learning models and run them securely. And these models are again at least 10 times larger than what the prior works did securely. Finally, all our secure implementations, in fact, match the accuracy of the original models, which is very important to ML folks. And all of this could not have been possible without our framework, which is called CryptFlow, which again would not have been possible without a deep collaboration between cryptographers and programming languages folks.

So this, I think, summarizes well what we have achieved in the last few years with this collaboration.

Sridhar Vedantham: That’s fantastic. Rahul, you wanted to add to that?

Rahul Sharma: I want to add a little bit about the collaboration aspect, which Divya mentioned. So this project was started by Nishanth Chandran, Divya, Aseem Rastogi and me at MSR India, and all of us come from very different backgrounds. Divya and Nishanth are cryptographers, I work at the intersection of machine learning and programming languages, and Aseem works at the intersection of programming languages and security. Since all of us came together, we could solve applications or scenarios with MPC much better, because given a scenario, we could figure out whether we should fix the compiler or fix the cryptography. Our meetings are generally sword fights. We would fight for hours on very, very simple design decisions, and the final design we came up with is something which all of us are very happy with. This wouldn’t have been possible if we did not have our hard-working Research Fellows, and we had a fantastic set of interns who worked on this project.

Sridhar Vedantham: Fantastic, and I think that’s a great testament to the power of interdisciplinary work. And I totally can buy what you said in terms of sword fights during research meetings. Because, while I’ve not sat through research meetings myself, I have certainly attended research reviews, so I can completely identify with what you’re saying from what I’ve seen myself. Alright, so there’s one thing that I wanted to clarify for myself, and I think for the benefit of a lot of people who would be listening. When you say things like the complexity decreases and we can run things faster and the overheads are less and so on, these concepts sound fairly abstract to people who are not familiar with the area of research. Could you put a more tangible face to it? When you’re saying that we reduce overheads, is there a certain percentage, or can you give it in terms of time and so on?

Divya Gupta: Right, so when you talk about the efficiency of our protocols, we measure things like their end-to-end runtimes, that is, how much time it takes for the whole function to run securely, and this depends on things like the amount of data being transferred in the messages which are being exchanged between the different parties. So just to take an example from our latest paper, to appear at CCS this year, we built new protocols for the simple Millionaire’s Problem, which I described in the very beginning. And there we have almost a 5X, five times, improvement in just the communication numbers. And this translates to runtimes as well. Now, this Millionaire’s protocol is a building block for our other protocols. So in a machine learning task, let’s say there is a neural network. A neural network consists of linear layers, which look like matrix multiplications or convolutions, and also some nonlinear operators which are, let’s say, rectified linear units (or ReLU) or MaxPool etc. And in all of these nonlinear layers you have to do some kind of comparison on secret values, which essentially boils down to doing some kind of Millionaire’s Problem. So whatever improvements we got in the simplest setting of Millionaire’s translate to these more complicated functions as well. In fact, our improvements for the more complicated functions are much better than for just the Millionaire’s, and there we have almost 10 times improvement in the communication numbers. And when you’re actually running these protocols over a network, communication is what matters the most, because compute is local, you can parallelize it, you can run it on heavy machines and so on, but communication is something which you essentially cannot make go faster. So all our protocols have been handmade and tailored to the exact setting of functions which occur in neural networks. And we improve the communication numbers and hence the other parameters such as the runtimes as well.
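The claim that ReLU and MaxPool reduce to comparisons can be checked on plain, non-secret numbers. In the minimal sketch below, `greater_than` is a hypothetical stand-in for a secure comparison protocol. Here it is an ordinary plaintext check with a counter, so the code shows only the structure of the reduction and provides no security at all.

```python
comparisons = 0  # counts how many times the comparison primitive is invoked


def greater_than(a, b):
    """Stand-in for a secure comparison; here just a plaintext check."""
    global comparisons
    comparisons += 1
    return 1 if a > b else 0


def relu(x):
    # ReLU(x) is x when x > 0 and 0 otherwise: x times one comparison bit.
    return x * greater_than(x, 0)


def maximum(a, b):
    # max(a, b) = b + ReLU(a - b), so it costs one comparison as well.
    return b + relu(a - b)


def maxpool(window):
    # Fold the window pairwise; n values need n - 1 comparisons.
    result = window[0]
    for value in window[1:]:
        result = maximum(result, value)
    return result


activations = [relu(v) for v in [-3, 0, 2, 7]]
pooled = maxpool([4, -1, 9, 2])
print(activations, pooled, comparisons)
```

Every nonlinear operation here goes through `greater_than`, which is why a cheaper comparison protocol carries over directly to cheaper ReLU and MaxPool layers.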

Sridhar Vedantham: OK, thanks for that. It certainly makes things a little clearer to me. Because to me a lot of this stuff just sounds very abstract unless I hear some actual numbers or some real instances where these things impact computation and actual time taken to conduct certain computations.

Divya Gupta: Right, so just to give you another example, take the ImageNet classification task which I talked about. We took state-of-the-art models there and our inference runtime end to end was under a minute. So it doesn’t run in seconds, but it definitely runs in under a minute, so it is still real, I would say.

Sridhar Vedantham: Right, so Divya, thanks. I mean, that certainly puts a much more tangible spin on it, which I can identify with. Are there any real-life scenarios in which you see MPC bringing benefits to people or to industry etc.? Right now, you know, in the near term.

Rahul Sharma: So Sridhar, I believe that MPC has the potential to change the way we think about healthcare. Think of, for example, a hospital which has trained a model that, given a patient image, can tell whether the patient has COVID or pneumonia, or whether the patient is alright. Now, the hospital can host this model as a web service, and what I can do is go to my favorite pathology lab, get a chest X-Ray done, and then do a multi-party computation with the hospital. My sensitive data, which is my chest X-Ray images, will not be revealed at all to the hospital, and I will get a prediction which can tell me how to go about the next steps. We have actually run this task with MPC protocols and it runs in a matter of a minute or two. So, a latency which is quite acceptable in real life. Among the other applications which we have looked at, one is detecting diabetic retinopathy from retina scans. We have also run machine learning algorithms which can give you state-of-the-art accuracies in detecting about 14 chest diseases from X-Ray images, and the most recent work which we have done is in tumor segmentation. There what happens is that the doctor is given a 3D image and the doctor has to mark the boundary of the tumor in this 3D image. So it is like a volume which the doctor is marking. Now this is a very intensive process and takes a lot of time, and one can think of training a machine learning model which can help the doctor do this task: the machine learning model will mark some boundary and then the doctor can just fine tune the boundary or make minor modifications to it and approve the boundary. We already have machine learning algorithms which can do this, but then again patients will be wary of giving their 3D scans to the model owners.
So what MPC again can do is let them do this task securely without revealing the 3D scan to the organization which owns the machine learning model, and this we can do in a couple of hours. And to put things in perspective, doctors usually get to a scan in a matter of a couple of days. So again, this latency is acceptable.

Divya Gupta: So another domain of interest for MPC is potentially finance. We all know that banks are highly secretive entities, for the right reasons, and they cannot and do not share their data even with other banks. This makes many tasks quite challenging, such as detecting fraudulent transactions and detecting money laundering, as the only data available is the bank’s own data and nothing else. What MPC can enable is that the banks can pool their data and do fraud detection and detection of money laundering together on all the banks’ data, and at the same time no bank’s data would be revealed in the clear to any other bank. So all this can happen securely and still you can reap the benefits of pooling the data of all the banks. And in fact, money laundering actually works by siphoning money through multiple banks, so you indeed need the data of all the banks. What I’m trying to get at is that the power of MPC is very general, and as long as you and I have some secret data which we do not want to reveal to each other, but at the same time we want to pool this data together and compute some function jointly so that it benefits us both, MPC can be used.

Sridhar Vedantham: So this sounds fantastic and it also sounds like there’s a huge number of areas in which you can actually deploy and implement MPC, and I guess it’s being made much easier now that you guys have come up with something that makes it usable, which it wasn’t really earlier. So, are the research findings and the research work that you guys have done, is it available to people outside of Microsoft? Can the tech community as such be able to leverage and use this work?

Divya Gupta: Yes, actually, fortunately all of our protocols and work have been published at top security conferences and are available online, and all the code is also available on GitHub. So if you have a secure inference scenario, you can actually go out there and try this code and code up your application.

Sridhar Vedantham: Excellent, so I think what we’ll also do is provide the links to resources that folks can access in the transcript of this podcast itself. Now, where do you guys plan to go with this in the future and what are your future research directions, future plans for this particular area?

Rahul Sharma: So, going back to machine learning. As I said, there are two phases. There’s a training phase and there is the inference phase, and we have been talking mainly about the inference phase till now, because that is what we have focused on in our work. But the training phase is also very important. Suppose there are multiple data holders, for example multiple hospitals, and they want to pool their data together to train a joint model. But there can be legal regulations which prohibit them from sharing data indiscriminately between each other. So then they can use MPC to train a model together. I’ve heard bizarre stories where nurses will sit down with a permanent marker and just redact documents, and where legal agreements take years to get through. MPC provides a technological solution to do this multi-party training.

Divya Gupta: So, we live in a world where security is a term which gets thrown around a lot without any solid backing. And to make MPC real, we feel that we have to educate people and businesses about the power of MPC and what security guarantees it can provide. So as an example, let’s take encryption. I think most people, businesses and even the law understand what encryption is and what guarantees it provides, and as a result, most real-world applications use end-to-end encryption. But suppose I ask a person the following: there are two parties who have secret inputs and they want to compute some function by pooling their inputs, how do I do this? The most likely answer I would get is that the only solution possible out there is to share the data under some legal NDAs. Most people simply don’t know that something like MPC exists. So I’m not saying that MPC would be as omnipresent as encryption, but with this education we can put MPC on the table, and people and businesses can think of MPC as a potential solution to security problems. And in fact, as we talk to more and more people and educate them about MPC, new scenarios are discovered which MPC can enable. Moreover, with regulations like GDPR which are aimed at preserving privacy, and also bigger and bigger ML models which need more and more data for more accuracy, we feel that MPC is a technology which can resolve this tension.

Sridhar Vedantham: Excellent, this has been a really eye-opening conversation for me and I hope the people who listen to this podcast will learn as much as I have during this. Thank you so much, Divya and Rahul. Once again, I’m just going to say that I know it’s really late and I totally appreciate your time.

Divya Gupta: Thanks Sridhar, thanks a lot for having us here.

Rahul Sharma: Thanks Sridhar, this was fun.

[Music Ends]