Dr. Igor Perisic – Chief Data Officer

Episode 11, February 7, 2018



Igor Perisic: Fundamentally, at the core of everything, you want an interaction to be very natural. Whether the design is superb, and it fits exactly the way that you would anticipate it, it just feels natural. And that’s exactly where our field comes into play. How do you make that thing natural? It’s not just a design perspective, but it’s also, what is the content that you’re showing? How are you going there? So, it’s transforming that experience, instead of just making it better.

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research, and the scientists behind it. I’m your host, Gretchen Huizinga.

Big data is a big deal, and if you follow the popular technical press, you’ll have heard all the metaphors: data is the new oil, the new bacon, the new currency, the new electricity. It’s even been called the new black. While data may not actually be any of these things, we can say this: in today’s networked world, data is increasingly valuable, and it is essential to research, both basic and applied.

Today, we welcome a special guest to the podcast. Dr. Igor Perisic is the Vice President of Engineering and Chief Data Officer at LinkedIn, the social network for business and employment. Today, Dr. Perisic talks about the key attributes of a data scientist, how AI and machine learning are helping personalize member experiences, why we should all be big open source fans, and how LinkedIn is partnering with other researchers through their innovative Economic Graph program to “create economic opportunity for every member of the global workforce.” That and much more on this episode of the Microsoft Research Podcast.

Host: Igor Perisic, it’s great to have you here on the podcast joining us via Skype from Mountain View. Welcome.

Igor Perisic: Thank you very much.

Host: Yeah. Igor, you’re the VP of Engineering at LinkedIn, and you’re also the Chief Data Officer. So, in broad strokes, give us an overview of what you do, and define your responsibilities in each role.

Igor Perisic: So, I’m an engineer. I build things. I build the data systems, the infrastructure, that powers the back end of LinkedIn, and the offline systems upon which we can manipulate our data. And I’m a scientist in the sense that I build also the algorithms and the personalizations that we can provide to our members using those systems, to make the experience better for members. So, the Chief Data Officer is bridging these two, and has a little bit of a component around policy, in the sense of making sure that on one side, the security, engineering and legal are talking together to address the problems that can arise from there.

Host: LinkedIn has been both a thought leader and early adopter in the field of data science. Talk a little bit about that field in general, and, more specifically, its history and current role at LinkedIn.

Igor Perisic: Well, so history. We can go far, far back. I mean, the terms “data” and “science” have been defined, let’s say, in the English dictionary, somewhere around the 17th century. But fundamentally, I think it had another change of direction somewhere within the last ten years. It became somewhere around 2007 or 8-ish, which we worked at LinkedIn, and D.J. Patil, who was my product counterpart at that time, somewhat coined a term within the, let’s say, more recent generation of it, which was more about an individual who would really do five things: who could hack code, create code, who can reason around data and see the product that it has inherently within it, and hack the product for it. An individual who was really good about machine learning, so it can actually create those algorithms. An individual that could communicate. So, you can see the story. You can, maybe, write it up. But you can communicate that to somebody else, and not just towards somebody who’s very knowledgeable about it. And a great engineer who would build infrastructure to it. Because you’ve got to remember that 5 or 6, 7 years ago, the infrastructure wasn’t there. So, you have to build everything. Today, I see it migrating back more towards the ability to create the story, the ability to see the pattern through the data, and then how it can actually help a product.

Host: And the term “data scientist” is also a very connotative term right now.

Igor Perisic: Yeah, I think originally it was because we created something that was different. And we gave it those two terms. Then, later on, as with everything, something becomes very hot and sexy, as a career. So, then everything becomes “data science.”

Host: Right.

Igor Perisic: So, we do the same thing as we call it, a relevance engineer is the same thing as we call it, data scientist, or that we would call it just researchers, research engineers, data engineers. They all migrate around the same topics, so certainly.

Host: Now, you just mentioned a term, “relevance,” which is a term I’ve seen along and around your website. What does “relevance” mean?

Igor Perisic: Actually, that’s a good question. It’s certainly something that’s very specific at LinkedIn, where we move towards being much more data-informed in the way that we build products and we take decisions through time. And then bringing all this optimization, techniques… we didn’t want to call it like optimizers, optimize this, optimize that… but making things more relevant. And that’s where we sort of keyed around the term “relevance,” as the science to make things more relevant, which is fundamentally to make something more personal to an individual.

Host: I love that. And I think that resonates with a lot of people. So, let’s move over to the mission of LinkedIn and it’s, to quote, “giving economic opportunity to every member of the global workforce.” So, tell us what role data science and research play in making that mission happen.

Igor Perisic: Well, we do have a fundamental role at the core, because when you create economic opportunity for every member of the global workforce, you’re creating that economic opportunity for an individual. And that individual is always unique. Certainly, from, let’s say, a statistical perspective, it’s part of a bigger bucket. But at the core, that opportunity is unique for that individual. And in that situation, it means that you need to actually build up your products or build up your experiences to that individual, and to tailor it to him or her. And that’s exactly where our field comes into play. You can create broad-scope experiences, but when it starts getting into this personalization aspect, that’s where, exactly where machine learning comes in, or AI.


Host: So, in a recent post on your engineering blog last year, it was called “Celebrating Research Excellence in LinkedIn.” You gave a shout out to your researchers for 3 things. One of course is making your products better. But you also referred to contributing to the open source community, and producing world-class research, peer-reviewed research. So, why is the work of your researchers important to the scientific community at large, and vice versa, I would say?

Igor Perisic: [laughs] I grew up with an ideal of science. And I, um, there was something that resonated to me from a very, very long time. And I don’t know why. You know, you grow up and there’s some sentences that stick with you. And one that really stuck with me from very early on was, “Standing on the shoulder of the giants.” And the reason for that is I could see the broad scientific community as just one community. And however great you were, you were standing on somebody’s shoulders. And sooner or later, somebody would stand on your shoulders. So, it was never something that it was done for yourself. It was something that was done for the greater community and leveraging all that knowledge and moving it forward. So, then of course, within that scope, contributing back is very critical and very important on two things: one, well, if you have the next generation, the next individuals; and also, a measure of quality. You may think that what you’ve done is great, but it may be completely wrong. So, once you expose it, it’s when you get that feedback that you can actually learn. Being right or wrong is not necessarily fundamentally the most valuable thing, compared to just the experience of learning.

Host: So, there’s been always a bit of tension in the research community between proponents of applied research and basic research, especially in industry. What’s your philosophy about the role of basic research in industry, where the ROI may be years out instead of immediate, and what role does basic research play at LinkedIn, a company whose product research is basically embedded?

Igor Perisic: So, I’m a big proponent of research, whether it’s basic or applied. And it’s all a matter of context. I believe that if you only limit yourself to just applied research, you tend to sort of focus into just one area, and just dig in more and more and more and more. And you’re preventing yourself from actually seeing this paradigm shift that can occur. Basic research has this goodness, or this overall perspective of, that there’s something that’s extremely challenging, and there’s something that advances our understanding, and in itself is valuable. The application will follow or will not follow, but it will spark ideas. Within the industry, it’s actually really hard to do just basic research. It’s just a matter of whether you have the ability to do so. In the end, the majority of companies are, need to generate a certain amount of revenue, so you have to be able to actually balance the two. There, Microsoft, I think it was very, very good at balancing the basic as well as applied research. At LinkedIn when we started, we were a startup, so you can’t really do basic research or fundamental research and go to, let’s say, the CEO at that time, in your group of 5 or 6, and work on something and, “Don’t worry about it, in 200 years, it will make a difference.” Because you don’t know whether the company is going to be there two or three years down the road. So, you have to play within it, but still have a very good perspective and overall view that those two worlds are sort of intertwined. You can see the work that Microsoft has done through the years. Fundamentally, you can say, “Well, at which point of time does, let’s say, um, quantum computing become basic to applied?” Well, at the beginning, it was certainly basic. And today, it’s almost applied. Although, the window is very far ahead.

Host: Yeah.

Igor Perisic: It’s not happening tomorrow, but it is happening.

Host: So, let’s talk about research cycle for a second, specifically at LinkedIn. What does it look like time-wise and outcome-wise for you, and how is it similar and different from maybe some other things? Because you have a really unique approach to research at LinkedIn.

Igor Perisic: Yeah. Our cycles are much shorter. Like, for example, the thing that we’ve published, I made it sure that, very early on, that publications would be about things that we’ve developed and shipped on the site and had affected our members’ experiences, compared to things that we’ve developed in an offline environment on some subqueries and whatnot, and it moved some metric. It had to have gone all the way in. So, in this case, our cycles are different in a sense that the end value is around whether it actually went all the way down to the site. Now, tying into what you said, there’s also another cycle. It’s, how long do you research before you make a difference? And there, our window at LinkedIn is I think most often, except for infrastructure, where it takes time to develop, it’s probably maybe 1 or 2 years out. So, it’s certainly not Microsoft Research.

Host: Right. So, what’s the relationship between the researchers at, say, Microsoft Research now, and LinkedIn, now that you’re sort of part of the same family?

Igor Perisic: Well, it’s in multiple dimensions. One is, in some places, we were, let’s say for example, the neural networks, and using CNNs and GPUs, Microsoft was far ahead of the curve, and then we’re benefiting from the learnings from it. For example, how to set up the topology of our clusters, how do we set up the topology of our, let’s say, rankers for it. So, then we’re benefiting from knowledge that somebody else has done, if you want. And there’s also another side as well. We have a different perspective at times, and then it’s just sharing it, as I mentioned a couple of times here, that science advances, or research advances, when you bridge those different types of connections with different types of fields. So, we bring another perspective. We’ve been interacting very closely with the New England team, who has been our, let’s say, partner and front door to the rest of Microsoft Research. As you can expect, there’s lots of interest for Microsoft Research. It’s good that it’s a little bit centralized, so we can actually navigate around all those expectations and demands.


Host: So, you talk about the importance of what you call “conversations” with your members. And there are multiple, multi-level conversations going on at any given time. So, from a technical point of view, how do you stay on top of those and make sense of those conversations in a meaningful way?

Igor Perisic: I think it’s a new perspective that I’ve looked at it, and you tend to think that once you’ve found out this new paradigm, the way that you’re going to communicate about it, it makes sense, and you wonder why you didn’t see it before. And probably in 5 to 10 years, I would look at myself and what a complete idiot I was. But that’s the nature of things. Trying to find a model that can help you reason about a problem that you have. In this situation, it’s about conversations in a sense that very early on, we try to do things with our members, get our members to do things. And in this case, a conversation is really an informal chat, if you want, between two individuals, and it’s between with let’s say LinkedIn and the member. And early on, what we’ve done is say, well, we just blasted things. We just sent you emails like there’s no tomorrow. We did very good optimization saying that, well, the more email you get, the more likely you’re going to do something. Which is probably wrong.

Host: Not me.

Igor Perisic: Uh. My point. But apparently, somebody had done a study, and that seemed to be good. Of course, when you look at high-level statistics, it seems to be fine. But the conversation was more like me screaming at you. And then when you start shifting by saying, well, there’s a lot of conversation that we can have, so which one should I communicate to you, like, right now? And today, or in the recent past, we’ve moved more into thinking about, well, this conversation doesn’t, don’t stop just with the action. That there’s some sense of, what are you doing at LinkedIn, and what do you want to achieve at LinkedIn? And those goals don’t stop tomorrow, don’t stop on the click, don’t stop on the view, don’t stop on the share. They go for a longer period of time. And the problem then starts shifting, because very often the techniques would be, I need to find, let’s say, a way to – use some different techniques – to figure out what is your likelihood to act on something? It’s a probability. And in that case, whether you do logistic regressions, whether you do different types of all the way up to uh, neural nets, you come up with a number. And that number, and then you optimize around it. And it’s usually just about that activity. Now, if you view it in the context of a conversation, that activity doesn’t stop right there. It continues. We’re having a dialogue. It’s going to go on for a little while. You can’t just optimize for the next step. So how do you go in and out of that?
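[Editor's illustration: the single-step approach described above, where a model produces a probability that a member acts and the system then optimizes around that number, might be sketched roughly as follows. This is a minimal toy example, not LinkedIn's system: the feature names, weights, bias, and candidate messages are all hypothetical values chosen for illustration. It scores each candidate message with a logistic model and picks the highest, which is exactly the "optimize for the next step" behavior the conversation framing is meant to move beyond.]

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def act_probability(features: dict, weights: dict, bias: float) -> float:
    """Logistic-regression estimate of the likelihood that a member acts."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return sigmoid(z)


# Hypothetical, hand-picked parameters (a real model would learn these from data).
WEIGHTS = {"knows_sender": 1.5, "emails_sent_this_week": -0.3}
BIAS = -1.0

# Hypothetical candidate messages for one member.
candidates = {
    "connection_new_job": {"knows_sender": 1.0, "emails_sent_this_week": 2.0},
    "generic_come_back": {"knows_sender": 0.0, "emails_sent_this_week": 2.0},
}

# One number per candidate; the greedy choice looks only at the next action.
scores = {name: act_probability(f, WEIGHTS, BIAS) for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

[Note that nothing in this sketch accounts for what happens after the click, which is the limitation raised in the answer above.]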

Host: Right.

Igor Perisic: And I felt that it’s the right time to think about it. Although, we had thought about it already some years ago, simply because of the shift that we’re seeing nowadays with more and more of just voice-driven interfaces appearing in a lot of places, which becomes another way to communicate with an individual. Like, in the beginning, it was email, then it was notification, then you have mobile pushes, or pull downs, then you have Windows tiles on the desktop. And it becomes more and more natural, like it’s voice-driven. And if it’s voice-driven, then that, it is becoming a real dialogue. So, once you have that interface that you can leverage to build up your application, how are you thinking about your optimizations? If you’re thinking them still in the logistic regression work, it’s not going to work.

Host: What does work?

Igor Perisic: Well, we’ve taken some early steps a couple of years ago to look more around the quadratic constraints and quadratic programming. It worked for us, that first item, across multiple different types of dialogues that we’re having with members, and they work at, let’s say within a couple of steps. Overall, where that is all going is still research. Just one or two steps ahead.

Host: Right.

Igor Perisic: And we’re making good progress. And it’s going to actually be very, very, interesting to see how those things are actually shifting.

Host: Yeah. Well, it’s an exciting time with so much research in machine learning, and people trying things, to see how it impacts both the technology and the people that they’re working with. So, you said at one point that machine learning actually helps transform, not just inform, but transform your interactions with your members. Is that what you’re talking about here?

Igor Perisic: Fundamentally, at the core of everything, you want an interaction to be very natural. Whether the design is superb, and it fits exactly the way that you would anticipate it, it just feels natural. And that’s exactly where our field comes into play. How do you make that thing natural? It’s not just a design perspective, but it’s also, what is the content that you’re showing? How are you going there? So, it’s transforming that experience, instead of just making it better.

Host: Right. So, I’m a LinkedIn member myself.

Igor Perisic: Thank you.

Host: Yeah. And I got to tell you, I have noticed, over the last year, a difference in how I’m getting notified. And one thing I’ve noticed is that they seem more personal to me? Like, I would get a notification about a person that I know, or care about, as opposed to just feeling like you’re trying to pull me into the app, right?

Igor Perisic: So, that’s exactly what we worked over the last, I would say two years, to make it exactly that, compared to me pinging you, “Hey, come back, come back, come back.” I don’t know why you would come back, but here it is. Here’s like 2,000 reasons for you to come back, pick one… To more, hey, here’s something of value within the context of your interest of LinkedIn, or your value propositions we believe are important. Here’s something that hey, on one side, you ought to know, it would be a good thing, and on the other hand, it’s to share within the network also.

Host: Right. So, I have to say, that makes me happy that the kinds of technologies you’re working on, especially machine-learning technologies that might be helping to broaden your understanding of what I like, what I’m interested in, is actually playing out in the real world for people like me. And I tend to be skeptical and cynical of, you know, high-tech notifications. I don’t like those red notifications on my phone, but anyway…

Igor Perisic: Sure. It’s kind of interesting when you pick up the red thing. We associated red with danger. And it never occurred to me why would it be red. Maybe it’s saying like, oh, you need to do something like right now. But we have so many of them. And that’s why I think we’re ready for that next iteration, that other change. I look at my mailbox, and it has like numbers in the hundreds. From time to time I clear it out. I look at my phone, and it’s interesting to see how some interfaces have decided to keep the number low, and others to keep the number high. And to me, it seems that the ones that keep the number high is those that are still screaming at you, and the ones that keep the number low is to say like, we understand that you have a busy life. We understand that you have a lot of things and a lot of applications on your phone. Let’s make sure that if I put a number out there, it’s something that would be really good for you to know, compared to just, “Come back, come back, come back!”

Host: And that’s a fine line, right? Because you’re competing for people’s attention in an increasingly crowded world. And where do you, you know, stop ALL CAPS, and whisper to…?

Igor Perisic: But it seems to be working, right?

Host: Yeah, for me….

Igor Perisic: No, but we can see it also. Of course, everybody is, to some extent, informed by the metrics of the product performances. And we see that change.

Host: Yeah.

Igor Perisic: Originally, people would be confused about, “Hey, why LinkedIn?” And now it becomes more obvious. And then it becomes more natural, because you actually steer the conversation to the right places.

Host: Right. Which is interesting. This whole topic of conversation that we’re on right now is talking about how technology is actually making me feel more personal towards a particular product. Which is somewhat, sort of, backwards. But I like it. It’s working. Good job.

Igor Perisic: Data science, the terms themselves, were not “glued together” to define what we’re doing less than 7, 8 years ago, let’s say. Granted, there are some others that you can, that were sort of anticipating that it was going to go there. But it’s similar with this sense of personalization or this sense of, it makes sense within the scope of what I do, or who it is. And in the past, we used to call it an augmentation of yourself.

Host: Oh, yeah.

Igor Perisic: An agent that works, that extends and works for you. And it fits naturally within what you do. And similar to data science back then, people would look at you blank and say, “What are you talking about? What does it mean?” And today, you can start seeing it. You say, well, let’s bring all these AI, let’s bring all this technology to make me quote-unquote better. Better as an understanding of what’s happening around me. Better in the way that I can connect, interact with it, augment the way that I can deal with the problems that I need to do at work, and make me better at that.


Host: Most of the researchers that I’ve talked to on this podcast are big “open source” fans, and I think you are too. You’ve published articles, given talks, and you’re on the open source council, or you oversee the open source council at LinkedIn. Why is open source a good thing, especially for researchers?

Igor Perisic: Well, I started by saying that I love science because we all stand on the shoulder of giants. But open source is certainly a movement that has just accelerated. And I have to say that in the beginning that, everybody was wondering, is it going to stick for real? Is it going to stay forever? But it just fit so naturally, the way that we develop things as developers. If you look at any company that has more than two people, eventually there’s some level of abstraction that have built, and people just leverage the code of each other. And open source, it’s just a further extension to reaching out to the entire community. I was always a big proponent, even prior to starting at LinkedIn, because we were just leveraging solutions that other individuals had developed. For example, at LinkedIn, the search engine, originally, the core of it was Lucene, an Apache open source project. Then we felt like, well, since we’re riding on those giants, we should contribute back. And some other individuals will create businesses around it, and that creates a whole ecosystem that becomes that much, much better. Then through it, we stumbled upon truths, some of them being that you actually write much better code when you actually think about open sourcing it, compared to when it’s kept internally. And not only do you write better code, but you have a better way to reason about it. You look at it, does it make sense, does it not make sense? Does it complement what is out there, or does it not complement what is out there? And a lot of goodness comes through that.

Host: So, when we talked before, you mentioned open source code as similar to a peer-reviewed research paper that you’re publishing in some way. Can you explain a little bit about what you think about that?

Igor Perisic: I usually take two allegories about open source. One is, it feels like peer review, because in a sense, you’re writing a piece of software, you’re documenting it, and you’re pushing it out there. But there’s so many publications. And if your paper is not good, nobody is going to actually build other things from it. You’re not going to be cited much. And open source is the same. If you’re not investing into your product to make it great, nobody’s going to actually really leverage from it. And I used to take another one, which is I guess through my life experience. Open sourcing is a little baby also. You cherish it. It grows. But then at some point of time, you’ve got to let it have its own life. And it’s like your kid that grew up and now becomes mature. And they’re going to do something that you may not have envisioned, which is perfect. And open sourcing is the same. You sometimes have the tendency to keep that kid at home. I’d say, “Well, no, let them discover the rest of the world.”

Host: Igor, I just open sourced my daughter at the University of Washington.

Igor Perisic: Congratulations.

Host: I totally get what you mean. Let’s talk about LinkedIn’s relationship with the larger academic and research community through public/private research partnerships. You have a program called The Economic Graph. And it started I think as a challenge, but now it’s an ongoing program. Give us an overview of what it is and how it came about, and what research you focus on.

Igor Perisic: So, the Economic Graph challenge. The idea of letting external individuals from LinkedIn look at our data and answer some extremely pertinent and relevant questions, was there from the time that I joined LinkedIn, and probably even before that. So, how do we create an environment where we preserve that privacy of the individual, and open it up to the community? That was what the Economic Graph challenge was. We basically reached out to the, to global community, friends and family, and others, to say, “Well, if you had LinkedIn’s data, what are the questions that you would want to answer?” And within that, we had more than 200 proposals that came back. We selected, uh, 12. The ideas were just brilliant. They ranged from, how do you think about micro industries and different environments from a geo perspective, like the fashion industry in Milan, or, let’s say, the rise of electric cars in the valley, for example, to, how do women define themselves, or write about themselves on LinkedIn compared to men, if you control for multiple factors? But again, the resources are limited, so you reach out to the community to go….

Host: So, the people that participate in this with you, the academics or researchers, come from all different walks of the research life, and they participate by presenting a proposal and getting access to some of the data from LinkedIn? Is that how that works?

Igor Perisic: Well, originally, we, as I mentioned, we just broadcasted and we got a lot of proposals back. And then we would evaluate on the idea, the interestingness of the idea, but also the ability to execute on that idea. Meaning that you have the ability to, using our data, answer the question.

Host: Right.

Igor Perisic: Most ended up by being from academia.

Host: What’s your overall goal with Economic Graph?

Igor Perisic: To create economic opportunity for every member of the global workforce. And that’s where I got to believe a bit more into the vision and mission of LinkedIn, to create that economic opportunity. And not only “an” economic opportunity, but the right one for you; the one that you aspire to. So, the way that we used the Economic Graph, on one side, it still needs to be descriptive, because it provides the value of the labor market. It provides an image, whether it’s instantaneous, or whether it’s in the past. It provides a sense of, “What are people doing in the labor market, and what are the movements within it?” It’s a living entity, and we provide a picture of it. And there’s tremendous value to it, because the timelines are very quick. You don’t have to wait 6 months, or 3 months, to get a government statistic, but it provides a good picture, so then that is extremely valuable.

Host: Yeah.

Igor Perisic: The other is, to understand and see that careers are also living, they migrate, they shift from one domain to another. You have multiple careers in your lifetime. And understanding what it takes and what it doesn’t take, and helping all members to actually build up their careers. So, that’s where the Economic Graph is also moving, to figure it out, how do I share that information with you? How do I help you nurture your career? And in the end, well, that’s to create economic opportunity and get you along your life cycle. Everybody, I believe, starts a job thinking that it’s going to be the job they’re going to do forever. I started LinkedIn thinking that, “Ehh, 2 to 3 years of my life,” and 10 years later, I’m still here. But… and actually, I’m still here because of the vision and the belief of the ability that we can move it forward. But once you think in that route, you’re seeing the fluidity of the labor market as being something that is important. And I wasn’t an economist before coming to LinkedIn, whatsoever. But there’s some very fascinating mind-frame around it, or frameworks to think about it.


Host: You’re a researcher. You’ve ended up parlaying a research interest into a career. What advice would you give researchers who were heading into the field of maybe data science, or even more generally, as they prepare for their next steps after grad school?

Igor Perisic: The main one, be curious. Continue to be curious. There’s a sense of, if you went into getting a PhD, you’re wanting to solve a problem, there’s something that was missing, you went at it, and you dig in, and you attempted to move the answer a bit further along… that’s great. It’s certainly one of the best experiences. On the other hand, make sure that once that happens, don’t just limit the focus. I think that’s, at times, what researchers are missing, that, yes, it’s valuable to dig in, very much, but be aware of the rest, and make connections from that to maybe better what you’re doing, see different paradigms, different perspectives. I was listening to Fabiola Gianotti just recently, who is the head of the CERN. She studied first humanities before moving into physics. And I view myself as a scientist also. And the things that I value more, that I wish I had more going to school, is actually humanities. Not math or the rest, because I feel like, no, I get these anyway. So, that’s a given. But the rest is missing. And if you think about it this way, today at LinkedIn, I’m trying to understand, from a member’s perspective, what are the struggles that they’re going through in order to make their career better? And you get that more by this side exposure, by this interest. Make these connections. And these connections have to be outside of just your domain. If I had one encouragement, it’s keep the wonder. Just go for it. Just keep the wonder and be willing to learn different topics. And challenge yourself a lot.

Host: Igor Perisic, thank you so much. It’s been a delight to talk to you today.

Igor Perisic: You’re welcome very much. Thank you.

To learn more about how researchers are harnessing and using data to make the digital world – and your experience with it – better, visit Microsoft.com/research