{"id":602724,"date":"2019-08-21T07:56:58","date_gmt":"2019-08-21T14:56:58","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=602724"},"modified":"2022-11-07T12:31:13","modified_gmt":"2022-11-07T20:31:13","slug":"machine-reading-comprehension-with-dr-t-j-hazen","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/podcast\/machine-reading-comprehension-with-dr-t-j-hazen\/","title":{"rendered":"Machine reading comprehension with Dr. T.J. Hazen"},"content":{"rendered":"<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-602727 size-large\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-1024x576.png\" alt=\"Dr. TJ Hazen\" width=\"1024\" height=\"576\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/08\/TJ-Hazen_Podcast_Site_08_2019_1400x788.png 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<h3>Episode 86, August 21, 2019<\/h3>\n<p>The ability to read and understand unstructured text, and then answer questions about it, is a common skill among literate humans. But for machines? Not so much. At least not yet! And not if <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tihazen\/\">Dr. T.J. Hazen<\/a>, Senior Principal Research Manager in the Engineering and Applied Research group at <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-montreal\/\">MSR Montreal<\/a>, has a say. He\u2019s spent much of his career working on machine speech and language understanding, and particularly, of late, machine reading comprehension, or <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/machine-reading-comprehension\/\">MRC<\/a>.<\/p>\n<p>On today\u2019s podcast, Dr. 
<p>On today's podcast, Dr. Hazen talks about why reading comprehension is so hard for machines, gives us an inside look at the technical approaches applied researchers and their engineering colleagues are using to tackle the problem, and shares the story of how an a-ha moment with a Rubik's Cube inspired a career in computer science and a quest to teach computers to answer complex, text-based questions in the real world.</p>
<h3>Related:</h3>
<ul>
<li><a href="https://www.microsoft.com/en-us/research/podcast">Microsoft Research Podcast</a>: View more podcasts on Microsoft.com</li>
<li><a href="https://itunes.apple.com/us/podcast/microsoft-research-a-podcast/id1318021537?mt=2">iTunes</a>: Subscribe and listen to new podcasts each week on iTunes</li>
<li><a href="https://subscribebyemail.com/www.blubrry.com/feeds/microsoftresearch.xml">Email</a>: Subscribe and listen by email</li>
<li><a href="https://subscribeonandroid.com/www.blubrry.com/feeds/microsoftresearch.xml">Android</a>: Subscribe and listen on Android</li>
<li><a href="https://open.spotify.com/show/4ndjUXyL0hH1FXHgwIiTWU">Spotify</a>: Listen on Spotify</li>
<li><a href="https://www.blubrry.com/feeds/microsoftresearch.xml">RSS feed</a></li>
<li><a href="https://note.microsoft.com/ww-registration-microsoft-research-newsletter-s.html?wt.mc_id=S-webpage_podcast">Microsoft Research Newsletter</a>: Sign up to receive the latest news from Microsoft Research</li>
</ul>
<hr>
<h3>Transcript</h3>
<p>T.J. Hazen: Most of the questions are fact-based questions like, who did something, or when did something happen? And most of the answers are fairly easy to find. So, you know, doing as well as a human on a task is fantastic, but it only gets you part of the way there. What happened is, after this was announced that Microsoft had this great achievement in machine reading comprehension, lots of customers started coming to Microsoft saying, how can we have that for our company? And this is where we're focused right now. How can we make this technology work for real problems that our enterprise customers are bringing in?</p>
<p><strong>Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. 
I'm your host, Gretchen Huizinga.</strong></p>
<p><strong>Host: The ability to read and understand unstructured text, and then answer questions about it, is a common skill among literate humans. But for machines? Not so much. At least not yet! And not if Dr. T.J. Hazen, Senior Principal Research Manager in the Engineering and Applied Research group at MSR Montreal, has a say. He's spent much of his career working on machine speech and language understanding and, particularly of late, machine reading comprehension, or MRC.</strong></p>
<p><strong>On today's podcast, Dr. Hazen talks about why reading comprehension is so hard for machines, gives us an inside look at the technical approaches applied researchers and their engineering colleagues are using to tackle the problem, and shares the story of how an a-ha moment with a Rubik's Cube inspired a career in computer science and a quest to teach computers to answer complex, text-based questions in the real world. That and much more on this episode of the Microsoft Research Podcast.</strong></p>
<p><strong>(music plays)</strong></p>
<p><strong>Host: T.J. Hazen, welcome to the podcast!</strong></p>
<p>T.J. Hazen: Thanks for having me.</p>
<p><strong>Host: Researchers like to situate their research, and I like to situate my researchers, so let's get you situated. You are a Senior Principal Research Manager in the Engineering and Applied Research group at Microsoft Research in Montreal. Tell us what you do there. What are the big questions you're asking, what are the big problems you're trying to solve, what gets you up in the morning?</strong></p>
<p>T.J. Hazen: Well, I've spent my whole career working in speech and language understanding, and I think the primary goal of everything I do is to try to be able to answer questions. So, people have questions and we'd like the computer to be able to provide answers. So that's sort of the high-level goal, how do we go about answering questions? Now, answers can come from many places.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: A lot of the systems that you're probably aware of like Siri for example, or Cortana or Bing or Google, any of them…</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: …the answers typically come from structured places, databases that contain information, and for years these models have been built in a very domain-specific way. If you want to know the weather, somebody built a system to tell you about the weather.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: And somebody else might build a system to tell you about the age of your favorite celebrity and somebody else might have written a system to tell you about the sports scores, and each of them can be built to handle that very specific case. But that limits the range of questions you can ask because you have to curate all this data, you have to put it into structured form. And right now, what we're worried about is, how can you answer questions more generally, about anything? And the internet is a wealth of information. The internet has got tons and tons of documents on every topic, you know, in addition to the obvious ones like Wikipedia. If you go into any enterprise domain, you've got manuals about how their operation works. You've got policy documents. You've got financial reports. And it's not typical that all this information is going to be curated by somebody. It's just sitting there in text. So how can we answer any question about anything that's sitting in text? We don't have a million or five million or ten million librarians doing this for us…</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: …uhm, but the information is there, and we need a way to get at it.</p>
<p><strong>Host: Is that what you are working on?</strong></p>
<p>T.J. Hazen: Yes, that's exactly what we're working on. I think one of the difficulties with today's systems is, they seem really smart…</p>
<p><strong>Host: Right?</strong></p>
<p>T.J. Hazen: Sometimes. Sometimes they give you fantastically accurate answers. But then you can just ask a slightly different question and it can fall on its face.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: That's the real gap between what the models currently do, which is, you know, really good pattern matching some of the time, versus something that can actually understand what your question is and know when the answer that it's giving you is correct.</p>
<p><strong>Host: Let's talk a bit about your group, which, out of Montreal, is Engineering and Applied Research. And that's an interesting umbrella at Microsoft Research. You're technically doing fundamental research, but your focus is a little different from some of your pure research peers. How would you differentiate what you do from others in your field?</strong></p>
<p>T.J. Hazen: Well, I think there's two aspects to this. The first is that the lab up in Montreal was created as an offshoot of an acquisition. Microsoft bought Maluuba, which was a startup that was doing really incredible deep learning research, but at the same time they were a startup and they needed to make money. So, they also had this very talented engineering team in place to be able to take the research that they were doing in deep learning and apply it to problems where it could go into products for customers.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: When you think about that need that they had to actually build something, you could see why they had a strong engineering team.</p>
<p><strong>Host: Yeah.</strong></p>
<p>T.J. Hazen: Now, when I joined, I wasn't with them when they were a startup, I actually joined them from Azure where I was working with outside customers in the Azure Data Science Solution team, and I observed lots of problems that our customers have. And when I saw this new team that we had acquired and we had turned into a research lab in Montreal, I said I really want to be involved because they have exactly the type of technology that can solve customer problems and they have this engineering team in place that can actually deliver on turning a concept into something real.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So, I joined, and I had this agreement with my manager that we would focus on real problems. They were now part of the research environment at Microsoft, but I said that doesn't restrict us to thinking about blue sky, far-afield research. We can go and talk to product teams and say what are the real problems that are hindering your products, you know, what are the difficulties you have in actually making something real? And we could focus our research to try to solve those difficult problems. 
And if we're successful, then we have an immediate product that could be beneficial.</p>
<p><strong>Host: Well in any case, you're swimming someplace in a "we could do this immediately" but you have permission to take longer, or is there a mandate, as you live in this engineering and applied research group?</strong></p>
<p>T.J. Hazen: I think there's a mandate to solve hard problems. I think that's the mandate of research. If it wasn't a hard problem, then somebody…</p>
<p><strong>Host: …would already have a product.</strong></p>
<p>T.J. Hazen: …in the product team would already have a solution, right? So, we do want to tackle hard problems. But we also want to tackle real problems. That's, at least, the focus of our team. And there's plenty of people doing blue sky research and that's an absolute need as well. You know, we can't just be thinking one or two years ahead. Research should also be thinking five, ten, fifteen years ahead.</p>
<p><strong>Host: So, there's a whole spectrum there.</strong></p>
<p>T.J. Hazen: So, there's a spectrum. But there is a real need, I think, to fill that gap between taking an idea that works well in a lab and turning it into something that works well in practice for a real problem. And that's the key. And many of the problems that have been solved by Microsoft have not just been blue sky ideas, but they've come from this problem space where a real product says, ahh, we're struggling with this. So, it could be anything. It can be, like, how does Bing efficiently rank documents over billions of documents? You don't just solve that problem by thinking about it, you have to get dirty with the data, you have to understand what the real issues are. So, many of the research problems that we're focusing on come from that space. And we're focusing on, how do you answer questions out of documents when the questions could be arbitrary, and on any topic? And you've probably experienced this, if you are going into a search site for your company, that company typically doesn't have the advantage of having a big Bing infrastructure behind it that's collecting all this data and doing sophisticated machine learning. Sometimes it's really hard to find an answer to your question. And, you know, the tricks that people use can be creative and inventive but oftentimes, trying to figure out what the right keywords are to get you to an answer is not the right thing.</p>
<p><strong>Host: You work closely with engineers on the path from research to product. So how does your daily proximity to the people that reify your ideas as a researcher impact the way you view, and do, your work as a researcher?</strong></p>
<p>T.J. Hazen: Well, I think when you're working in this applied research and engineering space, as opposed to a pure research space, it really forces you to think about the practical implications of what you're building. How easy is it going to be for somebody else to use this? Is it efficient? Is it going to run at scale? All of these problems are problems that engineers care a lot about. And sometimes researchers just say, let me solve the problem first and everything else is just engineering. If you say that to an engineer, they'll be very frustrated because you don't want to bring something to an engineer that works ten times slower than it needs to, or uses ten times more memory. So, when you're in close proximity to engineers, you're thinking about these problems as you are developing your methods.</p>
<p><strong>Host: Interesting, because those two things, I mean, you could come up with a great idea that would do it and you pay a performance penalty in spades, right?</strong></p>
<p>T.J. Hazen: Yeah, yeah. So, sometimes it's necessary. Sometimes you don't know how to do it and you just say let me find a solution that works and then you spend ten years actually trying to figure out how to make it work in a real product.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: And I'd rather not spend that time. I'd rather think about, you know, how can I solve something and have it be effective as soon as possible?</p>
<p>(music plays)</p>
<p><strong>Host: Let's talk about <a href="https://www.microsoft.com/en-us/research/research-area/human-language-technologies/">human language technologies</a>. They've been referred to by some of your colleagues as "<a href="https://www.microsoft.com/en-us/research/blog/speech-and-language-the-crown-jewel-of-ai-with-dr-xuedong-huang/">the crown jewel of AI</a>." Speech and language comprehension is still a really hard problem. Give us a lay of the land, both in the field in general and at Microsoft Research specifically. What's hope and what's hype, and what are the common misconceptions that run alongside the remarkable strides you actually are making?</strong></p>
<p>T.J. Hazen: I think that word we mentioned already: understand. That's really the key of it. Or comprehend is another way to say it. What we've developed doesn't really understand, at least when we're talking about general purpose AI. So, the deep learning mechanisms that people are working on right now can learn really sophisticated things from examples. They do an incredible job of learning specific tasks, but they really don't understand what they're learning.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So, they can discover complex patterns that can associate things. So, in the vision domain, you know, if you're trying to identify objects, and then you go in and see what the deep learning algorithm has learned, it might have learned features that are like, uh, you know, if you're trying to identify a dog, it learns features that would say, oh, this is part of a leg, or this is part of an ear, or this is part of the nose, or this is the tail. It doesn't know what these things are, but it knows they all go together. And the combination of them will make a dog. And it doesn't know what a dog is either. But the idea that you could just feed data in and you give it some labels, and it figures everything else out about how to associate that label with that, that's really impressive learning, okay? But it's not understanding. It's just really sophisticated pattern-matching. And the same is true in language. We've gotten to the point where we can answer general-purpose questions and it can go and find the answer out of a piece of text, and it can do it really well in some cases, and like, some of the examples we'll give it, we'll give it "who" questions and it learns that "who" questions should contain proper names or names of organizations. And "when" questions should express concepts of time. 
It doesn't know anything about what time is, but it's figured out the patterns about, how can I relate a question like "when" to an answer that contains a time expression? And that's all done automatically. There are no features where somebody sits down and says, oh, this is a month and a month means this, and this is a year, and a year means this. And a month is a part of a year. Expert AI systems of the past would do this. They would create ontologies and they would describe things about how things are related to each other and they would write rules. And within limited domains, they would work really, really well if you stayed within a nice, tightly constrained part of that domain. But as soon as you went out and asked something else, it would fall on its face. And so, we can't really generalize that way efficiently. If we want computers to be able to learn arbitrarily, we can't have a human behind the scenes creating an ontology for everything. That's the difference between understanding and crafting relationships and hierarchies versus learning from scratch. We've gotten to the point now where the algorithms can learn all these sophisticated things, but they really don't understand the relationships the way that humans understand them.</p>
<p><strong>Host: Go back to the, sort of, the lay of the land, and how I sharpened that by saying, what's hope and what's hype? Could you give us a "TBH" answer?</strong></p>
<p>T.J. Hazen: Well, what's hope is that we can actually find reasonable answers to an extremely wide range of questions. What's hype is that the computer will actually understand, at some deep and meaningful level, what this answer actually means. I do think that we're going to grow our understanding of algorithms and we're going to figure out ways that we can build algorithms that could learn more about relationships and learn more about reasoning, learn more about common sense, but right now, they're just not at that level of sophistication yet.</p>
<p><strong>Host: All right. Well let's do the podcast version of your NERD Lunch and Learn. Tell us what you are working on in machine reading comprehension, or MRC, and what contributions you are making to the field right now.</strong></p>
<p>T.J. Hazen: You know, NERD is short for <a href="https://www.microsoft.com/en-us/research/lab/microsoft-research-new-england/">New England Research and Development Center</a>…</p>
<p><strong>Host: I did not!</strong></p>
<p>T.J. Hazen: …which is where I physically work.</p>
<p><strong>Host: Okay…</strong></p>
<p>T.J. Hazen: Even though I work closely and am affiliated with the Montreal lab, I work out of the lab in Cambridge, Massachusetts, and NERD has a weekly Lunch and Learn where people present the work they're doing, or the research that they're working on, and at one of these Lunch and Learns, I gave this talk on machine reading comprehension. Machine reading comprehension, in its simplest version, is being able to take a question and then being able to find the answer anywhere in some collection of text. As we've already mentioned, it's not really "comprehending" at this point, it's more just very sophisticated pattern-matching. But it works really well in many circumstances.</p>
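<p>To make that concrete, here is a minimal sketch of extractive question answering using the open-source Hugging Face transformers library. This is an illustrative stand-in, not the systems discussed in this episode, and the passage text is invented for the example.</p>
<pre><code># Minimal extractive machine reading comprehension sketch.
# Assumes: pip install transformers  (plus a backend such as PyTorch).
# Illustrative only; not the production systems discussed in the episode.
from transformers import pipeline

qa = pipeline("question-answering")  # downloads a pretrained extractive QA model

context = (
    "Machine reading comprehension systems take a question and try to find "
    "the span of text in a document that answers it. They are trained on "
    "large collections of question and answer examples."
)

result = qa(question="What do machine reading comprehension systems do?",
            context=context)
print(result["answer"], result["score"])  # answer span plus a confidence score
</code></pre>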
<p>T.J. Hazen: And even on tasks like the Stanford Question Answering Dataset – a common competition that people have competed in – question answering by computer has achieved human parity on that task.</p>
<p><strong>Host: Mm-hmm.</strong></p>
<p>T.J. Hazen: Okay. But that task itself is somewhat simple because most of the questions are fact-based questions like, who did something or when did something happen? And most of the answers are fairly easy to find. So, you know, doing as well as a human on a task is fantastic, but it only gets you part of the way there. What happened is, after this was announced that Microsoft had this great achievement in machine reading comprehension, lots of customers started coming to Microsoft saying, how can we have that for our company? And this is where we're focused right now. Like, how can we make this technology work for real problems that our enterprise customers are bringing in? So, we have customers coming in saying, I want to be able to answer any question in our financial policies, or our auditing guidelines, or our operations manual. And people don't ask "who" or "when" questions of their operations manual. They ask questions like, how do I do something? Or explain some process to me. And those answers are completely different. They tend to be longer and more complex and you don't always, necessarily, find a short, simple answer that's well situated in some context.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So, our focus at MSR Montreal is to take this machine reading comprehension technology and apply it into these new areas where our customers are really expressing that there's a need.</p>
<p><strong>Host: Well, let's go a little deeper, technically, on what it takes to enable or teach machines to answer questions, and this is key, with limited data. That's part of your equation, right?</strong></p>
<p>T.J. Hazen: Right, right. So, when we go to a new task, uh, so if a company comes to us and says, oh, here's our operations manual, they often have this expectation, because we've achieved human parity on some dataset, that we can answer any question out of that manual. But when we test the general-purpose models that have been trained on these other tasks on these manuals, they don't generally work well. And these models have been trained on hundreds of thousands, if not millions, of examples, depending on what datasets you've been using. And it's not reasonable to ask a company to collect that level of data in order to be able to answer questions about their operations manual. But we need something. We need some examples of what are the types of questions, because we have to understand what types of questions they ask, we need to understand the vocabulary. We'll try to learn what we can from the manual itself. But without some examples, we don't really understand how to answer questions in these new domains. But there are techniques available for what we refer to as model adaptation: how do you learn from data in some new domain and take an existing model and make it adapt to that domain? We call that transfer learning. And we can actually use transfer learning to do really well in a new domain without requiring a ton of data. So, our goal is for that to be hundreds of examples, not tens of thousands of examples.</p>
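<p>As a rough illustration of that kind of transfer learning, the sketch below takes a QA model already trained on a large dataset, freezes most of its pretrained parameters, and fine-tunes only the small answer-prediction head on a handful of in-domain examples. The model name, freezing strategy, and hyperparameters are assumptions for illustration, not the team's actual recipe.</p>
<pre><code># Sketch: adapt a pretrained extractive QA model to a new domain.
# Assumptions: the model choice, freezing strategy, and learning rate
# are illustrative, not the recipe described in the episode.
import torch
from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained(
    "distilbert-base-uncased-distilled-squad"  # already trained on SQuAD
)

# Freeze the large pretrained encoder so its generalizations stay in place
# and hundreds of millions of parameters are not re-estimated from a few
# hundred in-domain examples...
for param in model.base_model.parameters():
    param.requires_grad = False

# ...then fine-tune only the small answer-span head on the new domain.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=3e-5)
# (The training loop over the few hundred domain examples would go here.)
</code></pre>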
<p><strong>Host: How's that working now?</strong></p>
<p>T.J. Hazen: It works surprisingly well. I'm always amazed at how well these machine learning algorithms work with all the techniques that are available now. These models are very complex. When we're talking about our question answering model, it has hundreds of millions of parameters and what you're talking about is trying to adjust a model that is hundreds of millions of parameters with only hundreds of examples and, through a variety of different techniques where we can avoid what we call overfitting, we can allow the generalizations that are learned from all this other data to stay in place while still adapting it so it does well in this specific domain. So, yeah, I think we're doing quite well. We're still exploring, you know, what are the limits?</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: And we're still trying to figure out how to make it work so that an outside company can easily create the dataset, put the dataset into a system, push a button. The engineering for that and the research for that is still ongoing, but I think we're pretty close to being able to, you know, provide a solution for this type of problem.</p>
<p><strong>Host: All right. Well I'm going to push in technically because to me, it seems like that would be super hard for a machine. We keep referring to these techniques… Do we have to sign an NDA, as listeners?</strong></p>
<p>T.J. Hazen: No, no. I can explain stuff that's out…</p>
<p><strong>Host: Yeah, do!</strong></p>
<p>T.J. Hazen: …in the public domain. So, there are two common underlying technical components that make this work. One is called word embeddings and the other is called attention. Word embeddings are a mechanism where it learns how to take words or phrases and express them in what we call vector space.</p>
<p><strong>Host: Okay.</strong></p>
<p>T.J. Hazen: So, it turns them into a collection of numbers. And it does this by figuring out what types of words are similar to each other based on the context that they appear in, and then placing them together in this vector space, so they're nearby each other. So, we would learn that, let's say, city names are all similar because they appear in similar contexts. And so, therefore, Boston and New York and Montreal, they should all be close together in this vector space.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: And blue and red and yellow should be close together. And then advances were made to figure this out in context. So that was the next step, because some words have multiple meanings.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So, you know, if you have a word like apple, sometimes it refers to a fruit and it should be near orange and banana, but sometimes it refers to the company and it should be near Microsoft and Google. So, we've developed context-dependent ones, so that says, based on the context, I'll place this word into this vector space so it's close to the types of things that it really represents in that context.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: That's the first part. And you can learn these word embeddings from massive amounts of data.</p>
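<p>Here is a toy sketch of that first component. The vectors below are made up for illustration; real embeddings are learned from massive corpora and have hundreds of dimensions. Cosine similarity measures how close two words sit in the vector space.</p>
<pre><code># Toy word-embedding sketch; the 3-dimensional vectors are invented
# for illustration, not learned embeddings.
import numpy as np

embeddings = {
    "boston":   np.array([0.9, 0.1, 0.1]),
    "montreal": np.array([0.8, 0.2, 0.1]),  # city names cluster together
    "blue":     np.array([0.1, 0.9, 0.7]),
    "red":      np.array([0.2, 0.8, 0.8]),  # color names cluster together
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["boston"], embeddings["montreal"]))  # high: similar contexts
print(cosine(embeddings["boston"], embeddings["blue"]))      # lower: different contexts
</code></pre>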
<p>T.J. Hazen: So, we start off with a model that's learned on far more data than we actually have question and answer data for. The second part is called attention and that's how you associate things together. And it's the attention mechanisms that learn things like a word like "who" has to attend to words like person names or company names. And a word like "when" has to attend to…</p>
<p><strong>Host: Time.</strong></p>
<p>T.J. Hazen: …time. And those associations are learned through this attention mechanism. And again, we can actually learn a lot of associations between things just from looking at raw text without actually having it annotated.</p>
<p><strong>Host: Mm-hmm.</strong></p>
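<p>A heavily simplified sketch of that second component: compatibility scores between a question word and each passage word are normalized with a softmax into attention weights. The scores below are fabricated for illustration; real models learn them from data.</p>
<pre><code># Minimal attention sketch. The compatibility scores are fabricated;
# in a real model they are learned from raw text.
import numpy as np

passage = ["the", "launch", "happened", "in", "1969"]

# Pretend learned scores between the question word "when" and each
# passage word (higher = more strongly associated with time).
scores = np.array([0.1, 0.5, 0.4, 0.3, 2.5])

weights = np.exp(scores) / np.exp(scores).sum()  # softmax
for word, weight in zip(passage, weights):
    print(f"{word:10s} {weight:.2f}")  # "1969" receives most of the attention
</code></pre>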
<p>T.J. Hazen: Once we've learned all that, we have a base, and that base tells us a lot about how language works. And then we just have to have it focus on the task, okay? So, depending on the task, we might have a small amount of data and we feed in examples in that small amount, but it takes advantage of all the stuff that it's learned about language from all this, you know, rich data that's out there on the web. And so that's how it can learn these associations even if you don't give it examples in your domain, but it's learned a lot of these associations from all the raw data.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: And so, that's the base, right? You've got this base of all this raw data and then you train a task-specific thing, like a question answering system, but even then, what we find is that, if we train a question answering system on basic facts, it doesn't always work well when you go to operation manuals or other things. So, then we have to have it adapt.</p>
<p><strong>Host: Sure.</strong></p>
<p>T.J. Hazen: But, like I said, that base is very helpful because it's already learned a lot of characteristics of language just by observing massive amounts of text.</p>
<p>(music plays)</p>
<p><strong>Host: I'd like you to predict the future. No pressure. What's on the horizon for machine reading comprehension research? What are the big challenges that lie ahead? I mean, we've sort of laid the land out on what we're doing now. What next?</strong></p>
<p>T.J. Hazen: Yeah. Well certainly, more complex questions. What we've been talking about so far is still fairly simple in the sense that you have a question, and we try to find passages of text that answer that question. But sometimes a question actually requires that you get multiple pieces of evidence from multiple places and you somehow synthesize them together. So, a simple example we call the multi-hop example. If I ask a question like, you know, where was Barack Obama's wife born? I have to figure out first, who is Barack Obama's wife? And then I have to figure out where she was born. And those pieces of information might be in two different places.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So that's what we call a multi-hop question. And then, sometimes, we have to do some operation on the data. So, you could say, you know like, what players, you know, from one Super Bowl team also played on another Super Bowl team? Well there, what you have to do is, you have to get the list of all the players from both teams and then you have to do an intersection between them to figure out which ones are the same on both. So that's an operation on the data…</p>
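<p>In code, that final step is just a set intersection; the open research problem is mapping the question text to that sequence of operations automatically. The player names below are placeholders, not real rosters.</p>
<pre><code># Sketch of the "operation on the data" step for a question like
# "which players played on both Super Bowl teams?" The names are
# placeholders; extracting the rosters from text is the hard part.
team_a = {"Player One", "Player Two", "Player Three"}
team_b = {"Player Two", "Player Three", "Player Four"}

print(team_a & team_b)  # set intersection: players on both teams
</code></pre>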
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: …and you can imagine that there's lots of questions like that where the information is there, but it's not enough to just show the person where the information is. You also would like to go a step further and actually do the computation for that. That's a step that we haven't done, like, how do you actually go from mapping text to text, and saying these two things are associated, to mapping text to some sequence of operations that will actually give you an exact answer. And, you know, it can be quite difficult. I can give you a very simple example. Like, just answering a question, yes or no, out of text, is not a solved problem. Let's say I have a question where someone says, I'm going to fly to London next week. Am I allowed to fly business class according to my company's policies, right? We can have a system that would be really good at finding the section of the policy that says, you know, if you are a VP-level or higher and you are flying overseas, you can fly business class, otherwise, no. Okay? But, you know, if we actually want the system to answer yes or no, we have to actually figure out all the details, like okay, who's asking the question? Are they a VP? Where are they located? Oh, they're in New York. What does flying overseas mean?</p>
<p><strong>Host: Right. There are layers.</strong></p>
<p>T.J. Hazen: Right. So that type of comprehension, you know, we're not quite there yet for all types of questions. Usually these things have to be crafted by hand for specific domains. So, all of these things about how can you answer complex questions, and even simple things like common sense, like, things that we all know… Um. And so, my manager, Andrew McNamara – he was supposed to be here with us – one of his favorite examples is this concept of coffee being black. But if you spill coffee on your shirt, do you have a black stain on your shirt? No, you've got a brown stain on your shirt. And that's just common knowledge. That is, you know, a common-sense thing that computers may not understand.</p>
<p><strong>Host: You're working on research, and ultimately products or product features, that make people think they can talk to their machines and that their machines can understand and talk back to them. So, is there anything you find disturbing about this? Anything that keeps you up at night? And if so, how are you dealing with it?</strong></p>
<p>T.J. Hazen: Well, I'm certainly not worried about the fact that people can ask questions of the computer and the computer can give them answers. What I'm trying to get at is something that's helpful and can help you solve tasks. In terms of the work that we do, yeah, there are actually issues that concern me. So, one of the big ones is, even if a computer can say, oh, I found a good answer for you, here's the answer, it doesn't know anything about whether that answer is true. If you go and ask your computer, was the Holocaust real? and it finds an article on the web that says no, the Holocaust was a hoax, do I want my computer to show that answer? No, I don't. But…</p>
<p><strong>Host: Or the moon landing…!</strong></p>
<p>T.J. Hazen: …if all you are doing is teaching the computer about word associations, it might think that's a perfectly reasonable answer without actually knowing that this is a horrible answer to be showing. So yeah, the moon landing, vaccinations… And people can defame people so easily on the internet that, you know, even if you ask a question that might seem like a fact-based question, you can get vast differences of opinion on this and you can get extremely biased and untrue answers. And how does a computer actually understand that some of these things are not things that we should represent as truth, right? Especially if your goal is to find a truthful answer to a question.</p>
<p><strong>Host: All right. So, then what do we do about that? And by we, I mean you!</strong></p>
<p>T.J. Hazen: Well, I have been working on this problem a little bit with the Bing team. And one of the things that we discovered is that if you can determine that a question is phrased in a derogatory way, that usually means the search results that you're going to get back are probably going to be phrased in a derogatory way. So, even if we don't understand the answer, we can just be very careful about what types of questions we actually want to answer.</p>
<p><strong>Host: Well, what does the world look like if you are wildly successful?</strong></p>
<p>T.J. Hazen: I want the systems that we build to just make life easier for people. If you have an information task, the world is successful if you get that piece of information and you don't have to work too hard to get it. We call it task completion. If you have to struggle to find an answer, then we're not successful. But if you can ask a question, and we can get you the answer, and you go, yeah, that's the answer, that's success to me. And we'll be wildly successful if the types of things where that happens become more and more complex. You know, where if someone can start asking questions where you are synthesizing data and computing answers from multiple pieces of information, for me, that's the wildly successful part. And we're not there yet with what we're going to deliver into product, but it's on the research horizon. It will be incremental. It's not going to happen all at once. But I can see it coming, and hopefully by the time I retire, I can see significant progress in that direction.</p>
<p><strong>Host: Off script a little… will I be talking to my computer, my phone, a HoloLens? Who am I asking? Where am I asking? What device? Is that so "out there" as well?</strong></p>
<p>T.J. Hazen: Uh, yeah, I don't know how to think about where devices are going. You know, when I was a kid, I watched the original Star Trek, you know, and everything on there, it seemed like a wildly futuristic thing, you know? And then fifteen, twenty years later, everybody's got their own little "communicator."</p>
<p><strong>Host: Oh my gosh.</strong></p>
<p>T.J. Hazen: And so, uh, you know, the fact that we're now beyond where Star Trek predicted we would be, you know, that itself, is impressive to me. So, I don't want to speculate where the devices are going. But I do think that this ability to answer questions, it's going to get better and better. We're going to be more interconnected. We're going to have more access to data. The range of things that computers will be able to answer is going to continue to expand. And I'm not quite sure exactly what it looks like in the future, to be honest, but, you know, I know it's going to get better and easier to get information. I'm a little less worried about, you know, what the form factor is going to be. I'm more worried about how I'm going to actually answer questions reliably.</p>
<p><strong>Host: Well it's story time. Tell us a little bit about yourself, your life, your path to MSR. How did you get interested in computer science research, and how did you land where you are now, working from Microsoft Research in New England for Montreal?</strong></p>
<p>T.J. Hazen: Right. Well, I've never been one to long-term plan for things. I've always gone from what I find interesting to the next thing I find interesting. I never had a really serious, long-term goal. I didn't wake up some morning when I was seven and say, oh, I want to be a Principal Research Manager at Microsoft in my future! I didn't even know what Microsoft was when I was seven. I went to college and I just knew I wanted to study computers. I didn't know really what that meant at the time, it just seemed really cool.</p>
<p><strong>Host: Yeah.</strong></p>
<p>T.J. Hazen: I had an Apple II when I was a kid and I learned how to do some basic programming. And then, you know, I was going through my course work and, in my junior year, I was taking a course in audio signal processing and in the course of that class, we got into a discussion about speech recognition, which to me was, again, it was Star Trek. It was something I saw on TV. Of course, now it was Next Generation…!</p>
<p><strong>Host: Right!</strong></p>
<p>T.J. Hazen: But you know, you watch the next generation of Star Trek and they're talking to the computer and the computer is giving them answers, and here somebody is telling me, you know, there's this guy over in the lab for computer science, Victor Zue, and he's building systems that recognize speech and give answers to questions! And to me, that was science fiction. So, I went over and asked the guy, you know, I heard you're building a system, and can I do my bachelor's thesis on this? And he gave me a demo of the system – it was called Voyager – and he asked a question, I don't remember the exact question, but it was probably something like, show me a map of Harvard Square. And the system starts chugging along and it's showing results on the screen as it's going. And it literally took about two minutes for it to process the whole thing. It was long enough that he actually explained to me how the entire system worked while it was processing. But then it came back, and it popped up a map of Harvard Square on the screen. And I was like, ohhh my gosh, this is so cool, I have to do this! So, I did my bachelor's thesis with him and then I stayed on for graduate school. And by seven years later, we had a system that was running in real time. We had a publicly available system in 1997 that you could call up on a toll-free number and you could ask for weather reports and weather information for anywhere in the United States. And so, the idea that it went from something that was "Star Trek" to something that I could pick up my phone, call a number and, you know, show my parents, this is what I'm working on, it was astonishing how fast that developed! I stayed on in that field with that research group. 
I was at MIT for another fifteen years after I graduated. At some point, a lot of the things that we were doing, they moved from the research lab to actually being real.</p>
<p><strong>Host: Right.</strong></p>
<p>T.J. Hazen: So, like twenty years after I went and asked to do my bachelor's thesis, Siri comes out, okay? And so that was our goal. They were like, twenty years ago, we should be able to have a device where you can talk to it and it gives you answers, and twenty years later there it was. So, that, for me, that was a cue that maybe it's time to go where the action is, which was in companies that were building these things. Once you have a large company like Microsoft or Google throwing their resources behind these hard problems, then you can't compete when you're in academia for that space. You know, you have to move on to something harder and more far out. But I still really enjoyed it. So, I joined Microsoft to work on Cortana…</p>
<p><strong>Host: Okay…</strong></p>
<p>T.J. Hazen: …when we were building the first version of Cortana. And I spent a few years working on that. I've worked on some Bing products. I then spent some time in Azure trying to transfer these things so that companies that had similar types of problems could solve their problems on Azure with our technology.</p>
<p><strong>Host: And then we come full circle to…</strong></p>
<p>T.J. Hazen: Then full circle, yeah. You know, once I realized that some of the stuff that customers were asking for wasn't quite ready yet, I said, let me go back to research and see if I can improve that. It's fantastic to see something through all the way to product, but once you're successful and you have something in a product, it's nice to then say, okay, what's the next hard problem? And then start over and work on the next hard problem.</p>
<p><strong>Host: Before we wrap up, tell us one interesting thing about yourself, maybe it's a trait, a characteristic, a life event, a side quest, whatever… that people might not know, or be able to find on a basic web search, that's influenced your career as a researcher?</strong></p>
<p>T.J. Hazen: Okay. You know, when I was a kid, maybe about eleven years old, the Rubik's Cube came out. And I got fascinated with it. And I wanted to learn how to solve it. And a kid down the street from my cousin had taught himself from a book how to solve it. And he taught me. His name was Jonathan Cheyer. And he was actually in the first national speed Rubik's Cube solving competition. It was on this TV show, That's Incredible. I don't know if you remember that TV show.</p>
<p><strong>Host: I do.</strong></p>
<p>T.J. Hazen: It turned out what he did was, he had learned what is now known as the simple solution. And I learned it from him. And I didn't realize it until many years later, but what I learned was an algorithm. I learned, you know, a sequence of steps to solve a problem. And once I got into computer science, I discovered that all that problem-solving I was doing with the Rubik's Cube – figuring out what are the steps to solve a problem – is essentially what things like machine learning are doing. What are the steps to figure out, what are the features of something, what are the steps I have to do to solve the problem? I didn't realize that at the time, but the idea of being able to break down a hard problem like solving a Rubik's Cube, and figuring out what are the stages to get you there, is interesting. Now, here's the interesting fact. So, Jonathan Cheyer, his older brother is Adam Cheyer. Adam Cheyer is one of the co-founders of Siri.</p>
<p><strong>Host: Oh my gosh. Are you kidding me?</strong></p>
<p>T.J. Hazen: So, I met the kid when I was young, and we didn't really stay in touch. I discovered, you know, many years later that Adam Cheyer was actually the older brother of this kid who taught me the Rubik's Cube years and years earlier, and Jonathan ended up at Siri also. So, it's an interesting coincidence that we ended up working in the same field after all those years from this Rubik's Cube connection!</p>
<p><strong>Host: You see, this is my favorite question now because I'm getting the broadest spectrum of little things that influenced and triggered something…!</strong></p>
<p><strong>Host: At the end of every podcast, I give my guests a chance for the proverbial last word. Here's your chance to say anything you want to would-be researchers, both applied and otherwise, who might be interested in working on machine reading comprehension for real-world applications.</strong></p>
<p>T.J. Hazen: Well, I could say all the things that you would expect me to say, like you should learn about deep learning algorithms and you should possibly learn Python because that's what everybody is using these days, but I think the single most important thing that I could tell anybody who wants to get into a field like this is that you need to explore it and you need to figure out how it works and do something in depth. Don't just get some instruction set or some high-level overview on the internet, run it on your computer and then say, oh, I think I understand this. Like, get into the nitty-gritty of it. Become an expert. And the other thing I could say is, of all the people I've met who are extremely successful, the thing that sets them apart isn't so much, you know, what they learned, it's the initiative that they took. So, if you see a problem, try to fix it. If you see a problem, try to find a solution for it. And I say this to people who work for me. If you really want to have an impact, don't just do what I tell you to do, but explore, think outside the box. Try different things. OK? I'm not going to have the answer to everything, so if you're only doing what I'm telling you to do, then we both, together, aren't going to have the answer. But if you explore things on your own and take the initiative and try to figure out something, that's the best way to really be successful.</p>
<p><strong>Host: T.J. Hazen, thanks for coming in today, all the way from the East Coast to talk to us. It's been delightful.</strong></p>
<p>T.J. Hazen: Thank you. It's been a pleasure.</p>
<p>(music plays)</p>
<p>To learn more about Dr. T.J. Hazen and how researchers and engineers are teaching machines to answer complicated questions, visit <a href="https://www.microsoft.com/en-us/research/">Microsoft.com/research</a>.</p>