Microsoft Research Blog

Total Recall: How to Have It All

October 26, 2009 | Posted by Microsoft Research Blog

By Rob Knies, Managing Editor, Microsoft Research

In September, pioneering computer-science researcher Gordon Bell and his Microsoft Research colleague Jim Gemmell published Total Recall: How the E-Memory Revolution Will Change Everything, a book that summarizes nearly a decade of an effort to record digitally everything in Bell’s life: What he did, what he saw, what he read, some of what he ate, what he felt—his entire life experience. The book is written in Bell’s voice, but the publication, like the MyLifeBits project that generated it, is a true collaboration. Bell and Gemmell recently found time to discuss the genesis of the research—and its implications for us all.

Q: How is the book being received?

Gemmell: We’re getting a really good reception. We’ve heard from entrepreneurs who are inspired by it and from typical people who are excited and wanting to get started. We get every kind of reaction, from people who are scared of the concept, usually because they don’t truly understand it, to people who are really excited.

Q: Gordon, what do you recall as the first germ of the idea that eventually led to MyLifeBits?

Gordon Bell (left) and Jim Gemmell

Bell: It started from three points of view. Raj Reddy [of Carnegie Mellon University] asked for my books so he could scan them as part of his Million Book Project. Meanwhile, I wanted to scan and get rid of all my paper; I’ve had a long desire to have everything online. And [Microsoft Chairman] Bill Gates stimulated the idea by saying that, someday, you’ll be able to record everything you see and hear.

I published a paper on storing everything in January of 2001. In a way, I thought I was done. I was saving everything, but it became clear that the problem was a search problem. You could save everything, and there was a huge value in saving everything, but the big problem was organizing and then searching—in general, trying to provide value to the content.

At that point, Jim Gray and Jim Gemmell said, “Let’s create a serious database and work on the problem of organization and search.” Roger Lueder came in to work on the database and to write code to make it work.

Gemmell: Until that point, Gordon had been trying to capture everything he had already. Then, we started to make a concerted effort to capture more, to write software to record every Web page visited, to set up a telephone recording system in his office. We set up TV and radio recording—all these tools to capture even more of someone’s life.

Bell: In 2003, what really kicked the project into gear was Lyndsay Williams’ SenseCam. We got one of the first ones, and we incorporated it into the recording. Now, we had an icon. It’s a mixed blessing, because we don’t record our daily activities with this camera, but that’s what grabbed everybody’s imagination.

When you’ve got a project, it’s important to have a good name and to have something that’s going to grab everybody. The MyLifeBits name was good; that got us a nice article in New Scientist. Then, in 2003, we got a further boost from the SenseCam. People said, “That’s what it means to record everything you see,” even though we don’t really advocate doing that.

Q: Jim, what prompted you to get involved?

Gemmell: Gordon and I were working closely together on telepresence. His going paperless was partly an attempt to be an efficient teleworker. He wanted his files available to him wherever he may be.

Gordon was beginning to scan his photographs, and he said, “I don’t even remember what half these things are, and unless they’re labeled, I can’t really do anything with them.” I said, “Why don’t we try to write some software?”

Here, I had a factory that was producing data—collecting, scanning, and committed to capturing everything. That’s what made it interesting. It’s the corpus you get that starts to make the whole project interesting. This is the epitome of problem-driven research—an approach that both Jim Gray and Gordon advocate.

Q: What’s a real-life scenario in which the results of this project have proved invaluable to you?

Gemmell: What we’ve done is a small foretaste of what’s eventually going to be possible. But something as simple as writing a control that would record every Web page that we visited … It became something you come to rely on. If I’ve ever seen a Web page, I’ve got a copy of it, and I can find it again quickly because I only have to search my own corpus, not the entire Web. At first, it just seemed like something fun to do, but over time, it became something you presume is part of your technological arsenal.
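As a rough illustration of why searching a personal corpus is quick, the idea can be sketched as a small inverted index over saved pages. This is not the MyLifeBits implementation, which the interview does not describe; the `PageArchive` class, its method names, and the example URLs are all hypothetical.

```python
import re
from collections import defaultdict


class PageArchive:
    """Toy personal archive (hypothetical): keeps a copy of each visited
    page and an inverted index from words to the pages containing them."""

    def __init__(self):
        self.pages = {}                # url -> saved page text
        self.index = defaultdict(set)  # word -> set of urls

    def record(self, url, text):
        """Save a copy of a visited page and index its words.

        Assumes each URL is recorded once; re-recording a URL would
        leave stale index entries in this simplified sketch.
        """
        self.pages[url] = text
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(url)

    def search(self, query):
        """Return the sorted URLs of saved pages containing every query word."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set(self.index.get(words[0], set()))
        for word in words[1:]:
            hits &= self.index.get(word, set())
        return sorted(hits)


# Usage with made-up pages:
archive = PageArchive()
archive.record("http://example.com/a", "SenseCam wearable camera")
archive.record("http://example.com/b", "Wearable health devices")
print(archive.search("wearable camera"))  # ['http://example.com/a']
```

Because the index covers only pages the owner has actually seen, a lookup touches a small set of documents rather than the whole Web.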

Gordon and I are treated in a different way by people because we have these electronic memories. People come to us saying, “Don’t you have such and such?” Just the other day, I had my son saying, “Do you have a copy of that essay I wrote several years ago?” And I said, “Yes, of course.” Within about a minute, I was able to e-mail that to him.

Bell: Having everything gives you a feeling of security. You almost feel smug that you’ve got it all, you’ve got your life with you, you’ve got a record of things. The other day, I had a bunch of receipts that came in, so I took a picture of all of them and threw them away. I submitted an expense report, and later I was asked for a receipt, so I got that picture and sent it away.

On the health side, you get something important, whether it’s an explanation of benefits, or immunizations, or allergies, or a hospital discharge. I once had to go into a hospital that didn’t know me, and I had a lot of information that was critical. Everyone should go ahead and have this health information available when they travel, and Microsoft’s HealthVault is a fine place to keep it.

Gemmell: People today have health records strewn around, at the general practitioner, at the lab, at the chiropractor, at the dentist—all over the place. In the future, you’ll have all that information at your fingertips. You’ll have a quantitative record of your health. You’ll have your blood pressure and your temperature, and, eventually, you’ll have in-body sensors knowing what’s going on with your blood. Our health data will be increasingly captured with all sorts of instrumentation. That will be a huge change.

Bell: I encourage all my grandchildren to retain everything they do, all their homework, and take pictures of their artwork. The whole family is now geared to this, that there will be a time when this information will be useful. It certainly doesn’t cost anything. The cost to keep an essay or to photograph a piece of artwork or take a little movie is so small that if there’s any value at all, it’s worth it.

When you’ve done this over a long time, you basically see that your enemy is a little scrap of paper. You’ll write something down—a phone number, a name—and the chances are that if you ever do that, it’s going to be used again. You might as well tell the computer, because you will need it later on. I went paperless in 2002, and, at that point, I had maybe a foot of stuff a year that I scanned. Now, it’s down to three or four inches a year, because more information is coming in electronically, such as bills.

Q: How much time does it take to maintain your database?

Gemmell: About the only time I spend is scanning paper. A lot of the research that we’ve done or that we’ve encouraged is in automatic methods of capturing information. There is so much that it’s got to be automatic.

Bell: Our rule is that the computer has to understand what we’re doing, so we give it as much information as we can. That means labeling photographs. Increasingly, though, computers have good face-recognition software. The computer will understand more, and that makes it a lot easier and much more useful.

Gemmell: When Gordon was scanning his old photos, he had to label them to get any value. But if he has a new photo and he was using his GPS, that’s both location- and time-stamped. Between the time and the location and what’s on his calendar—and with face recognition getting better—the amount of work has gone way down.
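The combination of time stamp, location, and calendar that Gemmell describes can be sketched in a few lines. This is only an illustration under stated assumptions: the `Photo` and `CalendarEvent` types and the `suggest_labels` function are hypothetical, the place name is assumed to have already been derived from GPS coordinates, and face recognition is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class CalendarEvent:
    title: str
    start: datetime
    end: datetime


@dataclass
class Photo:
    filename: str
    taken_at: datetime
    place: Optional[str] = None  # assumed to come from a GPS lookup


def suggest_labels(photo: Photo, events: List[CalendarEvent]) -> List[str]:
    """Propose labels from calendar events that overlap the photo's
    time stamp, followed by the place name if one is known."""
    labels = [e.title for e in events if e.start <= photo.taken_at <= e.end]
    if photo.place:
        labels.append(photo.place)
    return labels


# Usage with made-up data:
events = [
    CalendarEvent("Board meeting", datetime(2009, 10, 1, 9), datetime(2009, 10, 1, 10)),
    CalendarEvent("Family picnic", datetime(2009, 10, 3, 12), datetime(2009, 10, 3, 15)),
]
photo = Photo("img_001.jpg", datetime(2009, 10, 3, 13, 30), place="Golden Gate Park")
print(suggest_labels(photo, events))  # ['Family picnic', 'Golden Gate Park']
```

A photo with no overlapping event and no location simply gets no suggestions, which is the case where manual labeling is still needed.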

Bell: Our feeling from the start has been that if it’s going to take any time, people are not going to use it. Whatever you do has to almost immediately pay for itself in terms of time savings.

Q: Gordon, what event occurred before the project began that you wish you had been able to capture?

Bell: I had been pretty good about keeping things, but my regret is that I threw anything away. If I had known what I know now, I would have kept everything. I would have five to 10 times more stuff.

I was lucky in that I was head of engineering at Digital, and they let me have my files, so I have a lot of documents from the 12 years that I was head of engineering. That provided an interesting corpus. I don’t have all the e-mails I had while running engineering, which would be nice. I should have kept all my e-mails.

Q: How is your information shared?

Gemmell: We’re not life bloggers. We’re not interested in sharing all of our lives on the Web with the general public. We think that people who do that are pretty crazy, and the number of life-changing catastrophes of people appearing publicly on the Web is staggering. We are life loggers. It’s intensely private and personal for us. We’re very interested in research that makes things more safe, more secure, and private. We’re interested in pursuing the notion of a Swiss data bank that can have plausibly deniable data storage that’s secure and that makes it possible to deny things exist and keep it out of the reach of anybody but yourself.

Q: What would it take for the Total Recall concept to attain critical mass? What would that mean?

Gemmell: It’s going to attain critical mass. It’s inevitable. The train has left the station. In fact, we were concerned about not getting the book published fast enough. We have new startups in all kinds of areas related to Total Recall. Some of the things we did when we began, that were a real nuisance to do with software, today are standard-issue features, like desktop search and recording of instant-messenger conversations. The iPod Nano now includes video recording and a pedometer. GPS in cellphones is getting more common. There are already all kinds of wearable health devices. Things are already rolling in this direction, no question about it.

Bell: One of the most important things about having everything available is what it does for our lives. I can work anywhere; I just back up my computer.

Gemmell: In the future, it will be the exception to not be able to recall something. Certainly, everything you ever read—all your correspondence, books, and articles—you’ll be able to instantly search through. If I want to look at some topic, what are the things I’ve dealt with? It will be instantly available to you. That will have a huge impact on learning, people’s professional lives, and any hobby they’re interested in.

Bell: I downplay all the personal stuff—the movies, the photos—but a family is much more oriented to that. The ability to take photographs and music and videos and make productions of it is really important. We concentrate on the things we’re doing in our work lives, and, in a way, we sometimes neglect the more personal uses. Most families recognize them: better tools for logging your life, easier ways to show a timeline.

Gemmell: There’s software that exists in research to do automatic summarization of content and automatically come up with stories. More of that software coming out is critical to the momentum of Total Recall. Software will come out to automatically classify stuff. And then there is the ease of use of simple things like backup and replication. It’s too complicated to explain to my parents how to manage their information and keep it safe. It’s got to be a no-brainer.

Gordon, being our chief guinea pig in this, has gone through a lot of pain and hassle changing batteries, charging devices, and installing them. We’re very excited about Microsoft’s HealthVault, but some medical devices can be a pain. A lot of it is software, but some of it is hardware. As that gets better, as we improve usability and come up with these tools, then we go from the beginning that’s already here to the full fruition of the Total Recall vision.