TechFest 2007


Microsoft Research TechFest provides a strategic forum for Microsoft researchers to connect with the broader group of Microsoft employees and product managers. Hundreds of researchers from Microsoft’s worldwide labs in China, England, India and the US gather for the annual event at the company’s corporate headquarters in Redmond, Washington. They come together to exchange ideas with colleagues, show off their latest innovations, and shine a light into the future of computing.

Keynote Speaker: Rick Rashid, Senior Vice President, Microsoft Research.

Feature Stories

Providing Accurate, More Secure Information

By Rob Knies

For all the mind-boggling advances in information technology over the past couple of decades, much work remains, and a significant amount of it occurs behind the scenes, in areas nearly invisible to the casual computer user.

Some computer-science work gets lots of publicity; witness the explosion of interest in search technology in recent years. Social networking has been in the spotlight for quite a while now, and peer-to-peer networks continue to garner our attention. And then there are those world-conquering mobile phones.

Such technologies gain their high profiles because they are, in a real or virtual sense, tangible. You can hold a cellphone, and while popular Web sites and services might not be literally tangible, they certainly seem that way when they become woven into people’s day-to-day lives.

But not all computer scientists can—or choose to—labor under the media glare. There are plenty of challenging problems to go around: difficult, necessary work every bit as important to advancing the state of the art as their more flashy counterparts.

Want proof? How about computer security? Bet that got your attention. Oh, and managing vast amounts of data—encountered that one lately? If you’re like most of us, of course you have.

Those happen to be just two of the many hard problems being tackled by Microsoft Research personnel, and during TechFest, Microsoft Research’s annual showcase of leading-edge projects, being held March 7-8 in Redmond, rather ingenious solutions are proposed for each of these key challenges.

Database Distillations

Business-intelligence applications often need to match table entries that represent the same real-world entity, such as a customer name. Are Daniel Smith and Dan Smith the same person? Product databases pose similar scenarios; products may be labeled differently in various places, making reconciliation a must to the maintenance of reliable information.

Such record matching is vital to accurate data analysis but can be challenging to achieve. Matching logic might need to compare multiple portions of an entry—and their combinations—and might need to consider similarities. But Venky Ganti, a researcher in the Data Management, Exploration and Mining Group within Microsoft Research Redmond, is offering a potential solution.

“It is often much easier for a programmer to provide a set of example matching and non-matching record pairs,” Ganti says, “than it is to design an accurate query from scratch and work manually through the gamut of available choices. Assisting programmers by automatically creating accurate record-matching programs with the help of examples, which they can then review, edit, and execute, is of great help.”

Such example-based programs have been tried before but have failed to generate programs that proved scalable for large collections of relational data.

“This is where our technique shines,” Ganti says, “and thus helps take an important step in reducing the difficulty of data cleaning over large data warehouses.”

Ganti’s work offers two significant benefits:

  • Identifying a small number of flexible, efficiently executable building blocks that enable the creation of a rich class of highly scalable, accurate record-matching programs.
  • Efficient search through possible programs via exploitation of properties of these building blocks and the characteristics of the record-matching problem.

This approach, which uses SQL Server™ Integration Services as a platform, makes it easier to create initial record-matching programs, composed over a basic set of primitive operators, by specifying a set of examples. The resulting program then can be modified to meet the requirements of the scenario being addressed.
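The example-driven idea can be illustrated with a minimal sketch. It is not Ganti’s technique and does not use SQL Server Integration Services; it only shows how labeled matching and non-matching pairs can configure a matcher automatically. The character-bigram Jaccard measure, the function names, and the sample records are all assumptions made for this illustration.

```python
def bigrams(text):
    """Set of character bigrams of a lower-cased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + 2] for i in range(len(s) - 1)}


def similarity(a, b):
    """Jaccard similarity of the two strings' bigram sets (0.0 to 1.0)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def learn_threshold(examples):
    """Pick the similarity cutoff that classifies the most labeled pairs correctly.

    examples: list of (record_a, record_b, is_match) tuples from the programmer.
    """
    scored = [(similarity(a, b), is_match) for a, b, is_match in examples]
    best_cutoff, best_correct = 1.0, -1
    for cutoff in sorted({score for score, _ in scored}):
        correct = sum((score >= cutoff) == is_match for score, is_match in scored)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


def make_matcher(examples):
    """Return a function that decides whether two records match."""
    cutoff = learn_threshold(examples)
    return lambda a, b: similarity(a, b) >= cutoff


# Hypothetical labeled pairs; a real workload would supply many more.
EXAMPLES = [
    ("Daniel Smith", "Dan Smith", True),
    ("Microsoft Corp.", "Microsoft Corporation", True),
    ("Daniel Smith", "Maria Jones", False),
    ("Microsoft Corp.", "Contoso Ltd.", False),
]

matcher = make_matcher(EXAMPLES)
```

Calling `matcher("Daniel Smith", "Dan Smith")` then applies the learned cutoff to any pair of records. A single threshold on one similarity measure is far simpler than the composed operators the article describes, but it shows why examples are easier to provide than a hand-tuned query.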

“Our technology,” Ganti says, “helps significantly reduce the time required to develop scalable and accurate record-matching programs. In fact, we demonstrate that the programs generated using our technology are comparable in accuracy to handcrafted, domain-specific commercial technology.”

Fingerprint Protection

For some time now, computer security analysts have investigated the use of fingerprints as authentication tools, physical “passwords” unique to an individual that don’t have to be memorized or written down. It seems logical, and fingerprint readers are being used to provide access credentials in various business and home scenarios.

One problem, though: In the wrong hands—and, unfortunately, there are a lot of wrong hands out there—fingerprints used for authentication also could be used for nefarious purposes. And there you have it: Another promising, useful idea scuttled by the Web’s illicit element.

Perhaps not. Ramarathnam Venkatesan and Mariusz Jakubowski, principal researcher and senior researcher, respectively, for Microsoft Research Redmond’s Cryptography and Anti-Piracy Group, have devised a technique called Biometric Authentication via Fingerprint Hashing that offers the same benefits of unique, handy fingerprint “passwords” without the vulnerability attached.

“If we store the fingerprint on a computer,” Venkatesan says, “or pass it on a network or show it to someone, they can use the image of the fingerprint to misuse it. So the question is: Can we verify the fingerprint by using some data in a way in which that data does not disclose the fingerprint itself?”

His idea is to use fingerprint hashes, summaries of information contained in human fingerprints. The method calculates and aggregates various metrics over fingerprint images, producing short summaries that cannot be used to reconstruct source fingerprints absent a key to the metrics.

“We propose a way to represent fingerprints such that it is not easy to figure out what the fingerprint is,” Venkatesan explains. “That representation allows us to compare fingerprints even when there are small changes due to the way in which people register their fingerprints in a scanner.”

Indeed, with the technique’s resistance to minor distortions, fingerprint hashes can help provide biometric authentication and thus can augment or replace traditional passwords. As a result, the security and the usability of Web services and other client-server systems are enhanced.

“There are fewer than 7 billion people on the planet,” Venkatesan says, “but a typical security system requires a much larger number of combinations. The challenge was to use randomness in a secret key to derive a representation that combines both the fingerprint and the random key.”

The key, it seems, is the key.

“Given the fingerprint representation, it is not easy to figure out the fingerprint,” Venkatesan concludes. “The use of randomness makes it possible to have a fingerprint verified without revealing what exactly it is.”
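To make the role of the secret key concrete, here is a rough sketch of one well-known stand-in: a keyed sign-of-random-projection hash, compared by Hamming distance so that small measurement changes flip few bits. It is not the researchers’ algorithm, and it makes no security claim. The feature vectors, the key, the bit count, and the tolerance are all assumptions for illustration.

```python
import random


def fingerprint_hash(features, secret_key, bits=256):
    """Keyed sign-of-random-projection hash of a numeric feature vector.

    features:   list of floats standing in for metrics measured on a fingerprint image
    secret_key: seed for the projections; a different key gives an unrelated hash
    Note: `random` is not a cryptographic generator; this is illustration only.
    """
    rng = random.Random(secret_key)
    out = []
    for _ in range(bits):
        # One pseudo-random direction per output bit, derived from the key.
        projection = sum(rng.gauss(0.0, 1.0) * f for f in features)
        out.append(1 if projection >= 0 else 0)
    return out


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def verify(stored_hash, features, secret_key, max_fraction=0.1):
    """Accept if the fresh hash differs from the stored one in few enough bits."""
    candidate = fingerprint_hash(features, secret_key, bits=len(stored_hash))
    return hamming(stored_hash, candidate) <= max_fraction * len(stored_hash)


# Made-up feature vectors: an enrolled finger, a slightly different rescan of it,
# and an unrelated finger.
data = random.Random(1)
enrolled = [data.gauss(0.0, 1.0) for _ in range(64)]
rescanned = [f + data.gauss(0.0, 0.01) for f in enrolled]
impostor = [data.gauss(0.0, 1.0) for _ in range(64)]

KEY = "demo-key"
stored = fingerprint_hash(enrolled, KEY)
```

Only `stored` and the key would need to be kept for verification; `verify(stored, rescanned, KEY)` tolerates the small rescan differences, while an unrelated vector or a different key produces a hash that disagrees in roughly half its bits.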

Improving Collaboration and Communication

By Rob Knies

One of the hallmarks of innovation is that it builds on the work of others. Scientific breakthroughs don’t occur in a vacuum. Today’s aeronautics engineers can trace their work back to the Wright brothers. Benjamin Franklin prefigured the work of Thomas Edison. Whoever it was who harnessed fire or invented the wheel, rest assured that they, too, learned from those who had gone before.

So it is, as well, in information technology. We live in a fast-twitch age, in which today’s sensation becomes a platform on which to build tomorrow’s. Even as the latest technology is unveiled, somebody, somewhere, is making plans to improve upon it. News travels fast these days, and human ingenuity is up to the challenge.

Nowhere is that more evident than at TechFest, Microsoft Research’s annual celebration of research innovation, being held March 7-8 in Redmond. More than 150 demos testify to the never-ending quest to improve upon the state of the technological art. They encompass the breadth of computer-science disciplines and stem from Microsoft Research labs located in Redmond; New York City; Cambridge, England; Beijing; and Bangalore, India.

For example, a couple of the booths at TechFest feature projects to extend current technologies to make it even easier and more convenient for people to communicate and collaborate. A closer look:

Everyday Surfaces Become Interactive

Andy Wilson’s demo features game play, but it’s not really about games at all.

Wilson, a researcher within the Adaptive Systems and Interaction group at Microsoft Research Redmond, works on surface computing, a concept devised in collaboration with Steven Bathiche, a research manager with the Microsoft Hardware group. Surface computing involves imbuing everyday surfaces—a table top, a whiteboard—with interactive capabilities, enabling them to act as displays that respond to touch or gesture or objects placed atop them.

He has received quite a bit of attention for his previous projects, such as TouchLight, which uses a combination of a projector, sensors, and cameras to transform a sheet of acrylic plastic into a display on which objects respond to user gestures, and PlayAnywhere, an interactive projection-vision system that features a real-life version of the sort of futuristic technology depicted as being used by Tom Cruise in the film Minority Report.

The latest step forward from Wilson is an extension of PlayAnywhere called PlayTogether, a collaboration with Daniel Robbins, a user-interface designer from Microsoft Research Redmond’s Visualization and Interaction for Business and Entertainment group. PlayTogether uses a couple of networked PlayAnywhere units to enable collaboration between remote users in a virtual 3-D environment.

“Essentially, what’s going on is we have the video from one machine being sent to the video of another,” Wilson says. “The effect you have is that you’re at the desk, you’re manipulating something, and you see your partner’s hands in the scene. Anything they’re doing on their surface is reflected onto yours.”

Say you’re playing chess with a PlayTogether-enabled opponent. You choose the move Rh4. You then see your opponent’s hands hover into view atop the board; she opts for Qxh4. It all happens, horrifyingly, right in front of your eyes, even though the opponent is a thousand miles away. You’ve lost a rook. At least you can’t hear her gleeful cackle.

“Contrast that with an approach that involves icons or cursors that depict other people,” Wilson says. “There’s no intermediary. I’m seeing exactly what the other machine sees.”

What’s not quite as obvious is that while game play might provide a compelling—even diverting—demonstration, the research behind it holds vastly broader potential.

“Once display hardware and technology become cheap enough,” Wilson explains, “you can afford to start putting display and computational goodies together and using them in interesting ways that you would never have considered before.

“In an office setting,” he adds, “you can imagine having a wall display or a conference-table display that reflects all the documents appropriate to a meeting and enables multiple people to interact with them at once. I want people to think of new ways to use this kind of technology and to go find interesting places to deploy it.”

In other words, Wilson seems to be saying to the developer community, it’s your move.

Text Messaging for Small and Emerging-Market Businesses

Short Message Service (SMS) has become ubiquitous in many parts of the world, particularly in emerging markets, where challenging rural conditions and the costs of PC maintenance can be prohibitive.

Responding to the popularity of SMS, Microsoft Research India has been investigating ways for people to connect PC and Internet applications to SMS-capable phones and to create useful, SMS-based applications. One product of such research is the SMS Toolkit, which enables anyone with a PC and a Windows Mobile®-based phone to run an SMS server.*

“This new mobile system,” says Rajesh Veeraraghavan, an associate researcher in the Technology for Emerging Markets group, “replicates almost all of the PC-based functionality and is cheaper, adds additional functionality, and is more popular.”

The SMS Toolkit stems from work Veeraraghavan and colleagues performed as part of a project called Warana Unwired. That, in turn, emerged from an earlier effort called the Warana Wired Village, involving a sugar-cane cooperative in a rural community in the Indian state of Maharashtra. The co-op serves about 50,000 farmers across 75 villages, and in 1998, the Indian government started a pilot project to bridge the digital divide in which 54 PC kiosks were established to connect the farmers. The original goal was to provide the farmers with Internet access so they could check market prices to obtain the best price for their product, and to establish a remote agricultural-advisory system.

The pilot was not wildly successful, because PC maintenance was difficult and expensive. So Microsoft Research India replaced the PCs with SMS phones connected via USB to a PC server to create an SMS gateway that receives incoming SMS messages and converts them into database calls. Now, the system is available 24 hours a day, and farmers use it in a variety of mobile settings—buying fertilizer, checking their payment history, registering their land—and the savings are considerable. The model is being extended to other sectors, enabling schools, for example, to send bulk SMS messages to students’ parents.
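The gateway step, turning an incoming text message into a database call, can be sketched in a few lines. This is not the SMS Toolkit’s API. The `PAY <farmer id>` command format, the table layout, and the sample rows are invented for illustration, and an in-memory SQLite database stands in for whatever store the real system uses.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding made-up payment records."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE payments (farmer_id TEXT, paid_on TEXT, amount_rupees INTEGER)"
    )
    db.executemany(
        "INSERT INTO payments VALUES (?, ?, ?)",
        [
            ("F102", "2007-01-15", 4200),
            ("F102", "2007-02-15", 3900),
            ("F377", "2007-02-15", 5100),
        ],
    )
    return db


def handle_sms(db, text):
    """Parse one incoming message and answer it from the database."""
    parts = text.strip().upper().split()
    if len(parts) == 2 and parts[0] == "PAY":
        # Parameterized query: the message text never becomes part of the SQL.
        rows = db.execute(
            "SELECT paid_on, amount_rupees FROM payments "
            "WHERE farmer_id = ? ORDER BY paid_on",
            (parts[1],),
        ).fetchall()
        if not rows:
            return "No payments found for " + parts[1]
        return "; ".join(f"{day}: Rs {amount}" for day, amount in rows)
    return "Unknown command. Try: PAY <farmer id>"


db = build_demo_db()
reply = handle_sms(db, "pay f102")
```

The reply string is what the gateway would send back to the farmer’s phone as an ordinary text message.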

“We’ve worked hard to make the programming model very simple,” says Sean Blagsvedt, head of Advanced Prototyping and Program Management for Microsoft Research India, “so a developer does not have to worry about which phone is in use or the intricacies of USB connections.

“Additionally, for simple scenarios such as sending bulk SMS messages or creating SMS-based information-access applications, we’ve provided samples in Excel so that no programming is required.”

One challenge is that SMS messages are limited to 160 characters. But that, too, has been addressed.

“It’s incredibly easy,” Blagsvedt says, “for a developer to enable an SMS user to send a simple query. Our code takes care of splitting the resulting matches into multiple SMS messages that the user can effectively scroll through to get the particular result desired.”
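One plausible way to split query results across messages within the 160-character limit is shown below. The toolkit’s actual paging scheme is not described here, so the `(i/n)` prefix, the newline separator, and the truncation rule are assumptions for illustration.

```python
SMS_LIMIT = 160


def split_results(results, limit=SMS_LIMIT):
    """Pack result lines into as few messages as possible, each within `limit` characters.

    Every message starts with an "(i/n) " marker so the reader can page through them.
    """
    reserve = len("(99/99) ")  # room for the longest marker this sketch supports
    budget = limit - reserve
    bodies, current = [], ""
    for line in results:
        line = line[:budget]  # a single over-long result is truncated, not split
        candidate = line if not current else current + "\n" + line
        if len(candidate) <= budget:
            current = candidate
        else:
            bodies.append(current)
            current = line
    if current:
        bodies.append(current)
    total = len(bodies)
    if total > 99:
        raise ValueError("too many messages for a two-digit page marker")
    return [f"({i}/{total}) {body}" for i, body in enumerate(bodies, start=1)]


# Seven made-up results of 60 characters each.
demo_results = [f"Result {n}: " + "x" * 50 for n in range(1, 8)]
messages = split_results(demo_results)
```

Reserving space for the marker before packing keeps every outgoing message within the limit regardless of how many pages the reply needs.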

In such scenarios, this sort of research success means not only doing well, but also doing good.

“We are most excited,” Blagsvedt says, “about lowering the technical ability needed to create an SMS-based application. We hope this will lead to many more SMS applications throughout the world.”

*Connectivity and synchronization may require separately purchased equipment and/or wireless products, such as a Wi-Fi card, network software, server hardware, and/or redirector software. Service plans are required for Internet, Wi-Fi and phone access. Features and performance may vary by service provider and are subject to network limitations. See your device manufacturer, your service provider, and/or your corporate IT department for details.

Making the Web People-Friendly

By Rob Knies

One of the great things about the Internet is the sheer magnitude of the resources it offers. It’s humbling, when you stop to think about it. With Web pages numbering in the billions, any individual’s particular interests, no matter how expansive, amount to merely a drop in the online bucket. The greatest limitation to surfing the Net is the imagination of the surfer.

That’s good, though. Practically anything imaginable is available for investigation. Even a relatively narrow topic is likely to offer tens of thousands of Internet trails to explore. Never before have so many had access to so much information.

That’s bad, though. In most cases, we don’t need to tread down a thousand info-trails. Usually, we need something specific—a name, a number, a technique, an explanation. Time is short, and we are busy. On occasion, too much data can be as shackling as too little.

But don’t fret. Help is on the way—as demonstrated at a pair of booths featured March 7-8 during TechFest 2007, Microsoft Research’s annual research-project showcase. Each, in its own way, sets its sights on helping cut through the clutter to enable users to identify precisely what they need and want from their online experiences.

One focuses on streamlining Internet navigation to provide easy access to pages of interest. The other intends to make Internet video as simple and inviting as watching TV. Two different problems, two different approaches, one common denominator: whittling the Web down to size.

Web-Site Insight

“Comprehending the scope and the size of a Web site based on the home page alone is very difficult,” says Natasa Milic-Frayling, a senior researcher within the Integrated Systems group at Microsoft Research Cambridge.

Milic-Frayling and her colleagues Eduarda Mendes Rodrigues and Blaz Fortuna are exploring the challenges in effective navigation of Web sites that, in many cases, evolve organically to the point where they include thousands of pages, some of them created for specific purposes, in styles significantly different from the site’s home page.

How can users possibly come to understand the global organization of such sites and thus gain easy access to pages of interest?

Milic-Frayling and her team have devised a technology called InSite Live! that analyzes the organization of a site based on a technique called Link Structure Graph (LSG). It uses information about the groupings of navigational links on individual pages—such as menus and clusters of links that refer to pages with related content—to identify subsites and their prominent topics.

“Based on this information,” Mendes Rodrigues says, “InSite Live! provides visualization of the navigation and topic structure and enables the user to explore the site not only by navigating individual pages, but also by hopping from one subsite to another.”
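The intuition that shared navigation menus reveal subsites can be shown with a toy example. It is a much-simplified stand-in and not the Link Structure Graph algorithm itself. The page URLs, the representation of a link block as a list of targets, and the rule that an identical block on several pages marks a subsite are all assumptions for illustration.

```python
from collections import defaultdict


def find_subsites(pages):
    """Group pages that display an identical navigation-link block.

    pages: dict mapping a page URL to a list of link blocks, where each block
           is a list of target URLs (for example, one menu).
    Returns a dict mapping each shared block (a frozenset of targets) to the
    sorted list of pages that display it; blocks seen on one page are dropped.
    """
    carriers = defaultdict(set)
    for url, blocks in pages.items():
        for block in blocks:
            carriers[frozenset(block)].add(url)
    return {block: sorted(urls) for block, urls in carriers.items() if len(urls) > 1}


# A made-up six-page site with a global menu and a research-section menu.
GLOBAL_MENU = ["/products", "/research", "/about"]
RESEARCH_MENU = ["/research/vision", "/research/speech"]

SITE = {
    "/": [GLOBAL_MENU],
    "/products": [GLOBAL_MENU],
    "/about": [GLOBAL_MENU],
    "/research": [GLOBAL_MENU, RESEARCH_MENU],
    "/research/vision": [RESEARCH_MENU],
    "/research/speech": [RESEARCH_MENU],
}

subsites = find_subsites(SITE)
```

Here `/research` appears in both groups, which is the kind of connection that would let a visitor hop from the top-level pages into the research subsite.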

A Web-site visualization produced by InSite Live!

To accomplish this, InSite Live! crawls a site and performs LSG analysis on the collected pages. The biggest challenge the technology must overcome is to collect all site pages to provide a full representation. Many Web pages these days are dynamically generated and can change often, making a complete site representation obsolete in a hurry.

InSite Live! can help in that regard because of the advantages it offers to Web-site administrators. The technology can help them organize their site and create a representation that would enable visitors to learn more about the site’s scope and organization. Meanwhile, the Web-site administrator gets a chance to see the frequency at which subsites are accessed and, therefore, can develop a strategy to maximize traffic as desired.

Once InSite Live! has concluded its site analysis, it generates a graphical site map dazzling in its complexity and its visual connections.

“For the first time, we can see how constellations of Web pages are connected through navigation menus,” Mendes Rodrigues says, “and how these menus are connected to take the user from the home page to peripheral parts of the site. With a single click, we can see which topics are covered by the site and its subsites.”

Such a view also provides a bit of reward for Milic-Frayling, Fortuna, and Mendes Rodrigues—who literally can see the fruits of their labors.

“It is absolutely fascinating to view the InSite map of an entire site, which may have thousands of pages,” Milic-Frayling says. “No single user can possibly browse and build in their mind a comprehensive model of the Web-site organization. With InSite Live!, we can.”

New Things to Watch on TV

Internet video has enjoyed mushrooming popularity in recent months, as exemplified by the explosive growth of Web sites such as YouTube and Soapbox on MSN Video. Millions of people are accessing short video clips every day, for occasional edification and an endless stream of laughs. It has become a virtual medium in itself, a user-controlled alternative to mainstream television.

One problem, though. Almost exclusively, such content is accessible only via computers. And computers, for all their many conveniences, do not offer the same easy, relaxed viewing experience as does your garden-variety TV. Couch potatoes want in on the fun, too.

Kit Thambiratnam wants to change all that. Thambiratnam, a researcher within the Speech Group at Microsoft Research Asia, has been working on a project he calls Relaxed Internet Video Exploration and Discovery, and his mission—and that of collaborating colleagues Frank Seide and Roger Yu—is to bring Internet video to the masses.

“The purpose of this project,” Thambiratnam says, “is to make enjoying Internet video as simple and easy as it is to watch traditional TV content.

“Video content was made to be enjoyed on your TV, not on your PC,” he continues. “Unfortunately, it’s just much too difficult to enjoy all the great Internet video content on your TV.”

The reasons for this are many, but they boil down to ease of use. When people are sitting back on their couch, looking for something to watch, they don’t want to browse lists, search for content, or formulate queries. They want to click a remote and have entertaining video wash over them.

“We want to build upon familiar concepts such as TV channels and channel surfing,” Thambiratnam explains, “but then use machine learning and other ‘smart’ technologies to bridge the gap between the user and the vast amounts of video on the Internet.”

To do so, Thambiratnam is packaging three technologies into a seamless whole:

  • Automatic content recommendations: His system performs content analysis on what you watch to recommend other things that you might also enjoy.
  • Remote-control search: Using speech recognition, his work can find video clips relevant to a specific request—and find where in the clip the relevant content appears. “Our search,” he says, “not only helps you to locate interesting video, but also to navigate within it.”
  • Passive user interfaces for TV: The search and browse paradigms that work on your PC are too cumbersome for the TV experience. Thambiratnam’s work is investigating new interfaces and interaction modes, such as an inline recommendation bar.

Of the three, the content recommendations are paramount in this project.

“We use speech-recognition technology to understand what’s being said in the videos, so when we recommend content, it’s more related to the actual content of the video,” Thambiratnam explains. “The idea is to provide a broad range of content related to what you’re watching. If you’re watching a documentary on houses, we may offer a news clip about rising house prices or a do-it-yourself home-improvement show.

“We’re also trying to build a system that allows you to vote for content that you like, so that, as you watch, you can teach the computer how to make recommendations for you.”
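Recommending by what is said in a video can be approximated with standard text similarity over transcripts. The sketch below uses TF-IDF weighting and cosine similarity, a common technique chosen here for illustration and not necessarily what the project uses. The transcripts are invented stand-ins for speech-recognizer output.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(transcripts):
    """Map each title to a {word: weight} vector using smoothed TF-IDF."""
    docs = {title: Counter(tokenize(text)) for title, text in transcripts.items()}
    n = len(docs)
    df = Counter(word for counts in docs.values() for word in counts)
    return {
        title: {w: c * math.log((1 + n) / (1 + df[w])) for w, c in counts.items()}
        for title, counts in docs.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def recommend(watching, transcripts, top_n=2):
    """Return the titles whose transcripts are most similar to the one being watched."""
    vectors = tfidf_vectors(transcripts)
    scores = {
        title: cosine(vectors[watching], vec)
        for title, vec in vectors.items()
        if title != watching
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical transcripts, as a speech recognizer might produce them.
TRANSCRIPTS = {
    "House documentary": "this documentary explores how a house is designed and how a house is built",
    "House prices news": "house prices rose again this quarter as buyers competed for every house",
    "Home improvement": "today we repair a roof and paint the house walls",
    "Football highlights": "the striker scored twice and the keeper saved a penalty",
}

picks = recommend("House documentary", TRANSCRIPTS)
```

The voting idea in the quote could be layered on top by adjusting each title’s score with the viewer’s past votes; that step is left out of this sketch.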

Given the diversity of Internet video, one of the biggest challenges Thambiratnam faces is developing a machine-learning algorithm that works across genre, type, and quality of particular videos.

“The only way we can realize this project is to use a cross-section of technologies,” he says, “all backed by the fundamental machine-learning techniques we use in speech recognition.

“The goal is to create the TV of tomorrow. We want to realize a video jukebox that is as simple to use as your TV, but uses machine-learning and intelligence technologies to transparently give you access to the almost infinite amount of video on the Internet.”

And the most rewarding part of working on a project like this? Thambiratnam smiles.

“It’s something I would actually use.”

Changing of the Guard: A Conversation with Dan Ling and Rico Malvar

By Rob Knies

Dan Ling, corporate vice president of Microsoft Research, recently announced his impending retirement. Ling, who has been with the organization almost since its inception 15 years ago, has been responsible for overseeing Microsoft Research Redmond since 1995. In announcing Ling’s departure, Rick Rashid, senior vice president of Microsoft Research, also introduced Henrique (Rico) Malvar, Microsoft distinguished engineer and Microsoft Research general manager, as the lab’s new managing director. Malvar has an extensive history of accomplishments within Microsoft Research, both technical and managerial, which have led him to his new position. As Microsoft Research continues to celebrate its 15th anniversary with its seventh annual TechFest, the company’s annual showcase of research projects, on March 7-8, Ling and Malvar found time beforehand for a chat about the transition.

Q: Rico, what were your initial thoughts upon hearing about Dan’s retirement?

Malvar: The first thought in my mind was: Who wants to see Dan go? Nobody wanted that. The second one, of course, was that I was happy that Dan has helped me and encouraged me to ramp up to this new role. I look forward to that.

Q: Dan, take a minute to reflect briefly on the highlights of your Microsoft Research career. Is there a particular moment that is most memorable for you?

Ling: The past 15 years seem to have passed by in a flash. When I look back, I’m really very proud to see what we’ve been able to accomplish, creating a research lab of international reputation and, arguably, one of the best computer-science organizations in the world. That’s a very exciting thing.

Dan Ling (right), corporate vice president of Microsoft Research, meets with Rick Rashid (left), senior vice president of Microsoft Research, and Bill Gates (center) in late February 2007.

There are a lot of highlights that come to mind. One is seeing the impact of researchers’ work on quite a number of Microsoft® products—from the very early days, when we were able to help with Windows 95® and Office 95®, to work that Rico did on compression techniques that are widely used within Microsoft today, to different kinds of programming tools that are used throughout the company as an integral part of our development processes. It’s very gratifying to see that Research work has made it into the hands of our customers and helped make our customers’ lives better.

As Research has progressed, one of the things that I’ve become really very excited about is to see the development of a generation of researchers who joined fairly early on who have spent their entire research career at Microsoft Research and are now being widely recognized for their research contributions. For example, Harry Shum, who’s now the director of our research lab in China, came to us shortly after graduate school and now is widely recognized for his work in computer vision and for his work in leading the Microsoft Research Asia lab to quite incredible heights. To see somebody like that grow his entire professional career within Microsoft Research has been another very satisfying thing.

The last thing I’ll mention is my involvement in helping both Microsoft Research Asia and Microsoft Research India get started. I had a relatively small role in both of those cases, but at least being part of those organizations at the very beginning, helping them get started, and then seeing their success today has been a very exciting thing for me.

Q: Where within Microsoft Research do you think you have been able to make the greatest impact?

Ling: With any organization, the most important resource is its people. My impact has been very indirect in many ways, but helping to recruit and hire and retain outstanding people in the lab … I’m very proud of what we’ve been able to do in that case, because they’re the reason why the lab is what it is. That’s where a lot of my attention has gone and where I think I’ve been able to contribute something.

Q: What do you think you’ll miss the most?

Ling: The same thing: the people and the exciting conversations I’ve enjoyed with all of them—technical conversations, brainstorming, thinking about new ideas, looking at solving difficult technical problems, thinking about what Research should do next. Those are the kinds of things that have been really, really exciting.

One thing I do every year is group reviews, where I review the projects of every single research group at the lab. Those are always some of the highlights of the year, because I get to see all this really exciting work going on and all the progress that’s been made in the past 12 months and see new projects and see the new results. It’s always a wonderful experience.

Q: Rico, describe your feelings as you transition into your new role.

Malvar: First of all, I was very happy when I was entrusted to take on this new role. Of course, I underestimated how difficult it would be: It’s really a big task. Dan has been incredibly influential in defining a very broad range of research directions for the lab and in bringing in excellent people to drive those. In the past few years, as a part of management, I have been able to help with that process.

We really have an incredible team here. I feel like somebody named me coach for the best possible football team you could dream of. It may be tough, but on the other hand, it is the best team, right? So it can’t be that difficult. And, of course, it brings a ton of excitement to me. That’s the main feeling, that it’s a difficult job, but, boy, what a team we have.

That’s really what excites me: the opportunity to work with those folks and to try my best to provide good leadership and maintain their motivation.

Q: Dan, how would you characterize Rico’s contributions to Microsoft Research and his new role within the organization?

Ling: Rico has been an incredible contributor since he joined Microsoft Research 10 years ago, first, as a technical contributor in his work, and then leading a group and demonstrating his managerial skills and leadership, and then the last three years as part of the Office of Directors. He has provided leadership to the lab as a whole, being a technical leader, a people leader, recruiting new people, motivating people. He has all the skills necessary to be an incredibly successful lab director.

Q: Rico, are you planning any changes?

Malvar: No. Since the inception of Microsoft Research, Rick and Dan have developed a culture that is incredibly successful. We give our researchers a ton of freedom to define their projects. And when we’re hiring, we also keep an eye on what motivates the researchers.

Our community of researchers and engineers is really excited about our two main goals: pushing the state of the art, making sure that we’re pushing the frontiers of what can be done in computer science and related areas, and, at the same time, not missing opportunities to see those great ideas become really influential in people’s lives. Our recruits see the opportunity: “If I join Microsoft and I have this great idea, suddenly 500 million people will be using my great idea.” We achieve that through our contribution to our products.

We continually revisit areas in which we should invest more, areas in which we need to try a little harder to hire. That’s a very dynamic process, because things are changing all the time. The company changes, the products change, our organization changes. Research directions do change: We try something, it fails. We fail all the time, because we need to—it’s research. Then we try something else.

What will change is the portfolio of projects that we will be doing. We still have many groups in many different areas, with the group managers having a lot of flexibility in defining what their groups should be doing. But in the way we operate, no, there will be no changes.

Q: When you look at the new set of tasks and responsibilities that you’ll be inheriting, what are your immediate goals? What are the first things you’ll be focusing on?

Malvar: The main thing is to make sure that folks understand that the process is the same, that we will continue to define our research priorities in terms of where we collectively think there could be a major impact. That’s not my decision to make. We don’t really use a top-down process; we use collective thinking driven by our researchers and our group and area managers.

I’ll give you one example: Mobile phones are being used more and more not just as communication tools, but as information-management tools and, in many cases, as information-producing tools. They’ve gone from something on which you can just talk to something on which you can gather information in many different ways, not just by dialing 411. That opens tremendous opportunities as a new platform for us. We constantly have to evaluate what kinds of technologies we need to put together, what kinds of new scenarios in which we see people being more effective if we provide them all these tools via cellphone.

There are many others. As we move more into services, with and other, similar initiatives, there are many more opportunities—for software services, Web services, business-to-business services.

Then there is the evolution of computers themselves. Now, we see computers having two processors, or a chip that has two processors inside, a chip that has four. Who knows where that’s going to stop? How can software be even more effective so that people can have even more functionality in their hands as they buy these more powerful processors?

Q: Dan, what will you be expecting to see from Microsoft Research in the future?

Ling: I will continue to expect that Microsoft Research will be one of the premier research organizations in the world, that it will continue to do outstanding technical work and push the state of the art forward. And I certainly expect that the organization will continue to work very closely with the Microsoft product groups and get a lot of those exciting new technologies into the hands of millions of customers.

I think another hallmark of the lab has been its involvement with academic and other research institutions around the world, and I expect that will continue, as well.

Q: Rico, from your vantage point, what’s the future of Microsoft Research look like?

Malvar: My expectations match what Dan just said. We do have a few more challenges, though. Microsoft just shipped two major products, Windows Vista™ and Office 2007. That opens to the company an opportunity to revisit, to reassess opportunities and priorities, and we are going to be part of that process. We’re working a bit harder than usual to help figure out what’s next for Office, what’s next for Windows, what’s next for software services. We’re excited about those challenges, because we see them as opportunities.

TechFest '07: Innovation on Inimitable Display

By Rob Knies

One thing that strikes those even casually acquainted with Microsoft Research is the immense scope of the work performed by the hundreds of researchers spread among the organization’s five labs worldwide.

Whether it be high-level conceptualization about the future of technology or more immediately accessible efforts to improve consumers’ media experiences, there are few aspects of the IT revolution that have not been touched—and advanced—by Microsoft Research computer scientists in the 15 years since the group was formed in 1991.

Such achievements have continued to gain momentum over the years, and the cream of the latest crop will be on full display in Redmond on March 7-8 during the seventh annual TechFest, to be held at the Microsoft Conference Center on the company’s main campus.

In advance of the event, Rick Rashid, senior vice president of Microsoft Research, will deliver a keynote address at 9 a.m. Pacific Time on March 6, to be joined by Rico Malvar, a Microsoft distinguished engineer recently chosen to become the new managing director of Microsoft Research Redmond. The keynote can be heard via Webcast and will be available, along with a wealth of images, video, and other resources.

The show, which has grown to the point where nearly 7,000 Microsoft employees attended TechFest 2006, offers a bit of symbiosis for both researchers and visiting employees. The former get a chance to exchange ideas and concepts with peers, showcase their latest innovative work, and put a focus on the potential that computing holds to further enhance the lives of users around the globe.

What’s in it for those who attend TechFest? Simple. They get a chance to mingle with researchers, pick their brains for exciting new ideas, and inquire about the latest innovations available to be deployed in the array of products Microsoft builds. Such partnerships have, on numerous occasions, produced successful collaborations between researchers and product groups, to the mutual benefit of both—and, ultimately, to the audience of enterprise customers, small and medium-sized businesses, and home consumers who are enabled to build their individual visions of the future atop the platforms such collaborations provide.

Rashid knows. As head of Microsoft Research, he is in a unique position to survey both the breadth and the depth of the endeavors performed by those from labs located in Redmond; Silicon Valley; Cambridge, England; Beijing; and Bangalore, India. Here’s what Rashid has to say about the show:

“TechFest is one-stop shopping to see and experience the breadth of software innovations we’re pursuing that will allow people to explore their interests more deeply and share the things they care about more easily.”

Mary Czerwinski, a Redmond-based research-area manager, provides a complementary observation.

“What’s cool about TechFest,” Czerwinski says, “is the palpable energy and excitement obvious in both the researchers talking about their projects and the attendees who get jazzed about the direction we are taking.”

There are also testimonials of a more grass-roots variety. A few comments gleaned at random from attendees on the floor of last year’s TechFest:

  • “This is my favorite event of the year.”
  • “I like getting the chance to talk to researchers face-to-face.”
  • “It’s a really, really great opportunity for people in the product groups to see what research is doing.”
  • “I think it’s beautiful.”
  • “I’ve never seen so many geeks in my life!”

Well, yes, there are “geeks” in attendance at TechFest, “geeks” who will be determining the lay of the technological land over the next 5-10 years. They’ve been working hard to get a chance to display their latest research wares—in a mind-boggling variety of disciplines. Consider the six loosely amalgamated themes for this year’s show:

  • Emerging Markets and Research Partners.
  • Hardware, Devices and Mobile Computing.
  • Search, Interaction and Collaboration.
  • Software, Theory and Security.
  • Systems, Networking and Databases.
  • User Interfaces, Graphics and Media.

In other words, for those of a technological bent, something for everyone.

Not only do attending employees get a chance to get up-close and personal with no fewer than 156 demos—hearing detailed descriptions, asking questions, engaging in a bit of give-and-take—but there are also 24 academic-style lectures to be delivered, in lecture-hall-like surroundings. While the ultimate goal of TechFest might be the transfer of technology, during the show itself, what occurs is nothing less than a transfer of intellect.

But don’t get the idea this is some sort of straitlaced, whisper-if-you-dare gathering. Such lofty-minded results are achieved within an atmosphere of genuine conviviality. It’s a big party, TechFest, noisy and boisterous and just a tad unruly. Attendees jostle for position in front of popular booths. Loud laughter can be heard, as can hearty greetings being exchanged between colleagues based halfway around the world. Oohs and aahs are audible.

Of course they are. It’s a bit like magic, really, all this turning ideas into reality. At Microsoft Research, it’s been going on for 15 years now. Those who throng to TechFest 2007 will get a glimpse at the sort of magic the next 15 years have to offer.


Emerging Markets and Research Partners

Split-Screen UIs for Small Businesses

In developing nations and emerging markets, there are situations, especially in small-business settings, where a single computer is shared among multiple users at the same time. A lot of juggling of control takes place, and people share without changing sessions; rather, they manage with minimization and maximization of relevant windows. Typical applications include word processing, accounting, image editing, and browsing. We provide shared access around the same single display, with multiple mice and multiple keyboards, by splitting the screen into separate sections for each user, optimally for two users. The sections are adjustable, with permissions, and separate applications or operating systems can run independently in each area. Various interactions are enabled by features such as a common area for sharing common files and resources, and joint editing of documents.

The SMS Tidal Wave

In emerging markets, Short Message Service (SMS) is one of the most popular modes of communication. We will demonstrate a variety of ways in which SMS can be used via mobile phones to enhance small businesses, microfinance, and agricultural production. Using mobile phones linked to a PC acting as an SMS server can improve data collection and sharing, improving organizational efficiency and rural economic development. SMS can also be made more relevant for specific Web-based applications, such as blogging, searching, chatting, instant messaging, and Outlook® Access. Such uses can help a small business run an SMS server as if it were running a Web site, and we have a software-development kit that makes it simple to build custom SMS servers.

Digital Assistance for Emerging Markets

Microsoft Research India is investigating ways in which digital technology can help enrich the lives of illiterate or rural residents in emerging markets. One such technique is to use a text-free UI to enable illiterate, first-time computer users to access relevant health information to make educated healthcare decisions. Another empowers users unable to read or write to e-mail their loved ones, also using text-free navigation. And a third seeks to assist poor rural farmers by providing specific, targeted advice about relevant farming practices using appropriate digital technologies.

Hardware, Devices, and Mobile Computing

Telescopic Pixel and New UI Devices

Today’s digital images are enabled through arrays of small light gates (LCD, DMD) or light emitters (LED, plasma). The most popular display technology, the LCD, is inefficient, allowing less than 10 percent of backlight to reach the viewing surface. LCDs also are slow, making separate R, G and B necessary. The telescopic pixel is a microminiature reflecting telescope in which the focus controls the amount of light passing through. With a theoretical light efficiency of 75 percent, the telescopic pixel is much more energy-efficient, and its faster switching time enables a single pixel to serve R, G and B functions. We also will demonstrate a capacitance touch-pad control, a low-cost X/Y touch pad for a mobile device, and a gesture-sensing keyboard, in which sensitive capacitance-to-digital converters enable an X/Y position sensor for resolving hand gestures above the keyboard.

Wi-Fi Ads

Many consumers carry portable electronic devices, smartphones, personal digital assistants, or laptops that can connect to Wi-Fi networks. Location-sensitive advertisements, ads targeted to a Wi-Fi user based in part on the physical location of that user, will be an important market in the near future. We have developed a scheme for distributing location-sensitive ads to Wi-Fi devices.

Our approach has three advantages:

  • We do not require information from the client device to deliver ads to the client.
  • We do not require the client to have Internet connectivity. In fact, we can deliver ads even when the client is connected to a competitor’s Wi-Fi network.
  • We can supply dynamic information to consumers in real time. For example, a restaurant can continuously advertise an expected wait time to all wireless clients in its vicinity.

Location-Based Enterprise Wi-Fi Management

The physical locations of clients and access points in a wireless LAN have a large impact on network performance. We demonstrate a scalable, easy-to-deploy WLAN performance-management system that includes a self-configuring location-estimation engine. Our system displays the location of all the WLAN clients, and it tracks those access points with which clients associate, along with a variety of performance metrics that characterize the client’s experience. Using our system to observe the WLAN usage in our building, we show that information about client locations is crucial for understanding WLAN performance.

Surface-Computing Innovations

Surface computing uses sensing and display technology to imbue everyday surfaces with interaction. PlayAnywhere is a compact surface-computing system shown at TechFest last year. This year, we will show PlayTogether: two networked PlayAnywhere units exchanging video of each other’s desktop surface, including hands, game pieces, and drawing surfaces. PlayTogether offers interesting combinations of the real world with the virtual world: Playing chess across the network, you see your opponent’s hands and pieces superimposed on your own real pieces and desktop. We will show other technologies, including an application of depth-sensing video cameras, which work like a normal video camera but also calculate how far away the imaged surface is at each pixel, resulting in (R, G, B, Z)-valued images. We will show a game that combines the surface-computing idea with this exciting new technology.

Search, Interaction, and Collaboration

VIBE Team Demos

The VIBE research group will showcase:

  • DynaVis: A visualization framework for Dynamics UX that supports animated transitions, direct manipulation of data, and compositing.
  • CandidTree: Visualizes structural uncertainty in merged trees.
  • For PP and agile dev teams: a peripheral display that shows software-development teams where team members are in the code, the methods on which they are working, and who may need help.
  • A novel UX for the smartphone.
  • Courier: Take your documents with you, and share them on a large display using a smartphone.

Tango: Find Your Circle, Enjoy Your Social

Tango enables users not only to manage their own cyber traces (tags, rankings, and comments) but also to see what other people in the same “circle” (friends, favorite users, people sharing the same interest) are doing and what’s popular on a more global scale. Tango users can browse any URL and find tags or comments left by others and therefore expand their social network by exploring common interests. All content can be filtered based on circle, ranking, or tag so a user can find the desired information, supported by a trusted relationship. By supporting various social activities, Tango adds interaction to a social network and, thus, is more fun.

Tag Booster: A System for Ranking, Suggesting Tags

Finding things on your PC or via the Internet is hard. Search engines provide a solution, provided that sufficient features about information items are available. But increasingly, we wish to navigate and search items such as images, video, music, and even a person’s reputation. This is where user-generated tagging helps. We propose a solution for tag recommendations to help consensus among users emerge more rapidly.

Recognition and Disambiguation of Entities in Text

Our project proposes a substantial change in the way we interact with text and information. This includes instant access to relevant data on the Web, as well as contextualized bookmarks and search. The core of the system is a powerful, named-entity recognition and disambiguation technology. The system identifies and disambiguates the named entities and the most important concepts in text based on information extracted from a large, encyclopedic collection and search-query logs. It also enables a user to create context-dependent bookmarks and to share them with other users. The system then employs such data as user feedback to improve its performance. In addition, the system enables a user to perform context-aware Web searches. For this, the system disambiguates the user’s queries by using the information extracted from the documents the user has been reading or editing.


A picture is worth a thousand words. In this demo, we show a Web service that can be used to match your photo against millions of street-side-view photos in our database. An efficient, distributed, high-dimensional index is developed to speed the query performance. In our system, which supports both PC and mobile interfaces, each query can be answered in mere seconds. We will use Seattle as an example city to illustrate the performance of our system.

New Concepts for the Home

We will present nine new technologies aimed at enriching home life, under four themes:

  • New messaging concepts: We will show a “digital postcard” device for the living room, a “visual answering machine” for the kitchen, and the Epigraph—a kitchen display supporting family presence and identity.
  • New mobile concepts: We will show Glancephone, a way of turning a cellphone into a Webcam, and Grab & Share, a system for “trafficking” TV clips through your cellphone.
  • New image displays: We will present the Photo Shoebox, which shows a tangible way of archiving and displaying photos in the home, and three variations of Time-Mill, an interactive mirror that captures and reflects photos in the home.
  • Paper-digital concepts: We will show two concepts: one using paper to send remote messages into the home, and another that enables a family to message on paper from the home.

Community Buzz

Community Buzz is a new window into online communities! Interesting and useful conversations, authors, and groups are discovered easily using this tool, jointly developed by Microsoft Research Redmond’s Community Technologies group and Microsoft Research Cambridge’s Integrated Systems team, with sponsorship from Live Labs. Community Buzz combines text mining, social accounting (Netscan/MSR-Halo), and new visualization techniques to study and present the content of communication threads in online discussion groups. The merging of these research technologies results in a system that gives great value to community participants, enables highly directed advertising, and supplies rich metrics to product managers.

Pictures of Search Relevance

The link structure of the Web plays an important role in today’s search engines, with techniques such as PageRank. These analyses typically work at the level of the entire Web. Our work examines characteristics of key subsets of the Web graph. In particular, we characterize the subgraphs induced by projecting the results of a search onto the larger Web graph. We represent the subgraphs using a rich variety of graphical properties, such as number of nodes and edges, graph diameter, connected components, and triads, and use this representation to predict behavior on several search-related tasks. For example, we can predict the overall quality of a set of search results, when a user will reformulate a query, and whether a user will specialize or generalize a query.
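To make the idea of an induced subgraph concrete, here is a minimal Python sketch. It is not the researchers' implementation: the function names, the toy link data, and the decision to treat links as undirected and to count closed triangles as "triads" are all assumptions made for illustration.

```python
from collections import deque


def bfs_distances(adj, start):
    """Hop counts from start to every page reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def graph_features(web_links, result_pages):
    """Structural features of the subgraph induced by a set of search results.

    Assumption: links are treated as undirected, and "triads" are counted
    as closed triangles. Both are illustrative choices.
    """
    pages = set(result_pages)
    adj = {p: set() for p in pages}
    # Keep only links whose two endpoints are both in the result set.
    for src, dst in web_links:
        if src in pages and dst in pages and src != dst:
            adj[src].add(dst)
            adj[dst].add(src)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    seen, components, diameter = set(), 0, 0
    for p in pages:
        dist = bfs_distances(adj, p)
        # Diameter here is the longest shortest path inside any component.
        diameter = max(diameter, max(dist.values()))
        if p not in seen:
            components += 1
            seen.update(dist)

    # Each triangle is found once per ordering of its three pages (6 times).
    triangles = sum(
        1 for a in pages for b in adj[a] for c in adj[b] if c in adj[a]
    ) // 6
    return {
        "nodes": len(pages),
        "edges": edges,
        "components": components,
        "diameter": diameter,
        "triangles": triangles,
    }


# Hypothetical toy data: a handful of links and five returned pages.
web_links = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "f"), ("x", "a")]
features = graph_features(web_links, ["a", "b", "c", "d", "e"])
```

A feature vector like this, computed per query, is the kind of input a predictor of result quality or query reformulation could be trained on.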

Wearable Sensors for Health, Sports, and Community

We will present four projects that utilize a variety of sensors, such as electrocardiograms, blood oximetry, and GPS, in conjunction with Windows Mobile and SPOT devices, to provide feedback to users and online communities:

  • The iPox project uses two multisensor, wearable devices to investigate how easy access to one’s physiological data influences individuals and communities.
  • SlamXR is a system supporting outdoor sports communities through sensor-annotated GPS traces, such as heart rate and altitude.
  • HealthGear utilizes sensor inputs such as blood oximetry to assist with a variety of personal health issues, such as sleep apnea.
  • Mobile-sensor-extraction technology for Windows Mobile devices developed at the European Microsoft Innovation Centre is presented in the context of an application for an enhanced SPOT watch to assist diabetics.

Using E-Mail to Query Structured Business Apps

Users of business applications such as CRM or ERP respond to incoming e-mails by manually navigating through the UI of the app. We help users become more efficient by using incoming e-mail as queries against the underlying database of the app. We will show an example of such a query system, called Business Context Expediter (BCE), operating on a CRM database. BCE will find entities within e-mail and offer the users actions related to these entities. BCE also automatically extracts the category of the e-mail and summarizes each e-mail with three sentences. The technology underlying BCE, a joint project of Office Labs and Microsoft Research’s Knowledge Tools group, is not tied to CRM: It should be applicable to many structured business scenarios. Come see the demo to learn more!

InSite Live!

InSite Live! is a tool for visualizing the structure of Web sites and intranets. It assists users in orientating themselves during navigation, enabling them to jump easily to subsites of interest. It uses a novel link-structure-graph technique to infer the structure from the layout of hyperlinks on site pages. InSite can expand as the community of users browses the site, or it can present a static view of crawled site pages.

Gazing into Web Search

We use eye-tracking technology to help us understand how people use Web-search interfaces. How do people scan search results? Does it depend on what they’re doing? Can we improve our interfaces based on this information?


Wikis and blogs have facilitated greatly the lightweight creation of collaborative documents. A wiki is a type of Web site that makes it easy for users to add, remove, or otherwise edit all content. Wikis, however, are primarily textual in nature. We propose a system that enables a pasteboard metaphor for collaboratively creating Web-based documents. Users easily can add, remove, or rearrange images or text blocks on a page. As on a wiki page, anyone, or only those with appropriate credentials, can edit a page, and new pages can be added and linked to a current page. Users can place text or images anywhere on the page. Since the VIKI maintains a strict model/view separation, both manual and data-driven views can be represented. We will demonstrate the VIKI system and show the kinds of projects that can be built with it, from photo scrapbooks to note taking and to-do lists.

Software, Theory, and Security

Backstory: Find the Story Behind the Code

What were they thinking when they wrote this code? This is a common question and one difficult to answer, because relevant information can be scattered across bugs, e-mails, check-in messages, and elsewhere. We have built a multisearch investigative UI to help you dig for answers.

The Yogi Project

Yogi is a research project on software-property checking from the Rigorous Software Engineering group at Microsoft Research India. Our goal is to build a scalable software-property checker by directly analyzing program binaries. This involves a new algorithm for property checking that systematically combines static analysis with testing. We will show that this synergy of static analysis and testing can be harnessed for effectively finding bugs in system software.

Asirra: Securing Web Services with Cute Kittens

Can you tell a dog from a cat? Perhaps you’ve seen Web services that require you to solve a small challenge to prove you are not an automated script. This is known as a CAPTCHA, and it commonly involves looking at distorted text and typing it into a box. Since OCR software can identify distorted characters quite well, CAPTCHAs add visual clutter to their images, but this also makes the challenges harder and more annoying for humans. We are developing a system, called Asirra, that challenges users to classify images of dogs and cats, a task difficult for software but easy and even fun for humans. Because software is of little help to us, Asirra needs a large source of classified pet images. We obtain them through an alliance with, a nationwide pet-adoption site, which benefits because every challenge implicitly advertises adoptable pets.

Competitive Online Algorithms and Ad Auctions

There are many situations in which one must make decisions before all the input data has arrived. Algorithms that work in this setting are called online algorithms. For example, how do you choose which ads to display for a given search query when the advertisers have budgets and the number of queries is not known in advance? How do you schedule continually arriving jobs to processors? How do you even evaluate the performance of an online algorithm? We describe online algorithms for auctions and scheduling, and evaluate their performance using the notion of the competitive ratio, which compares an online algorithm with an algorithm that knows all the inputs in advance.
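As a small illustration of comparing an online algorithm with one that sees all inputs in advance, the Python sketch below runs a naive greedy ad allocator against a brute-force offline optimum on one tiny instance. The advertisers, queries, unit bids, and the greedy rule are all hypothetical, and this is not one of the algorithms described in the demo. The competitive ratio proper is the worst case of this ratio over all inputs; the code computes it for a single input only.

```python
from itertools import product


def greedy_online(queries, bidders, budgets):
    """Serve each query on arrival with the first interested advertiser
    that still has budget. Every ad shown earns one unit of revenue."""
    left = dict(budgets)
    revenue = 0
    for q in queries:
        for adv in bidders[q]:
            if left[adv] > 0:
                left[adv] -= 1
                revenue += 1
                break
    return revenue


def offline_optimum(queries, bidders, budgets):
    """Best revenue when the whole query sequence is known in advance.
    Brute force over all assignments, so only usable on tiny inputs."""
    best = 0
    choices = [list(bidders[q]) + [None] for q in queries]
    for assignment in product(*choices):
        used = {}
        for adv in assignment:
            if adv is not None:
                used[adv] = used.get(adv, 0) + 1
        if all(count <= budgets[adv] for adv, count in used.items()):
            best = max(best, sum(used.values()))
    return best


# Hypothetical instance: A bids on both queries, B only on "shoes".
bidders = {"shoes": ["A", "B"], "boots": ["A"]}
budgets = {"A": 1, "B": 1}
queries = ["shoes", "boots"]

online = greedy_online(queries, bidders, budgets)     # spends A on "shoes"
optimal = offline_optimum(queries, bidders, budgets)  # B on "shoes", A on "boots"
ratio = online / optimal
```

On this input the greedy rule spends advertiser A's budget on the first query and then cannot serve the second, so it earns half of what the offline optimum does.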

From Physics and Geometry to Algorithms

Simple—and not so simple—techniques and results from physics and geometry are incredibly useful in the analysis of applied problems in computer science. One example is sphere packing, a fundamental problem in geometry that has applications to communication over noisy channels. Sphere packings arise in nature as materials minimize their energy. Other examples are algorithms for quickly finding good matchings using Coulombic forces, or fair allocations using gravity.

Pex: Dynamic Analysis and Test Generation for .NET

Pex enables a new development experience in Visual Studio® Team System, taking test-driven development to the next level. Pex analyzes .NET applications. From a parameterized unit test, it automatically produces traditional unit-test cases with high code coverage. Moreover, when a generated test fails, Pex often can suggest a bug fix. Pex performs a systematic program analysis, recording detailed execution traces of existing test cases. Pex learns program behavior from the execution traces, and a constraint solver produces new test cases with different behavior. The result is a minimal test suite with maximal code coverage. When a test fails, Pex uses detailed data-flow information to determine the root cause and a potential bug fix.

MemRay: Viewing Memory-Reference Locality

The performance tools we use are code-centric and produce summary information about the entire program execution. Consequently, performance problems, such as poor data locality or bad performance during a specific phase of program execution, often go undetected. Research prototypes of more sophisticated tools for investigating data locality and time-specific program behavior exist, but they are complex, limiting their utilization. We present MemRay, a memory-access-animation tool that makes understanding memory-reference behavior accessible to a much broader audience. MemRay displays a memory-access movie that can be viewed to identify memory bottlenecks, as well as locality and scalability problems. It pinpoints the code and data structures responsible and shows when during execution the problem arises. We show how MemRay can be used to understand and optimize memory performance.

Concurrent Programming: A New Approach

We’ll describe a new technique that enables programmers to create correctly synchronized, efficient, concurrent programs without having to write synchronization code. We’ll explain how this magic happens, supported by examples and an outline of an implementation.

Biometric Authentication via Fingerprint Hashing

We present a new technique for generating biometric fingerprint hashes, or summaries of information contained in human fingerprints. Our method calculates and aggregates various key-determined metrics over fingerprint images, producing short hash strings that cannot be used to reconstruct the source fingerprints without knowledge of the key. This can be considered a randomized form of the Radon transform in which a custom metric replaces the standard, line-based metric. Resistant to minor distortions and noise, the resulting fingerprint hashes are useful for secure biometric authentication, either augmenting or replacing traditional password hashes. As shown in our hands-on demo, this approach can help increase the security and usability of Web services and other client-server systems.

Systems, Networking, and Databases

Example-Driven Design of Record Matching Queries

Matching records from two relations is an important component of data-cleaning processes and of Extract, Transform, and Load (ETL). The goal is to identify pairs of records, which may differ because of representational differences and errors, that represent the same real-world entity. Searching through the large space of possible queries, evaluating each, and finding the most accurate is difficult. In this demo, we will illustrate tools to develop a record-matching package in SQL Server™ Integration Services (SSIS). A user has to mark a set of example record pairs as matches or non-matches. We then suggest an accurate package, using a set of SSIS transforms, which can be reviewed and used as a foundation for further analysis.
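The example-driven idea can be sketched in a few lines of Python. This is not the SSIS tooling from the demo: the token-based Jaccard similarity, the single-threshold "query", and the sample records are assumptions chosen to keep the illustration small. The search space here is just the set of candidate thresholds, scored against the user's labeled pairs.

```python
def jaccard(a, b):
    """Token-set similarity between two record strings, from 0.0 to 1.0."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def best_threshold(examples):
    """Pick the similarity threshold that best reproduces the labels.

    examples: list of (record_a, record_b, is_match) marked by the user.
    Returns (threshold, accuracy on the examples).
    """
    scored = [(jaccard(a, b), is_match) for a, b, is_match in examples]
    best = (None, -1.0)
    # Only observed similarity values can change the predictions.
    for t in sorted({s for s, _ in scored}):
        correct = sum((s >= t) == is_match for s, is_match in scored)
        accuracy = correct / len(scored)
        if accuracy > best[1]:
            best = (t, accuracy)
    return best


# Hypothetical labeled pairs a user might mark as match / non-match.
examples = [
    ("Microsoft Corp Redmond WA", "microsoft corp redmond", True),
    ("Intl Business Machines", "International Business Machines", True),
    ("Microsoft Corp", "Micro Focus Corp", False),
    ("Acme Tools", "Zenith Paper", False),
]
threshold, accuracy = best_threshold(examples)
```

A real package would search over many similarity functions and column combinations; the point of the sketch is only that labeled examples turn the design problem into a search scored by accuracy.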

Efficient Point-to-Point Shortest Paths

A lot of progress recently has been made in point-to-point shortest-path algorithms. In particular, highly practical algorithms have been developed for computing driving directions. We demonstrate our recent codes for this application. These codes work well on servers, desktops, and handheld devices.
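For readers unfamiliar with the problem, the following Python sketch shows the textbook baseline: Dijkstra's algorithm stopped as soon as the destination is settled. It is not the code demonstrated at TechFest, which the abstract does not describe; the road network and names below are hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm with early exit once target is settled.

    graph maps a node to a list of (neighbor, non-negative weight) pairs.
    Returns (distance, path), or (None, []) if target is unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale queue entry
        settled.add(u)
        if u == target:
            break  # point-to-point query: no need to explore further
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in settled:
        return None, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical road network with travel times in minutes.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
minutes, route = shortest_path(roads, "A", "D")
```

Practical driving-direction systems typically add preprocessing on top of a search like this so that far fewer nodes are visited per query.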

Scaling P2P Games in Low-Bandwidth Environments

First-person shooter games such as Halo and Quake are limited to a few players, but we wish to scale such fast-paced games to massive battles with many players. Since this requires more outbound bandwidth than any home machine has, we partition the game state among all machines. But even so, at extremely large scales, there is still not enough bandwidth for each machine to update all others in every frame. We thus send updates infrequently and use guidable AI to emulate remote avatars’ behavior between updates. We estimate which remote avatars are most interesting to the local player and ensure a higher update rate for them. And we ensure consistent interaction when necessary, such as when one player damages another. A user study shows that these techniques make Quake III over low-bandwidth connections nearly as much fun as on a LAN. Come play the game and see for yourself!
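The two ideas of interest-based update rates and filling the gaps between updates can be sketched roughly in Python. Everything here is an assumption for illustration: the distance-based interest score, the proportional split of the update budget, and the use of plain dead reckoning (straight-line extrapolation) as a much simpler stand-in for the guidable AI the abstract mentions.

```python
import math


def interest(local_pos, avatar_pos):
    """Hypothetical interest score: nearer avatars matter more."""
    return 1.0 / (1.0 + math.dist(local_pos, avatar_pos))


def allocate_updates(local_pos, avatars, budget_per_sec, min_rate=1.0):
    """Give every remote avatar min_rate updates per second, then share
    the remaining budget in proportion to interest."""
    scores = {name: interest(local_pos, pos) for name, pos in avatars.items()}
    spare = budget_per_sec - min_rate * len(avatars)
    if spare < 0:
        raise ValueError("budget too small for the minimum rate")
    total = sum(scores.values())
    return {name: min_rate + spare * s / total for name, s in scores.items()}


def extrapolate(pos, velocity, dt):
    """Dead reckoning: guess a position dt seconds after the last update."""
    return tuple(p + v * dt for p, v in zip(pos, velocity))


# Hypothetical scene: the local player at the origin and two remote avatars.
avatars = {"near": (1.0, 0.0), "far": (9.0, 0.0)}
rates = allocate_updates((0.0, 0.0), avatars, budget_per_sec=12.0)
guess = extrapolate((0.0, 0.0), (2.0, -1.0), 0.5)
```

The nearby avatar receives most of the budget, while the distant one is updated rarely and is rendered from extrapolated positions in between.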

Automatically Finding Network and Server Problems

Does your browser sometimes temporarily hang while loading a Web page, even from intranet sites? You are not alone. We have observed that 10 percent of requests take 10 times longer than expected. We will demo the Analysis of Network Dependencies system, which automatically finds the causes of these hangs, whether it be an overloaded Web or SQL server, a delay in the Domain Name System, a congested network link, or a scheduling delay in the client operating system. The system uses software running on the clients to observe the behavior of the IT infrastructure and to determine dependencies among components. It then uses tomography and Bayesian inference to find problems. Our goal is for the system’s output to support helpdesk staff answering user issues, IT managers planning for capacity upgrades, architects modeling system deployment, and, potentially, data-center operations staff.

UI, Graphics and Media

Boku: Lightweight Programming for Kids

Boku uses a novel, high-level programming paradigm within a 3-D gaming world on the Xbox 360® to introduce children to creative use of the computer. Boku’s programming model is extremely simple as it does not use a textual language or wiring diagrams. Kids use simple behavior cards to enable a small virtual robot to navigate its world and achieve specific tasks. The goal is to provide a gentle introduction to some of the foundational elements of creative programming to children who may not yet be ready for the complexity of classical computer languages. The user is exposed to behavior arbitration, generality, representation of an abstract state, real-time experimentation and feedback, simulation, sensors, physics, and message passing. The programming environment is integrated in an attractive gaming world and controlled entirely via an Xbox 360 game controller.

Mix: Search-Based Authoring

Search, aggregators, and RSS enable people to draw on many dynamic streams of information from their desktop. People are getting used to reading dynamic content, but few tools exist today to author and share it. Mix enables people to build and share dynamic documents with rich structure and visualizations on top of first-class query objects that draw from desktop, intranet, and Web-based search. Mix also explores new user interfaces with regard to privacy and security. Sharing a query presents challenges, because the recipient of the query may not have the same access permissions as the publisher. Addressing this involves new notions of publishing and privacy control in the user interface.

Linking the World Through Pictures

Have you ever wanted to know more about a DVD, a painting, or a rock-band poster? Take a picture of it. Our prototype connects the world through pictures, providing relevant Web pages and comments from a community of users. Discover if the DVD got good reviews or if you like the music of the rock band. This works as a smartphone application on camera phones: capturing a picture, sending it to our servers, and retrieving relevant information. Content for our system is supplied by the community. Users can add new images, as well as add links and comments to existing images. Using our Web site, users also can search using photos from any digital camera. Our technology is based on a new image-matching technique. Pictures of flat objects, such as signs, posters, and advertisements, can be matched reliably without the need for special bar codes.

Personal Audio Space

A Personal Audio Space is a semi-private, energy-efficient system for real-time communication. We recreate the headset experience without using a headset. Only the intended user can hear the system. Using multiple speakers, we focus the sound into a region around the user. To anyone outside of this personal audio space, the sound is inaudible. By focusing the sound, we can achieve any absolute sound level with less power than a conventional system.
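
One common way to focus sound with several speakers is delay-and-sum focusing, in which each speaker is delayed so that all wavefronts arrive at the focal point at the same time. The abstract does not say which method the project uses, so the sketch below is an assumption, as are the function name `focusing_delays` and the example geometry.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second in air at roughly 20 degrees Celsius

def focusing_delays(speakers, focus):
    """Return per-speaker delays in seconds so all signals reach focus together.

    speakers is a list of (x, y) positions in meters; focus is an (x, y) point.
    The farthest speaker plays immediately; nearer speakers wait.
    """
    distances = [math.dist(s, focus) for s in speakers]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]

# Hypothetical layout: two speakers 3 m apart, listener 4 m in front of the first.
delays = focusing_delays([(0.0, 0.0), (3.0, 0.0)], (0.0, 4.0))
```

At the focal point the delayed signals add in phase, while elsewhere they arrive misaligned and partly cancel, which is the basic reason a focused array can reach a given level at the listener with less total output.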

HDView: IE Plug-in for Viewing Very Large Images

New imaging modalities range from photo collections arranged in 3-D to super-high-resolution (gigapixel) images to 360-degree panoramic video. These modalities are revolutionizing the way that people view and interact with their photos. During TechFest, we will demonstrate a new viewer that can be embedded in any application or Web page. It merges traditional slide shows, super-high-resolution panoramas, high-dynamic-range imagery, and 360-degree animations to create an incredibly rich photo-viewing and -browsing experience. A version with a subset of these features will be made available both internally and externally. We also will demonstrate a prototype authoring tool that generates HDView content.

Digital Effects for Internet Video Clips

We will present a set of offline video-editing tools that make videos more fun. Existing video-editing tools provide filters, such as de-noising and adjustment of color and contrast, or transitions, such as fade in and out. These tools are useful, but they provide only slight improvement to videos. We will show fun video-editing tools that can improve a user’s video experiences significantly. Our tools perform three operations on a video:

  • Add—adding objects such as 3-D synthetic objects and video hyperlinks into a video.
  • Separate—separating a video’s foreground from its background, to achieve cut and paste of video objects.
  • Browse—browsing video in the form of montage, summarizing the video in a space-time manner.

We will demonstrate how our technologies make Internet video clips even more enjoyable.

Using Touch to Operate Stylus-Based Devices

Operating a personal digital assistant or another pen-based device with bare fingers is often faster than retrieving the stylus. I will present extensions for Windows Mobile that help users operate applications by touch, even when an application was designed to be stylus-based.

Improved Podcast Authoring with Speech Recognition

Creation of audio/video content, podcasts in particular, presents challenges. Editing long podcasts can be tedious. The author must precisely identify the boundaries of the material he wishes to delete, move, or manipulate. This is time-consuming, because it requires marking boundaries while listening to or watching the content and then checking or modifying those boundaries by repeating the process multiple times. Automatic speech recognition can help: it recognizes the words and aligns them with the podcast content. The author then can manipulate the raw audio content by manipulating words in a GUI. Words can be processed further to extract keywords or summaries automatically.
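
Once every recognized word carries start and end times, a text edit maps directly onto audio ranges. The sketch below illustrates that mapping for deletions; the tuple layout, the function name `cut_ranges`, and the sample alignment are assumptions for illustration, not the project's data format.

```python
def cut_ranges(aligned_words, delete_indices):
    """Turn deleted word indices into merged (start, end) audio ranges in seconds.

    aligned_words is a list of (word, start, end) tuples from a recognizer.
    Adjacent deleted words are merged into a single cut.
    """
    ranges = []
    last_index = None
    for i in sorted(set(delete_indices)):
        _, start, end = aligned_words[i]
        if ranges and last_index == i - 1:
            ranges[-1] = (ranges[-1][0], end)  # extend the previous cut
        else:
            ranges.append((start, end))
        last_index = i
    return ranges

# Hypothetical alignment of a short spoken phrase.
words = [
    ("welcome", 0.0, 0.5), ("um", 0.5, 0.8), ("uh", 0.8, 1.1),
    ("to", 1.1, 1.3), ("the", 1.3, 1.5), ("um", 1.5, 1.9), ("show", 1.9, 2.4),
]
print(cut_ranges(words, [1, 2, 5]))  # [(0.5, 1.1), (1.5, 1.9)]
```

Deleting the three filler words in the text view yields two cuts in the audio, with no need to listen for the boundaries.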

Dynamic Noise Reduction

Speech enhancement is used to improve the quality of recorded speech and to remove non-speech sounds from a recording. We’ve developed a new, simple, strong model for the structure of speech audio. We use this model to identify the user’s speech and to remove everything else. It is much better than conventional techniques at removing non-stationary noises such as restaurant and traffic noise. Speech enhancement is especially important in applications such as preventing environmental noise from leaking into a conference call, creating a professional-sounding podcast, and polishing recordings taken under less-than-ideal conditions.

Relaxed Internet Video Exploration and Discovery

The ultimate challenge for Internet video is to bring it into the living room. But the active, disruptive discovery modes of the lean-forward Internet used in today’s video portals, such as keyword search and hierarchical list browsing, do not translate well into the relaxed culture of the living room. Creating new technologies for realizing relaxed discovery modes is the subject of this project. We will show a system for exploring the immense collection of Internet video from the comfort of the living room. Speech/audio-based content analysis and document similarity are combined with collaborative filtering to organize, select, and recommend video content. Familiar TV metaphors such as channel zapping and headline bars are used to enable a low-interaction, more passive perusal of Internet video on the television.
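
As a rough illustration of combining document similarity with collaborative filtering, the sketch below blends two scores for each unseen video. The blend weight, the term-count transcripts, the rating scale, and the function names are all hypothetical; the abstract does not describe how the project combines the two signals.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(user, ratings, transcripts, weight=0.5):
    """Rank unseen videos by a blend of collaborative and content scores.

    ratings: user -> {video: rating}; transcripts: video -> {term: count}.
    """
    seen = ratings[user]
    scores = {}
    for video in transcripts:
        if video in seen:
            continue
        # Collaborative part: other users' ratings weighted by taste similarity.
        collab = sum(cosine(seen, other) * other.get(video, 0.0)
                     for name, other in ratings.items() if name != user)
        # Content part: transcript similarity to videos this user rated.
        content = sum(r * cosine(transcripts[v], transcripts[video])
                      for v, r in seen.items() if v in transcripts)
        scores[video] = weight * collab + (1 - weight) * content
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical viewers and speech-derived term counts.
ratings = {
    "ann": {"news1": 5, "cook1": 1},
    "bob": {"news1": 5, "news2": 4},
    "cat": {"cook1": 5, "cook2": 5},
}
transcripts = {
    "news1": {"election": 3, "vote": 2},
    "news2": {"election": 2, "senate": 1},
    "cook1": {"pasta": 3, "sauce": 1},
    "cook2": {"pasta": 2, "oven": 2},
}
```

A ranked list like this could feed a channel-style interface, where the viewer simply zaps to the next recommendation instead of typing a query.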



DynaVis: A visualization framework for next-generation Dynamics UX that is extensible and supports animated transitions, direct manipulation of data, and compositing.

FastDASH: A peripheral display that shows software development teams where their team members are in the code, what methods they are working on, and who may need help.

New Concepts for the Home

We will present nine new technologies aimed at enriching home life, under four themes: new messaging concepts, new mobile concepts, new image displays, and paper-digital concepts.

Scaling P2P Games in Low-Bandwidth Environments

First-person shooter games such as Halo and Quake are limited to a few players, but we wish to scale such fast-paced games to massive battles with many players. A user study shows that our techniques make Quake III over low-bandwidth connections nearly as much fun as on a LAN. Come play the game and see for yourself!

Surface-Computing Innovations

Surface computing uses sensing and display technology to imbue everyday surfaces with interaction. PlayAnywhere is a compact surface-computing system shown at TechFest last year.

Asirra: Securing Web Services with Cute Kittens

Can you tell a dog from a cat? Perhaps you’ve seen Web services that require you to solve a small challenge to prove you are not an automated script. This is known as a CAPTCHA. We are developing a system, called Asirra, that challenges users to classify images of dogs and cats, a task that is difficult for software but easy, even fun, for humans.
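
The flow of such an image-classification challenge can be sketched as follows. This is a minimal illustration, not Asirra's implementation: the function names, the image identifiers, and the number of images per challenge are assumptions.

```python
import random

def make_challenge(labeled_images, n, rng=random):
    """Pick n distinct images; labeled_images maps image id -> 'cat' or 'dog'."""
    return rng.sample(sorted(labeled_images), n)

def verify(labeled_images, challenge, answers):
    """Pass only if every image in the challenge is classified correctly."""
    return all(answers.get(img) == labeled_images[img] for img in challenge)

def random_guess_pass_rate(n):
    """Chance that uniform random guessing labels all n images correctly."""
    return 0.5 ** n

# Hypothetical labeled image database.
database = {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "dog"}
challenge = make_challenge(database, 3)
```

Because a bot that guesses at random must get every image right, its pass rate halves with each added image, so even a modest challenge size makes blind guessing impractical while remaining quick for a person.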

Boku: Lightweight Programming for Kids

Boku uses a novel, high-level programming paradigm within a 3-D gaming world on the Xbox 360 to introduce children to creative use of the computer. Boku’s programming model is extremely simple as it does not use a textual language or wiring diagrams. The programming environment is integrated in an attractive gaming world and controlled entirely via an Xbox 360 game controller.

HDView: An IE Plug-in for Sharing Images

We will demonstrate a new viewer that can be embedded in any application or Web page. It merges traditional slide shows, super-high-resolution panoramas, high-dynamic-range imagery, and 360-degree animations to create an incredibly rich photo-viewing and -browsing experience.