TechFest is an annual event that brings researchers from Microsoft Research’s labs around the world to Redmond to share their latest work with Microsoft product teams. Attendees experience some of the freshest, most innovative technologies emerging from Microsoft’s research efforts. The event provides a forum in which product teams and researchers can discuss the novel work occurring in the labs, thereby encouraging effective technology transfer into Microsoft products.
Emerging Markets and Research Partners
Technology for Rural Education and Healthcare
Rural environments in developing countries pose steep challenges in delivering effective education and healthcare. While not all of them can be solved with technology, we present a couple of examples in which technology plays a key role. In one series of experiments with education, we examine how a single PC in a classroom can best support a time-constrained, technology-naïve teacher, both with and without student interaction. In another, an electronic pillbox helps to verify that tuberculosis patients receive their medications and take them on the right schedule.
Tools and Services for Data-Intensive Research
We will show efforts in Microsoft Research to collaborate with external researchers to explore the application of new technologies, specifically Dryad and DryadLINQ, to big-data research problems in science. We also highlight our efforts to provide software and services to academics across the world, through the binary release of Dryad with associated programming-user documentation, as well as a Microsoft Research initiative to provide researchers with access to computational resources. Finally, we will discuss initiatives to provide both services to analyze data and safe access to sensitive PII data. The demo will include two lighthouse eScience applications: Large Synoptic Survey Telescope (LSST) image simulation and analysis of biological sequences. Researchers from the University of Washington will discuss the science behind the demos and explain how this represents a paradigm shift in the research. We will run live demos using a cluster in the Microsoft Research shared infrastructure, highlighting the ability to marshal the computational resources of hundreds of cores to analyze data using Dryad. Our booth will include a poster prepared by the University of Washington to provide context for both LSST and associated science and biology applications from Lee Hood’s lab at the Institute for Systems Biology, and a poster that illustrates the Dryad computing graph and how it enables a programmer to manage a distributed application over a Windows cluster.
Hardware, Devices and Mobile Computing
Back-of-Device Touch Input
Most mobile devices have small displays and require some form of human (touch) input. With much of the real estate in mobile devices consumed by the display, touch input can be a problem, either from image occlusion or from a lack of tactile feedback when trying to input text. We will demonstrate a device with touch input on the back and one with a clear tactile overlay directly on the display.
Low-Power Processors in the Data Center
This demonstration will show an experimental prototype to study the use of low-power processors in the data center (DC). These processors offer substantial fractions (33 percent to 50 percent) of the performance of the high-performance processors used in Microsoft’s data centers but consume a disproportionally smaller amount of power (5 percent to 10 percent). Power consumption accounts for as much as 30 percent of the total operating costs of a DC. Across Microsoft’s Internet properties, about 10 percent of DC CPU cycles perform useful work. Remaining systems run at near full power, because servers and Windows Server do not yet support low-power modes. To study the potential of low-power processors in the DC, the Data Center Futures team built a prototype containing 100 dual-core Atom processors. Half are attached to standard hard disks, and half are attached to low-power flash storage, to study the tradeoffs of this technology in the DC. The processors are connected through the Monsoon network. This prototype will be used to test new technologies, including flash, optical networks, and FPGA accelerators, and for a variety of experiments, such as powering down idle processors. We believe that data centers in the future will include many more low-power processors. This prototype is the first step in demonstrating their potential cost savings and in developing the algorithms and software to take advantage of these processors.
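The performance and power ranges quoted above imply a performance-per-watt advantage that can be worked out directly. The following is a minimal sketch of that arithmetic; the function name `perf_per_watt_gain` is hypothetical, and the calculation assumes the quoted fractions are relative to the same high-performance baseline processor.

```python
def perf_per_watt_gain(perf_fraction, power_fraction):
    """Performance per watt relative to the baseline processor (baseline = 1.0)."""
    if power_fraction <= 0:
        raise ValueError("power_fraction must be positive")
    return perf_fraction / power_fraction


# Corner cases of the ranges quoted in the text (assumed to share one baseline).
worst_case = perf_per_watt_gain(0.33, 0.10)  # least performance, most power
best_case = perf_per_watt_gain(0.50, 0.05)   # most performance, least power

print(f"{worst_case:.1f}x to {best_case:.1f}x the baseline performance per watt")
```

Under these assumptions, the low-power parts would deliver roughly three to ten times the work per unit of power, which is the motivation for the prototype.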
SecondLight: Bringing the UI into the Real World
SecondLight is a new surface-computing technology that can project images and detect gestures in mid-air above the display, in addition to supporting multitouch interactions on the surface. It works by using an electrically switchable liquid-crystal diffuser as the rear-projection display surface. This material is continually toggled between diffuse and clear states, so quickly that the switching is imperceptible. When it is diffuse, the system behaves like a regular surface computer, but when clear, it is possible to project into and image the area above the display surface. This enables magical new forms of interaction in which the UI is no longer bound to the display surface, but becomes part of the real world.
Search, Interaction and Collaboration
Code Name Viveri: A Platform for Search Incubation
We will demo an externally facing search engine that serves as a platform for incubation that will be released to the web this summer. Experimental search interfaces can be too unusual or jarring to trial directly on unprepared users of our primary search engine, but refinement and improvement rely on interaction with real users. Our technology aggregates content from multiple sites, presenting multiple user-interface elements and types of information for each query, enabling the exploration of multiple experimental ideas at once, as well as demonstrating the possibilities of the search engine as a platform. A search idea can be developed easily and deployed with little integration work. The Web-slice API facilitates rapid deployment of existing technology, while Silverlight concurrency provides scalability and low latency. An arbitration algorithm targets the most appropriate set of user interfaces and auxiliary information to display based on query features. We will present the interface and API, as well as a set of technologies shipping to the Web in the first release.
Color-Structured Image Search
We propose a novel method for refining image-search results by exploiting color spatial-relation information, called color-layout-sensitive image search. Given image-search results, this method enables users to specify a color layout of interest and then reorders the results by promoting the images whose color layouts more closely match it. Specifically, this method performs an offline process to extract color-layout features as metadata, with the advantages of low storage and cheap computational cost. For the online refinement of image-search results, we propose an efficient, effective color-layout similarity-evaluation scheme that measures the consistency between the specified color layout of interest and the color layouts of the retrieved images, with regard to the colors of interest and their spatial relationship. This evaluation scheme is fast, which makes the online refinement respond immediately. Moreover, a convenient, interactive interface enables users to specify the color layout of interest flexibly. Experimental results demonstrate the effectiveness and efficiency of the proposed approach. We are investigating color-structured image searches, but this technique also could be extended to other kinds of semantic structures.
Content Services for Minority Languages
Our demo has two parts:
- Searching scanned books without the need for OCR: OCRLess is a language-independent technology that enables search in printed documents for languages that lack OCR or whose OCR quality is poor. By combining image-based matching with text-based indexing, OCRLess overcomes high character-error rates common to languages with complex orthographic features, surpassing the search performance of traditional OCR-based systems. This is achieved by segmenting image documents into shapes (words, parts of words, characters, or parts of characters); clustering similar shapes; and indexing their unique IDs. Text queries then are drawn, segmented, and matched to the table of clusters to produce hits. The demo encompasses English, Arabic, Chinese, Hebrew, and hieroglyphics.
- Trans-Bulletization: We will present a tool that translates a document into English, then summarizes it into bullets. Machine translation is effective in conveying meaning but lacks style and fluency, degrading the user experience. By reducing the machine-translation output to bullets, the user’s expectations of fluency and style are diminished. This is presented in the context of cross-language English-Arabic news search. The system uses Microsoft Research phrase-based statistical machine translation, coupled with a custom Arabic word breaker. For sentence reduction, we use state-of-the-art natural-language-processing tools, including parsing, part-of-speech tagging, and summarization.
GeoLife 2.0: A Location-Based Social Network
The increasing availability of GPS-enabled devices is changing the way people interact with the Web and brings us a large number of GPS trajectories representing people’s location histories. Such real-world location histories imply, to some extent, users’ interests and intentions and enable us to understand people and locations. GeoLife 2.0 is a location-based social-networking service on Virtual Earth that enables people to build connections with each other using their location histories. With multiple users’ GPS trajectories, GeoLife helps us not only understand an individual and a location, but also explore the similarity between users and the correlation among locations. By mining the similarity between people’s location histories, this system can help a user automatically discover potential friends in a community who might share similar interests. Thus, the user can conveniently deliver invitations to persons in the community and hence sponsor, with minimal effort, a social activity such as hiking, cycling, or traveling. From these potential friends’ past experiences, a trustworthy resource, the user is more likely to discover places that might match the user’s tastes. We have had 112 people using this system over a period of a year. We have collected more than 7 million GPS points, and the total distance of the data set surpasses 170,000 kilometers.
Helping Writers Find the Right Words
Writers often need help in choosing words. They might be seeking to introduce variety into their prose or to avoid an awkward or inappropriate phrase. Often, they will consult an online thesaurus. In other instances, as when writing technical documents, writers might need to use terminology aligned with organizational standards and refer to a style manual, possibly stored on a corporate intranet. Conventional thesauri and terminology lists, though, are static and usually quite unhelpful with regard to usage in context. We will demonstrate a tool that provides writers with inline, contextual thesaurus help and offers a potential path to a new generation of online writing-assistance applications. We combine a paraphrase model, derived from aligned translation corpora and other corpora-based word-similarity data, with a large language model to provide suggested rephrasings that might be appropriate in the writer’s intended context. Optional Web-search functionality provides further examples of real-world use. By swapping in, or combining in, new, domain-specific language models, it also becomes possible to tailor editing assistance to specific domains or corporate clients. This application can help both native speakers of English and those for whom English is a second or foreign language.
Image-Centric Advertisement Platform
In existing online-advertisement platforms, the relevance between advertisers and users is decided largely by advanced keyword matching. Typically, in a pay-per-click model, advertisers specify the words that should trigger their ads and the maximum amount they are willing to pay per click. When a user enters a search query, browses a Web page, or interacts with text, the advertisement platform will select and show relevant ads based on the text content in the query or the page. Though other context, such as location, time, and user profile, can be taken into consideration, text understanding remains the main technology. We will present an image-centric advertisement platform in which advertisers bid on images instead of keywords. For example, a toy seller could bid on the image of a related movie poster, while a restaurant could bid on the image of a cooking-magazine cover. Users would receive ads based on the content of images they recently browsed or used. Components of this platform include an advertisement editorial tool and image-content-understanding, image-matching, and user-understanding modules. This platform is suitable for application scenarios in which images are the main input or consumed content, such as Multimedia Messaging Service or content-based image retrieval.
Opinion data are almost everywhere. They exist in the form of user reviews on shopping and opinion sites, as posts on forums and blogs, and in direct user feedback. At the same time, many users want to use opinion data to help them conduct product research, understand the pains and gains of others, or improve products and services. But the current search engine is not effective in satisfying users’ opinion-search needs; search results are not well organized, and the snippets are not descriptive enough to indicate users’ opinions. We will propose a new vertical search engine, Opinion Search, to provide a different search experience for users. The system will return documents holding opinions about the input query. The documents indexed by the system are collected from various sources, including user reviews, blogs, and news. The search results are ranked according to both their relevance to a particular query and the strength of the opinion about the query. The snippet text gives a brief summary of user opinion from each result document. For some queries, the system even gives a structured summary of the opinions contained in all the result documents. The system also helps users filter results according to opinion polarity, as in positive and negative opinions.
Renlifang: Web-Scale Entity Summarization
The need for collecting and understanding Web information about a real-world entity currently is fulfilled manually through search engines. But the information about a single entity might appear in thousands of Web pages. Even if a search engine could find all the relevant Web pages about an entity, the user would need to sift through all the pages to get a complete view of the entity. We will show Renlifang, a Web-scale entity-summarization system that efficiently generates summaries of Web entities from billions of crawled Web pages. Specifically, Renlifang automatically generates:
- A biography page for a person.
- A social-network graph for a person.
- A shortest-relationship path between two people.
- All titles of a person that are found on the Web.
- All the structured data we have in our local database about a person.
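One of the outputs listed above, the shortest-relationship path between two people, can be illustrated with a breadth-first search over a social-network graph. This is only a sketch: the abstract does not say how Renlifang computes the path, and the function name, the in-memory adjacency-list format, and the sample names are all assumptions made for illustration.

```python
from collections import deque


def shortest_relationship_path(graph, start, goal):
    """Breadth-first search; returns the list of people from start to goal, or None."""
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = person
            if neighbor == goal:
                # Walk the parent links back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical relationship graph (undirected, stored as adjacency lists).
graph = {
    "Ann": ["Bob", "Cho"],
    "Bob": ["Ann", "Dev"],
    "Cho": ["Ann", "Dev"],
    "Dev": ["Bob", "Cho", "Eve"],
    "Eve": ["Dev"],
}
path = shortest_relationship_path(graph, "Ann", "Eve")
print(" -> ".join(path))
```

Breadth-first search finds a path with the fewest hops in an unweighted graph; a Web-scale system would presumably need an indexed or distributed variant rather than a single in-memory dictionary.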
This project aims to enable a new generation of interactive systems that can reason about their surroundings and embed interaction deeply into the natural flow of everyday tasks, activities, and collaborations. As an initial sample challenge in this space, we have developed and will demonstrate a situated conversational agent that can act as a Microsoft front-desk receptionist, performing tasks such as making shuttle reservations and registering visitors. The system integrates a large number of artificial-intelligence technologies, such as speech recognition, person and group detection and tracking, intention recognition, and attention and engagement modeling into a conversational framework that enables it to engage in mixed-initiative, natural-language interaction with one or multiple participants.
Today, it’s easy to share a Web page or a blog post, because items on the Web have unique IDs: URLs. We don’t have this on the desktop. Social Desktop adds URLs to the files and folders on your desktop, letting you share anything on your computer with anyone who can click on a URL. Persons receiving links can either access the items via e-mail or comment, tag, and search across all shared items via our Web page. We implement this by using a .NET service bus. It is possible to create a universal namespace for every device and data source for a user, providing a universally addressable namespace with:
- Universal access. The same URL works from any device in the world.
- Universal sharing.
- Universal tagging and commenting.
- Freedom from legacy paths. Data isn’t limited by file-system concepts.
You can have a URL drill into a subportion of a document or a PowerPoint deck, or data could come from a Web service or a database. Social Desktop is a local service that maps the user’s local data into a .NET service bus service, enabling local data to be accessible through firewalls. Social Desktop also provides a Web-service view over the same data, with inherent RSS event streams for any container. New data sources can be mapped into the URL hierarchy, enabling a distributed view to be built. There are simple sharing paradigms that enable URLs to be shared temporarily or permanently.
Social Views of E-Mail
Incoming feeds of information become increasingly overwhelming as more people use social networks, e-mail, instant messaging, and other forms of communication. Our work automatically analyzes a user’s communication and organizes the feed into groups. Depending on when and with whom a user is communicating, a different stream of information can be presented as contextually appropriate. The goal of automatic group discovery is not only to detect the initial grouping, but also to discover slow changes to groups over time, thus freeing the user from manual group management. We also will show different ways to visualize an incoming stream of information, from a condensed overview for a small screen to a lush, immersive experience. Peripheral information can be squeezed into less space by employing time-sharing presentation: a ticker. Users can zoom into additional pages to get older messages in the thread or context. A search for information also can be done via a timeline-based presentation.
User-Interactions Advertising Platform
One ad, a world of interactions: we will introduce a new, dynamic, end-to-end user experience for online advertising. This new experience promotes user-centric interactions in the advertising ecosystem by engaging users with an exciting exploration experience, accompanied by comprehensive information encapsulated within the ad and readily accessible; empowering advertisers to manage the advertising experience smoothly and consistently into an extension of their Web sites; helping publishers keep more users on their Web pages; and providing agencies with tools to deliver additional value to advertisers. User-interaction zoom ads suggest a new way of content navigation, powered by Silverlight and Deep Zoom, to optimize the use of ad real estate. We also are introducing an intuitive, simple design tool for creating the ads. This tool enables designers to take advantage of built-in zoom functionality to embed an unlimited amount of content into a standard-sized ad. They dynamically can link different regions within an ad and configure menu-driven navigation. With zoom ads, getting drill-down information is only a mouse-wheel away.
Software, Theory and Security
Algorithms and Cryptography
The last 10 years have seen a marked increase in the use of sophisticated algorithms, as well as in the scope and breadth of the theory underlying them. This work has been spurred both by new problems arising in areas such as Web search and algorithms and by classic problems fruitfully revisited. We’ll present new algorithms for selecting which ads will be most profitable to display, for making use of survey-propagation techniques from statistical physics to cluster data according to similarity, and for building cryptographic systems that can withstand the disclosure of partial information about their secret keys.
Closed-Loop Control Systems for the Data Center
This demonstration shows a closed-loop, adaptive control system for Live Search that aims to minimize energy usage while guaranteeing a service-level agreement (SLA) for search response time. Power is a central issue in the design and management of data centers. Power consumption accounts for as much as 30 percent of a data center’s operating costs. Idle machines consume a good fraction of this power: only about 7.5 percent of CPU cycles executed in Microsoft’s data centers perform useful work. Minimizing power usage during periods of low workload could save Microsoft money. Applications deployed in the data centers, however, require a strong guarantee of their performance. For example, Live Search requires the response time of at least 99 percent of queries to be less than 300 milliseconds. A key challenge is to minimize power usage while meeting the desired SLAs. To address this challenge, we present an energy-aware prototype built using 100 low-power Atom processors that execute a Live Search benchmark with a scaled-down, 1-GB search index per node. To meet the 300-millisecond response time, we apply machine-learning techniques that model performance as a function of workload and that set power states (idle, sleep, hibernate) across nodes to save energy. Because transitions between different power states incur a latency of 15-30 seconds, our prototype provides a predictor module that switches processors to different power states in advance of workload transitions.
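The SLA stated above (at least 99 percent of queries answered in under 300 milliseconds) is the condition any power-saving decision has to preserve. The following minimal sketch checks that condition on a batch of measured latencies; the function name `meets_sla` and the sample data are hypothetical, and the sketch does not attempt to reproduce the prototype's machine-learning models or predictor module.

```python
def meets_sla(latencies_ms, limit_ms=300.0, required_fraction=0.99):
    """True if at least required_fraction of queries finished in under limit_ms."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    fast = sum(1 for t in latencies_ms if t < limit_ms)
    return fast / len(latencies_ms) >= required_fraction


# Hypothetical measurements: 995 fast queries and 5 slow ones out of 1,000.
healthy = [120.0] * 995 + [450.0] * 5
# Hypothetical measurements after powering down too many nodes: 20 slow queries.
overloaded = [120.0] * 980 + [450.0] * 20

print(meets_sla(healthy))     # 99.5 percent under the limit
print(meets_sla(overloaded))  # 98.0 percent under the limit
```

A controller could use a check like this as its feedback signal, waking additional nodes whenever the measured fraction approaches the threshold.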
Concurrency Analysis Platform and Tools
Concurrency bugs are difficult to find and hard to reproduce. We will demo the Concurrency Analysis Platform (CAP), which provides predictable control over thread interleavings. When a concurrency bug is found, CAP can drive the program along the erroneous interleaving, providing an instantaneous repro. CAP enables multiple concurrency-analysis tools that will be useful for developers and testers. The demo will include CHESS, a systematic unit-testing tool for concurrency; Cuzz, a concurrency fuzzer for obtaining more coverage from existing stress tests; FeatherLite, a lightweight data-race detector; and Sober, a tool for finding memory-model errors.
Gale-Berlekamp Light-Bulb Game
This demo uses a replica of the Gale-Berlekamp (GB) switching game to demonstrate a new, linear-time approximation scheme for approximating GB-game problems to within a given precision. The physical model is a 10-by-10 array of light bulbs on which the adversary chooses an arbitrary subset of the light bulbs to be initially on. Next to every row and every column of light bulbs is a switch, which can be used to invert the state of every light bulb in that row or column. The protagonist’s task is to minimize the number of lit light bulbs by flipping switches.
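The rules of the game can be made concrete with a small solver. The sketch below is not the linear-time approximation scheme shown in the demo; it is a plain exhaustive search over row switches, which is feasible for a 10-by-10 board because there are only 1,024 row-switch combinations. The function name `min_lit_bulbs` and the list-of-lists board format are assumptions made for illustration.

```python
def min_lit_bulbs(board):
    """Exact minimum number of lit bulbs reachable with row and column switches.

    board is a list of equal-length rows containing 0 (off) or 1 (on).
    """
    rows, cols = len(board), len(board[0])
    # Encode each column as an integer whose bit i is the bulb in row i.
    col_masks = [
        sum(board[i][j] << i for i in range(rows)) for j in range(cols)
    ]
    best = rows * cols
    for row_flips in range(1 << rows):  # every subset of row switches
        total = 0
        for mask in col_masks:
            lit = bin(mask ^ row_flips).count("1")
            # Use the column switch only if that leaves fewer bulbs lit.
            total += min(lit, rows - lit)
        best = min(best, total)
    return best


print(min_lit_bulbs([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))  # every row can be flipped off
print(min_lit_bulbs([[1, 0], [0, 0]]))  # a lone bulb in a 2-by-2 board cannot be cleared
```

Once the row switches are fixed, each column can be decided independently, which is why the search only enumerates row subsets. The running time still grows exponentially with the number of rows, which is the gap an approximation scheme is meant to close.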
Lightweight Software Transactions for Games
To realize the performance potential of multiple cores, software developers must architect their programs for concurrency. Unfortunately, for many applications, threads and locks are difficult to use efficiently and correctly. Lightweight software transactions provided by Object-based Runtime for Concurrent Systems (ORCS) are an attractive alternative for game programmers who want to exploit the performance potential of multicores without devising complicated locking protocols. With ORCS, programmers specify the tasks that execute in each frame and declare shared data using thread-safe template wrappers. ORCS then automatically executes the tasks concurrently and in isolation by replicating shared data and merging conflicting updates at the end of each frame. We will demonstrate how ORCS can achieve competitive multicore performance without burdening the programmer.
Social Media and Learning Theory
Networks and connectivity play a fundamental role in online behavior and in human society. Who is, or should be, connected to whom? What can we infer from the structure of social networks? What opportunities or pitfalls do different types of networks present, for example, for advertising, for recruitment, or for building buzz? To cope with the behavior of the independent agents who constitute a social network, we use game theory to model them; to let computers adapt to constantly changing networks, we use machine-learning theory. We’ll demonstrate how to measure the value of a game (how favorable it is for each player), how to analyze similarity functions for use in learning and clustering, and how to design a recommendation system that collects and weights opinions across a social network.
Solver Foundation: Mathematical Optimization
Microsoft Solver Foundation is a new framework and managed-code runtime for mathematical programming, modeling, and optimization, with a focused goal of helping businesses make near-optimal, strategic decisions. The possible applications cover a vast range: real-time supply-chain optimization, data-center energy-profile management, online-advertising profit maximization, logistics of large conference scheduling, and risk analysis of investment portfolios. There are also direct applications to graphics and machine learning for which Solver Foundation acts as a runtime for such systems. All of these decisions are encoded through a declarative model specification, one that focuses the modeler and developer on stating the what rather than the how of the business decision to be made. This rapidly accelerates solution engineering and increases the degree of what-if analysis possible. Solver Foundation has several specific solvers that are good for one or more domain and modeling situations, such as linear programming (simplex and interior-point-methods-based), SAT solving, CSP, and quadratic programming. Eventually, solvers will include constrained, convex non-linear programming.
Specification Inference for Security
The last decade has seen a proliferation of static and dynamic analysis tools for detecting security vulnerabilities in programs. Much of this interest is because of the large increase in the number of vulnerabilities, such as cross-site scripting and SQL injections. Tools checking for these vulnerabilities require a specification to operate, and the effectiveness of these tools is only as good as their input specifications. Unfortunately, writing a comprehensive specification is a major challenge. We will demo a new algorithm that automatically infers explicit information-flow security specifications from program code. Beginning with a data-propagation graph, which represents the flow of information in the program, and a partial specification of methods, our algorithm aims to complete the specification using probabilistic inference. We experimentally validate the approach by applying it to 10 large, business-critical Web applications analyzed with CAT.NET, a state-of-the-art static analysis tool for .NET. We find a total of 167 new confirmed specifications, which result in a total of 302 additional vulnerabilities across these 10 benchmarks.
Systems, Networking and Databases
Predicting Performance Problems in the Data Center
The HiLighter analytics engine provides the ability to understand and manage the health, status, and behavior of a service or system. The approach is based on statistical machine learning, and it automatically extracts from measurements the relevant metrics for modeling performance. It also finds early indicators of performance crises for predicting performance problems. We have evaluated HiLighter on real data from November 2006 to May 2008 from the Dublin data center running Extended Hosted Services. The evaluation consisted of HiLighter making a prediction every 15 minutes and updating its models online, starting with data from one performance problem. HiLighter correctly predicted 50 of 64 performance crises, with a lead time of almost an hour, and had less than one false alarm per day.
Profiling the Performance of Distributed Systems
We will show a modular application designed for analyzing and troubleshooting the performance of large clusters running data-center services. It consists of four modules: 1) distributed-log collection and extraction, 2) a database storing the extracted data, 3) an interactive visualization tool for exploring the data, and 4) a plug-in interface and a set of sample plug-ins that enable users to implement data-analysis tools.
UI, Graphics and Media
Audio Spatialization and AEC for Teleconferencing
In multiparty conferencing, one hears voices of more than one remote participant. Current commercial systems mix them into a single mono audio stream, and thus, all voices of remote participants will appear to come from the same location. This is in sharp contrast to what happens in real life, in which each voice has a distinct location. We will demonstrate technologies to enhance the user experience in multiparty conferencing by using highly realistic, immersive spatial audio for both loudspeakers and headphones. This is proven to improve the conferencing experience significantly, because each participant easily can differentiate the current remote talker and focus on the content being discussed. Multichannel Acoustical Echo Cancellation (AEC) is a crucial component to enable a quality audio experience during conferencing, especially without a headset. We also will demonstrate the AEC capability in real time so that the remote side will hear only the near-end participants’ speech without their own echoes. Performance comparison among multiple AEC algorithms will be provided.
Commute UX: Dialog System for In-Car Infotainment
Following the deployment of Blue&Me for Fiat and Sync for Ford, in-car dialog systems are morphing from cool gadgets that amaze people and sell more cars to integral parts of in-car infotainment. This raises the bar for the functionality, usability, and reliability of these systems. The presented in-car infotainment system contains novel technologies from Microsoft Research that enable natural-language input; expose a multimodal user interface including speech, a GUI, touch, and buttons; and use state-of-the-art sound-capture and processing technologies for improved speech recognition and sound quality.
Core Tools for Augmented Reality
This demo will explain the development of a new kind of image feature that can be used for a variety of applications, ranging from image stitching to augmented reality. The features already are finding their way into Microsoft products and are being considered for many new applications. We will demo a fun example application: a treasure hunt. By using the posters and other graphics on display during TechFest, we will let users borrow a device that augments the world with virtual clues to find hidden treasure.
Digital Past to Digital Presence
We will show concepts related to digital archiving and communication:
- The first set will include Family Archive, an interactive, multitouch device for the home designed to help capture, manage, integrate, display, and store both digital and physical memorabilia. We also will present Time Card, a much simpler device for the home, designed to display a digital record of a person’s activities, one that can be navigated and sorted in various ways that highlight different renderings of that person’s past.
- The second set will show devices that can enable people to communicate in new, expressive ways. Wayve represents the culmination of a variety of home messaging concepts. It is an appliance that, based on a field trial in the United Kingdom, has shown itself able to connect people playfully and creatively, representing a new genre of communication for the home. A second concept is CellFrame, a small, standalone, wireless display and communication device, designed to bring people who have remained outside social networking into the experience and the benefits that it can provide.
Interactions with an Omni-Directional Projector
We will present a combination of a standard projector with a wide-angle lens capable of projecting data onto the entire, 360-degree surrounding environment from a single position. This setup provides an immersive experience similar to existing, much more expensive planetarium projectors or Virtual Reality CAVE projectors, on which all of the surfaces in the room can receive projections. We have added an infrared camera that shares the wide-angle lens with the projector and is capable of detecting a user's hands and tracking freehand gestures in mid-air, without additional gloves or tracking objects. This demo integrates several Microsoft technologies into a stunning presentation: We will offer a hemispherical dome in which users can interact with data from Virtual Earth and WorldWide Telescope.
Mobile Content-Casting and Social Exchange
We will offer two insights into the problems posed by mobile ad hoc networks and the opportunities they provide:
- The first is a platform for seamless content dissemination across a mesh of mobile devices. The platform leverages user encounters for device-to-device data transfer, as well as occasional cloud connectivity. When devices meet, content is exchanged automatically. The goal is to optimize across multiple objectives: getting content the user values the most; best serving communitywide interests by enabling efficient, multihop content dissemination; and maximizing battery life. The content might span any type of social-networking-application content or data, such as Facebook feeds, photos, blogging, and news. The system even can be useful for dissemination of geo-local content, advertisements, alerts, and software updates.
- The second demonstrates three concepts that show how mobile social-networking applications can support and deepen face-to-face interactions. Digital Gifts will demonstrate how automatic Bluetooth device recognition, linked to Windows Live IDs, can enable users to undertake semi-automatic exchange of Windows Live photos with co-present, rather than remote, parties. Tangible Mesh will show how access rights to remote files can be exchanged via infrared or Bluetooth on mobiles registered with Live. Pictureplace will show how GPS data can be used to create indexes of and access rights to photos taken in the same locale but by different devices.
Real-Time Stitching of Mobile-Generated Videos
Mobile phones are becoming increasingly ubiquitous, and a large percentage of these devices have built-in cameras. This makes mobile phones great for capturing and sharing multimedia content. It is likely that in public and private settings, many users will be recording the same event simultaneously using their mobile phones' cameras, perhaps from different viewpoints. Typically, any single video stream from one of these mobile phones would capture a rather small field of view (FOV) to maintain high resolution. We aim to enable multiple mobile phones to collaborate in recording an event and then, in real time or near real time, construct one higher-resolution video from the resulting video streams. We will show a real-time video-stitching system from mobile-phone cameras offering a wide-FOV experience with high resolution. Potential applications could be in events such as emergencies, where people on-site offer a real-time feed for rescuers until they reach the place; virtual attendance of family gatherings by members living abroad; citizen journalism; and providing multimedia-to-multimedia links for multimedia-sharing sites.
Recognizing Characters Written in the Air
This is a proof-of-concept demo for recognizing characters written in the air. To input characters into devices such as Xbox and televisions equipped with a low-cost Webcam but no keyboard or mouse, one can just face the camera and write the intended character in the air by using one's finger or a handy object. The gestures captured by the camera will be fed into a robust handwriting recognizer. A short list of recognition results will be displayed on the screen for final selection. The recognition vocabulary consists of Chinese, Japanese, and Korean characters, English letters, and numerical digits.
Sticky Notes in Augmented Reality
Over time, we transform the areas surrounding our desktop computers into rich landscapes of information and interaction cues. Of the variety of at-hand physical media we use to support our virtual activities, none are as flexible and ubiquitous as the simple sticky note. Sticky notes can be placed on any surface, as prominent or as peripheral as desired, and can be created, posted, updated, and relocated according to the flow of our activities. When we engage in mobile computing, however, we lose the benefit of an inhabited interaction context. The sticky notes we create at our kitchen table cannot stay there, nor are they visible from the living-room sofa. Moreover, our willingness to share our notes with family and colleagues does not extend to the people around us in public places, such as coffee shops and libraries. Our system addresses these problems by enabling users to post and manipulate virtual sticky notes in real 3-D space, anchored relative to the screen of their portable computer. These interactive notes exist in mixed reality, viewed on the virtual screen of a portable computer by moving a Webcam or camera phone through the physical space around it. Whenever the camera is at rest, the computer screen reverts to its normal display. In this way, portable computer users periodically can scan their private virtual sticky notes wherever they are, taking advantage of a persistent interaction context created and manipulated through embodied physical interaction.
Tool Kit for Visualizing Large-Scale Data
Our tool kit provides a set of Silverlight/Ajax controls for visualizing large-scale structured data from various data sources. Those controls can be used to expose graphically the structure of the data, trends, and relationships of data properties. We also provide a platform that enables rapid development of large-scale data-exploration, analysis, and reporting tools. We will demonstrate several individual controls and a demo application built atop our tool kit. This demo application lets users explore a large set of article, image, and video data directly and easily.
TechFest Overview – Video Montage
Microsoft Research’s annual innovation fair, TechFest, draws thousands of Microsoft employees to view futuristic projects that grow out of the company’s global investment in basic research. Take a quick journey through some innovations and cool technologies showcased during past events.
Principal Researcher Zhengyou Zhang talks about improving audioconferencing through the use of audio spatialization at this year’s Microsoft Research TechFest.
This demo explores how to add pointing input capabilities to very small screen devices. At first sight, touch-screens seem to allow for particular compactness, because they integrate input and screen into the same physical space. The opposite is true, however, because the user's fingers occlude contents and prevent precision. Microsoft Research Redmond has created a 2.4-inch prototype that allows a back-of-device interface. This clip explains how and why it works.
Ivan Tashev, Y.C. Ju, and Mike Seltzer from Microsoft Research Redmond present novel technologies that enable natural-language input; expose a multimodal user interface including speech, a GUI, touch, and buttons; and use state-of-the-art sound-capture and processing technologies for improved speech recognition and sound quality.
Michael Cohen, Principal Researcher from Microsoft Research Redmond, explains his TechFest 2009 treasure hunt demonstration using Core Tools for Augmented Reality.
From archiving your memorabilia to new ways to communicate with your family, Principal Researcher Richard Harper from Microsoft Research Cambridge demonstrates a suite of new household technologies.
In existing online-advertisement platforms, the relevance between advertisers and users is decided largely by advanced keyword matching. In the approach shown here, advertisers bid on images instead of keywords. For example, a toy seller could bid on the image of a related movie poster, while a restaurant could bid on the image of a cooking magazine cover. Users would receive ads based on the content of images they recently browsed or used.
Microsoft Research Director of Software Architecture Jim Larus demonstrates a prototype datacenter for studying the tradeoffs of low-power processors in the datacenter.
Microsoft Research exposes new natural language technologies that raise the bar for functionality, usability and reliability of in-car infotainment systems.
This demo will explain the development of a new kind of image feature that can be used within a variety of applications, ranging from image stitching to augmented reality. The features are already finding their way into Microsoft products and are being considered for many new applications. The demo shows a fun example application: a treasure hunt! By using the posters and other graphics on display during TechFest, users can borrow a device that augments the world with virtual clues to find the hidden treasure.
The increasing availability of GPS-enabled devices is changing the way people interact with the web and brings a large number of GPS trajectories representing people's location histories. GeoLife 2.0 is a location-based social-networking service on Virtual Earth that enables people to build connections with each other using their location histories. By mining the similarity between location histories, this system can help a user automatically discover potential friends in a community with similar interests. This system also enables travel experts to help plot the ideal sightseeing locations for you to visit in any part of the world.
Today, it’s easy to share a webpage or a blog post, because items on the web have unique IDs: URLs. We don’t have this on the desktop. Social Desktop adds URLs to the files and folders on your desktop, letting you share anything on your computer with anyone who can click a URL. People receiving links can either access the shared items via e-mail or comment on, tag, and search across all shared items via the Social Desktop webpage.
Lei MA from Microsoft Research Asia shows how to input characters into devices such as Xboxes and TVs using a low-cost Webcam and air gestures instead of standard keyboard and mouse input. The gestures captured by the camera will be fed into a robust handwriting recognizer.