This research project features two technologies:
- Rendered personalized head-related transfer functions (HRTFs), synthesized using anthropometric data tailored to an individual user’s audio input.
- Creation of an immersive audio experience using headphones and person/head tracking through rendered 3-D audio.
The project generates personalized HRTFs by scanning a person using a Kinect for Windows device, then using a headset to identify a predefined area. It enables the user to interact with a virtual set of physical objects (such as an AM radio, a manikin, a phone, or a television) that start to play music, speak, and ring. The user can move freely, rotate her head, and approach each individual sound source within a virtual experience.
Cortana, the world’s first truly personal digital assistant, will be available soon on Windows Phone 8.1. Powered by Bing, Cortana is driven by state-of-the-art algorithms using natural language, machine learning, and contextual signals that benefit from advances incubated at Microsoft Research. This approach uses the massive clickstream-feedback loop of web search, along with additional semantics and data, and enables Cortana to grow in the breadth of domains that cover the web, from the more common “head” queries and conversations to the less frequent “tail.” All of this happens at a personal level, empowering Cortana to provide robust personal assistance that becomes even more advanced over time.
These Technology for Nature projects aim to expand dramatically the amount and kind of data we can gather from the natural world. Zootracer uses vision and machine learning to track arbitrary objects from video. Designed to assist environmental scientists, Zootracer, a tool for general use, is complemented by Mataki, an unprecedentedly cheap, light (seven grams), and reprogrammable GPS tracking and sensing device. Uniquely, Mataki has peer-to-peer data sharing, so data can be retrieved from entire collections of device-monitored animals. The research also uses an unmanned aerial drone with an onboard camera to follow coordinates broadcast by a Mataki device attached to an animal.
Many people are working on near-field user-interface devices that detect hover, gesture, and pose. But there is nothing in the space in front of a device to show the user what to do. This project introduces a floating display that hovers over the device, providing visual cues for gestural interactions.
HereHere NYC is a research project that enables neighborhoods to generate opinions based on public data. The project summarizes how your neighborhood, or other New York City neighborhoods of interest, are doing via a daily email digest, neighborhood-specific Twitter feeds, and status updates on a map. The goals are to:
- Create compelling stories with data to engage larger communities.
- Invent light daily rituals for connecting to the hyperlocal.
- Use characterization as a tool to drive data engagement.
HereHere uses Project Sentient Data, an early-stage project to explore how interactions can be improved by understanding ecosystems of data in terms of characterization, personalities, and relationships. Sentient Data provides a server and a representational-state-transfer API that enables developers to assign personalities and translate data sets into their relative emotion states.
Holograph is an interactive, 3-D data-visualization research platform that can render static and dynamic data above or below the plane of the display using a variety of 3-D stereographic techniques. The platform enables rapid exploration, selection, and manipulation of complex, multidimensional data to create and refine natural user-interaction techniques and technologies, with the goal of empowering everyone to understand the growing tide of large, complex data sets.
This project uses head-mounted displays, Kinect skeletal tracking, a custom hardware controller, and a large display for the public to watch vicariously. Individuals enter a virtual-reality environment created using the WorldWide Telescope and navigate through a virtual 3-D universe in orbit with the International Space Station, or inside a brain cell. The system delivers the capability for an external audience to watch avatars of the individuals exploring the environment and monitor their progress, while explorers can use Kinect and gestures to fly freely through the universe, select data, and point out observations to the outside audience.
This project offers a method for creating 3-D scans of arbitrary environments in real time, utilizing only a single RGB camera as the input sensor. The camera could be one already available in a tablet or a phone, or it could be a cheap web camera. No additional input hardware is required. This removes the need for power-intensive active sensors that do not work robustly in natural outdoor lighting. In seconds, a user can generate a compelling 3-D model, which can be used in augmented reality, for 3-D printing, or in computer-aided design.
Naiad is a .NET-based platform for high-throughput, low-latency data analysis. It is suitable for traditional “big data” processing all the way through to stream processing on real-time data, complex graph analyses, and machine-learning tasks. Using Naiad on Azure enables an analyst to develop an application locally before deploying it seamlessly to the cloud. Several tools built atop Naiad use Azure to provide interactive analyses over massive data sets. Moreover, Naiad is built with extensibility in mind, providing data analysts with simple interfaces and enabling them to integrate custom business logic when required.
Societies and governments around the world want to know how the biosphere is likely to change and what we can do to avoid or mitigate that change. Current models used to provide that information, though, are akin to miniature computer games: black boxes that convey almost no sense of confidence in their reliability. This problem can be addressed objectively by combining machine learning with process-based modeling to enable assessment and comparison of alternative model formulations, to identify key sources of uncertainty, and, ultimately, to enable probabilistic predictions of the likely consequences of climate and environmental change. To tackle problems of this scale, researchers have designed a solution using a new model-building platform, Distribution Modeller, and F#, which can be used to build and share data-constrained, process-based models and deliver their probabilistic predictions on demand through Azure.
This project presents two new technologies that empower consumers to design and produce working physical devices customized to their particular needs:
- A highly interactive, touch-first 3-D printing app makes 3-D modeling accessible to a broad range of consumers. It uses the 3-D printer support in Windows 8.1 and adds an intuitive, block-based editor.
- A technique enabling users to create working interactive devices cheaply and easily, based on recent advances in conductive inks coupled with a new type of modular electronic component.
When combined, these two technologies give a glimpse of a future world where it is inexpensive and easy for users to build devices with customized form and function.
This research project enables users to browse the most interesting images on the web using semantically meaningful connections, as opposed to image similarity. The project determines the most interesting images via a data-mining algorithm that locates interesting sentences about the images, detects concepts in these interesting sentences, and uses them to build a graph over the images. The links in this graph enable a user to navigate the images intuitively and compellingly by clicking on concepts in the sentences.
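The idea of a concept-linked image graph can be illustrated with a minimal Python sketch. It is not the project's data-mining algorithm: it assumes the concepts have already been detected for each image, and the function names and toy image data are hypothetical. Two images are linked whenever their sentences share a concept, and "clicking" a concept follows the links labeled with it.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(image_concepts):
    """Link every pair of images that share at least one concept.

    image_concepts maps an image id to the set of concepts detected in
    sentences about that image (assumed to be given).
    Returns {image: {neighbor: set of shared concepts}}.
    """
    # Inverted index: concept -> images whose sentences mention it.
    images_by_concept = defaultdict(set)
    for image, concepts in image_concepts.items():
        for concept in concepts:
            images_by_concept[concept].add(image)

    graph = {image: {} for image in image_concepts}
    for concept, images in images_by_concept.items():
        # Every pair of images mentioning the concept gets an edge.
        for a, b in combinations(sorted(images), 2):
            graph[a].setdefault(b, set()).add(concept)
            graph[b].setdefault(a, set()).add(concept)
    return graph


def follow_concept(graph, image, concept):
    """Images reachable from `image` by clicking on `concept`."""
    return sorted(n for n, shared in graph[image].items() if concept in shared)


# Hypothetical toy data: image ids and their detected concepts.
image_concepts = {
    "eiffel.jpg": {"Paris", "Gustave Eiffel", "wrought iron"},
    "liberty.jpg": {"New York", "Gustave Eiffel", "copper"},
    "louvre.jpg": {"Paris", "museum"},
}
graph = build_concept_graph(image_concepts)
print(follow_concept(graph, "eiffel.jpg", "Gustave Eiffel"))  # ['liberty.jpg']
print(follow_concept(graph, "eiffel.jpg", "Paris"))           # ['louvre.jpg']
```

Labeling each edge with the shared concepts, rather than storing a bare adjacency list, is what lets navigation be driven by a concept instead of by visual similarity.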
Typing on a touchscreen device without having to look at the screen could be extremely useful in cases when it’s dangerous, interruptive, or impolite to text. Microsoft researchers developed a novel UX and robust decoder to shape-write in groups of characters. To demonstrate the feasibility of this approach, the research team broke the Guinness World Record for touchscreen and blind texting, and worked with the Windows Phone team to include the WordFlow feature in Windows Phone 8.1.
Projectors, tablets, and other devices, combined with depth cameras, enable applications to span multiple “screens.” This work enables webpages to be experienced beyond your PC monitors, taking advantage of all your devices in concert. For example, a karaoke webpage could use your phone, tablet, big-screen television, and projectors to provide an awesome entertainment experience. Webpages also are enabled to “see” objects in the room and respond to them, all while preserving privacy from the owner of the webpage. Such a page could show you an ad near a Red Bull can, but nobody would be able to find out how much Red Bull you drink.
Tempe is an interactive system for exploring large data sets. It accelerates machine learning by facilitating quick, iterative feature engineering and data understanding. Tempe is a combination of three technologies:
- Trill: a high-speed, temporal, progressive-relational stream-processing engine 100 times faster than StreamInsight.
- WINQ: a layer that emulates LINQ but provides progressive queries that return “best effort” partial answers.
- Stat: an interactive, C# integrated development environment that enables users to visualize progressive answers.
The combination of these technologies enables users to try and discard queries quickly, enabling much faster exploration of large data sets.
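To make the notion of a progressive query concrete, here is a minimal Python sketch. It does not use or imitate the actual Trill or WINQ APIs: `progressive_mean` is a hypothetical helper that reports a "best effort" running mean after each batch of input, so a user could discard an unpromising query before all rows are read.

```python
def progressive_mean(values, batch_size):
    """Yield (rows_seen, running_mean) after each batch of input.

    Each yielded pair is a partial answer; the last one is the exact mean.
    """
    total = 0.0
    count = 0
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        total += sum(batch)
        count += len(batch)
        yield count, total / count


# Hypothetical readings; any numeric column would do.
readings = [4, 8, 6, 2, 10, 6]
for rows_seen, estimate in progressive_mean(readings, batch_size=2):
    print(rows_seen, estimate)
# 2 6.0
# 4 5.0
# 6 6.0
```

In this sketch the early estimates are only as representative as the order of the input. A real progressive engine would need to control how the data is sampled, which this example leaves out.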
ViiBoard is a system for remote collaboration through a digital whiteboard (PPI) that gives participants an immersive, 3-D experience with enhanced touch capability. ViiBoard emulates writing side by side on a physical whiteboard or, alternatively, on a mirror, through 3-D processing of depth images and life-sized rendering. Additional vision techniques, such as hand-gesture recognition, are integrated to understand users’ intentions before they touch the board, simplifying the interaction with a PPI, especially for content editing and presenting. Compared with standard video conferencing, the ViiBoard provides participants with a better ability to estimate their remote partners’ eye-gaze direction, gesture direction, and intention. These capabilities translate into a heightened sense of being together and a more realistic experience.
Getting insights into what customers are saying about a product, and into the people who are saying it, is important for companies big and small. This project presents an analytics platform atop a real-time social network that can mine user behavior, automatically identify segments where unusual patterns emerge, and provide a possible explanation for each pattern.
Urban air quality, specifically the concentration of PM2.5, is of great importance in protecting human health. While a city has only a limited number of air-quality monitoring stations, air quality varies significantly by location and is influenced by multiple complex factors, such as traffic flow and land use. Consequently, people cannot know the air quality of a location without a monitoring station. This project infers real-time, fine-grained air-quality information throughout a city, based on air-quality data reported by existing monitoring stations and a variety of data sources observed in the city, such as meteorology, traffic flow, human mobility, the structure of road networks, and points of interest. This fine-grained air-quality information could help people figure out when and where to go jogging, or when they should shut the window or put on a face mask in locations where air quality is already a daily issue. This could lead to long-term solutions in predicting forthcoming air quality and identifying the root cause of air pollution.
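For a sense of what estimating air quality between stations involves, the following Python sketch shows inverse-distance weighting, a simple spatial-interpolation baseline. This is not the project's method: it uses station readings only and ignores the traffic, meteorology, and other signals described above. The function name, coordinates, and PM2.5 values are hypothetical.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station readings.

    Each station is weighted by 1 / distance**power, so nearby stations
    dominate the estimate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return float(value)  # target sits exactly on a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


# Hypothetical stations: (x_km, y_km, PM2.5 reading).
stations = [(0.0, 0.0, 30.0), (4.0, 0.0, 70.0)]
print(idw_estimate(stations, (2.0, 0.0)))  # 50.0 (midpoint, equal weights)
```

A distance-only baseline like this cannot capture the case the paragraph highlights, where two nearby locations differ sharply because of traffic or land use. That limitation is the motivation for drawing on additional data sources.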