From pushing the boundaries of computing beyond the screen, to helping make sense of large-scale data sets for scientific discoveries, the development of new ideas and technologies is deeply woven into our DNA.
At the Silicon Valley TechFair 2014, we will share work that ranges from using big data to build local models that enable hyperlocal neighborhood interactions, to scientific models intended to predict how changes in the environment will affect our world. Learn more about the research underpinning Cortana, and explore how we see new user interfaces expanding beyond the screen.
Researchers presenting work
Here are a few of the researchers who will be presenting their latest research:
Larry Heck is a Distinguished Engineer in Microsoft Research. His research area is natural conversational interaction, focusing on open-domain NLP and dialog, machine learning, multimodal NUI, and inference/reasoning under uncertainty.
Curtis Wong, Principal Researcher at Microsoft Research, is responsible for basic and applied research in media and interaction. He has been granted more than 45 patents in areas such as data visualization, user interaction, interactive television, media browsing, and automated cinematography, and is the primary inventor of the WorldWide Telescope. Recently, Curtis has led the effort to enable interactive spatial-temporal data visualization as a broad capability for everyone to gain insight into the growing tide of data that is being generated from devices and services.
Ivan Tashev is a Principal Software Architect at Microsoft Research. His research focuses on multichannel audio signal processing, algorithms for arrays of transducers, signal processing for enhancement, de-noising, and de-reverberation, and statistical processing of audio, biological, and radio signals. Ivan was responsible for the audio pipeline architecture and DSP algorithms in Xbox Kinect and Kinect for Windows, as well as audio enhancements to Xbox One.
For her work in both real world games and the early Internet of Things, Kati was named one of the “Top 35 Innovators Under 35” by MIT’s Technology Review Magazine (2010), “Top 100 Most Creative People in Business” by Fast Company Magazine (2011), and awarded the World Technology Network award in Entertainment (2011). She teaches the graduate course “Persuasive Technology: Designing the Human” at NYU’s ITP, and frequently speaks on online and offline engagement, economies, games, and sensors.
Her work has been covered by Businessweek, the New York Times, Wired, National Geographic, and Glamour Magazine, among others. She has worked with clients including the John S. and James L. Knight Foundation, Foursquare, the United Kingdom’s Department for Transport, the BBC, Channel 4, the Carnegie Institute, Disney Imagineering, Nike, Discovery Channel, CBS, MTV, and the Peter G. Peterson Foundation. Her work is represented in the permanent collection of MOMA and has been exhibited at the Design Museum of London and Museum of Science & Industry.
Kati is currently a Senior Researcher at Microsoft Research, FUSE (Future User Social Experiences). Previously, she was Director of Product for Zynga New York and Vice President and Senior Producer at Area/Code (acquired by Zynga). In 2012 she became Innovator-in-Residence at USC’s Annenberg School, where she led workshops in Design Patterns for Autonomous Objects.
3-D Audio for Telepresence and Virtual Reality
This research project features two technologies:
- Rendered personalized head-related transfer functions (HRTFs), synthesized using anthropometric data tailored to an individual user’s audio input.
- Creation of an immersive audio experience using headphones and person/head tracking through rendered 3-D audio.
The project generates personalized HRTFs by scanning a person using a Kinect for Windows device, then using a headset to identify a predefined area. It enables the user to interact with a virtual set of physical objects (such as an AM radio, a manikin, a phone, or a television) that start to play music, speak, and ring. The user can move freely, rotate her head, and approach each individual sound source within a virtual experience.
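As a rough illustration of the rendering step, headphone-based 3-D audio is commonly produced by convolving a mono source with a left-ear and a right-ear head-related impulse response (the time-domain form of an HRTF). The sketch below assumes that standard approach and is not taken from the project itself: the impulse responses are made-up toy values rather than measured or personalized data, and `convolve` and `render_binaural` are hypothetical helper names.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) channels for headphone playback."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's left (assumed values):
# the right ear hears the sound one sample later and at half the level.
HRIR_LEFT = [1.0, 0.0]
HRIR_RIGHT = [0.0, 0.5]

left, right = render_binaural([1.0, 2.0], HRIR_LEFT, HRIR_RIGHT)
```

With head tracking, a renderer of this kind would swap in the impulse-response pair that matches the current direction from the listener's head to each virtual sound source.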
Cortana's Research-Based Foundation
Cortana, the world’s first truly personal digital assistant, will be available soon on Windows Phone 8.1. Powered by Bing, Cortana is driven by state-of-the-art algorithms using natural language, machine learning, and contextual signals that benefit from advances incubated at Microsoft Research. This approach uses the massive clickstream-feedback loop of web search, along with additional semantics and data. It enables Cortana to grow in the breadth of domains that cover the web, from the more common “head” queries and conversations to the less frequent “tail,” all at a personal level, empowering Cortana to provide robust personal assistance that becomes even more advanced over time.
Film, Identify, Track, Tag, Sense, Fly
These Technology for Nature projects aim to expand dramatically the amount and kind of data we can gather from the natural world. Zootracer uses vision and machine learning to track arbitrary objects from video. Designed to assist environmental scientists, Zootracer, a tool for general use, is complemented by Mataki, an unprecedentedly cheap, light (seven grams), and reprogrammable GPS tracking and sensing device. Uniquely, Mataki supports peer-to-peer data sharing, so data can be retrieved from entire collections of device-monitored animals. The research also uses an unmanned aerial drone with an onboard camera to follow coordinates broadcast by a Mataki device attached to an animal.
Many people are working on near-field user-interface devices that detect hover, gesture, and pose. But there is nothing in the space in front of a device to show the user what to do. This project introduces a floating display that hovers over the device, providing visual cues for gestural interactions.
HereHere NYC is a research project that enables neighborhoods to generate opinions based on public data. The project summarizes how your neighborhood, or other New York City neighborhoods of interest, are doing via a daily email digest, neighborhood-specific Twitter feeds, and status updates on a map. The goals are to:
- Create compelling stories with data to engage larger communities.
- Invent light daily rituals for connecting to the hyperlocal.
- Use characterization as a tool to drive data engagement.
HereHere uses Project Sentient Data, an early-stage project to explore how interactions can be improved by understanding ecosystems of data in terms of characterization, personalities, and relationships. Sentient Data provides a server and a representational-state-transfer API that enables developers to assign personalities and translate data sets into their relative emotion states.
Holograph is an interactive, 3-D data-visualization research platform that can render static and dynamic data above or below the plane of the display using a variety of 3-D stereographic techniques. The platform enables rapid exploration, selection, and manipulation of complex, multidimensional data to create and refine natural user-interaction techniques and technologies, with the goal of empowering everyone to understand the growing tide of large, complex data sets.
Immersive, Collaborative Data Visualization
This project uses head-mounted displays, Kinect skeletal tracking, a custom hardware controller, and a large display for the public to watch vicariously. Individuals enter a virtual-reality environment created using the WorldWide Telescope and navigate through a virtual 3-D universe in orbit with the International Space Station, or inside a brain cell. The system delivers the capability for an external audience to watch avatars of the individuals exploring the environment and monitor their progress, while explorers can use Kinect and gestures to fly freely through the universe, select data, and point out observations to the outside audience.
MonoFusion: Scanning Objects in Real Time with a Single Web Camera
This project offers a method for creating 3-D scans of arbitrary environments in real time, utilizing only a single RGB camera as the input sensor. The camera could be one already available in a tablet or a phone, or it could be a cheap web camera. No additional input hardware is required. This removes the need for power-intensive active sensors that do not work robustly in natural outdoor lighting. In seconds, a user can generate a compelling 3-D model, which can be used in augmented reality, for 3-D printing, or in computer-aided design.
Naiad on Azure: Rich, Interactive Cloud Analytics
Naiad is a .NET-based platform for high-throughput, low-latency data analysis. It is suitable for traditional “big data” processing all the way through to stream processing on real-time data, complex graph analyses, and machine-learning tasks. Using Naiad on Azure enables an analyst to develop an application locally before deploying it seamlessly to the cloud. Several tools have been built atop Naiad that use Azure to provide interactive analyses over massive data sets. Moreover, Naiad is built with extensibility in mind, providing data analysts with simple interfaces and enabling them to integrate custom business logic when required.
Societies and governments around the world want to know how the biosphere is likely to change and what we can do to avoid or mitigate that change. Current models used to provide that information, though, are akin to miniature computer games: black boxes that convey almost no sense of confidence in their reliability. This problem can be addressed objectively by combining machine learning with process-based modeling to enable assessment and comparison of alternative model formulations, to identify key sources of uncertainty and, ultimately, to enable probabilistic predictions of the likely consequences of climate and environmental change. To tackle problems of this scale, researchers have designed a solution in which a new model-building platform, Distribution Modeller, and F# can be used to build and share data-constrained, process-based models and deliver their probabilistic predictions on demand through Azure.
This project presents two new technologies that empower consumers to design and produce working physical devices customized to their particular needs:
- A highly interactive, touch-first 3-D printing app makes 3-D modeling accessible to a broad range of consumers. It uses the 3-D printer support in Windows 8.1 and adds an intuitive, block-based editor.
- A technique enabling users to create working interactive devices cheaply and easily, based on recent advances in conductive inks coupled with a new type of modular electronic component.
When combined, these two technologies give a glimpse of a future world where it is inexpensive and easy for users to build devices with customized form and function.
Semantic Browsing of Interesting Images on the Web
This research project enables users to browse the most interesting images on the web using semantically meaningful connections, as opposed to image similarity. The project determines the most interesting images via a data-mining algorithm that locates interesting sentences about the images, detects concepts in these interesting sentences, and uses them to build a graph over the images. The links in this graph enable a user to navigate the images intuitively and compellingly by clicking on concepts in the sentences.
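One plausible way to picture the graph-building step is to link any two images whose sentences mention the same concept. The sketch below is an assumption about how such a graph could be represented, not the project's actual algorithm: the image names and concepts are invented, concept detection is taken as already done, and `build_concept_graph` and `neighbors` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(image_concepts):
    """Link every pair of images that share at least one concept.

    image_concepts maps an image id to the set of concepts detected in
    sentences about that image. Returns {(image_a, image_b): shared_concepts}.
    """
    images_by_concept = defaultdict(set)
    for image, concepts in image_concepts.items():
        for concept in concepts:
            images_by_concept[concept].add(image)

    edges = defaultdict(set)
    for concept, images in images_by_concept.items():
        for a, b in combinations(sorted(images), 2):
            edges[(a, b)].add(concept)
    return dict(edges)


def neighbors(edges, image, concept):
    """Images reachable from `image` by clicking on `concept`."""
    return sorted(
        b if a == image else a
        for (a, b), shared in edges.items()
        if concept in shared and image in (a, b)
    )


# Hypothetical input: concepts already extracted from interesting sentences.
IMAGE_CONCEPTS = {
    "earthrise.jpg": {"Apollo 8", "Moon"},
    "moon_landing.jpg": {"Apollo 11", "Moon"},
    "saturn_v.jpg": {"Apollo 8", "Apollo 11"},
}
GRAPH = build_concept_graph(IMAGE_CONCEPTS)
```

Here, clicking "Apollo 8" while viewing `earthrise.jpg` would lead to `saturn_v.jpg`, even though the two pictures look nothing alike.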
Shape-Writing Enhancements for Windows Phone
Typing on a touchscreen device without having to look at the screen could be extremely useful when it is dangerous, disruptive, or impolite to text. Microsoft researchers developed a novel UX and a robust decoder for shape writing in groups of characters. To demonstrate the feasibility of this approach, the research team broke the Guinness World Record for touchscreen and blind texting, and worked with the Windows Phone team to include the WordFlow feature in Windows Phone 8.1.
SurroundWeb: Spreading the Web to Multiple Screens
Projectors, tablets, and other devices, combined with depth cameras, enable applications to span multiple “screens.” This work enables webpages to be experienced beyond your PC monitors, taking advantage of all your devices in concert. For example, a karaoke webpage could use your phone, tablet, big-screen television, and projectors to provide an awesome entertainment experience. Webpages also are enabled to “see” objects in the room and respond to them, all while preserving privacy from the owner of the webpage. Such a page could show you an ad near a Red Bull can, but nobody would be able to find out how much Red Bull you drink.
Tempe: Quick Answers from Large Data
Tempe is an interactive system for exploring large data sets. It accelerates machine learning by facilitating quick, iterative feature engineering and data understanding. Tempe is a combination of three technologies:
- Trill: a high-speed, temporal, progressive-relational stream-processing engine 100 times faster than StreamInsight.
- WINQ: a layer that emulates LINQ but provides progressive queries, which return “best effort” partial answers.
- Stat: an interactive, C# integrated development environment that enables users to visualize progressive answers.
The combination of these technologies enables users to try and discard queries quickly, enabling much faster exploration of large data sets.
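To make the idea of a progressive query concrete, the sketch below computes a running mean that reports a best-effort partial answer after every chunk of input, so a user could abandon an unpromising query early. It is a generic Python illustration only: it does not use or imitate the Trill or WINQ APIs, and `progressive_mean` and the sample data are hypothetical.

```python
def progressive_mean(values, chunk_size):
    """Yield (rows_seen, running_mean) after each chunk of input.

    Each yielded pair is a best-effort partial answer; the last one is exact.
    """
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        seen += len(chunk)
        yield seen, total / seen


# Toy data; a real data set would be far larger.
ANSWERS = list(progressive_mean([4, 8, 6, 2, 10, 6], chunk_size=2))
```

Because the function is a generator, a caller can stop consuming results as soon as a partial answer is good enough, which is the behavior that makes quick try-and-discard exploration possible.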
ViiBoard: Vision-Enhanced Immersive Interaction
ViiBoard is a system for remote collaboration through a digital whiteboard (PPI) that gives participants an immersive, 3-D experience with enhanced touch capability. ViiBoard emulates writing side by side on a physical whiteboard or, alternatively, on a mirror, through 3-D processing of depth images and life-sized rendering. Additional vision techniques, such as hand-gesture recognition, are integrated to understand users’ intentions before they touch the board, simplifying the interaction with a PPI, especially for content editing and presenting. Compared with standard video conferencing, the ViiBoard provides participants with a better ability to estimate their remote partners’ eye-gaze direction, gesture direction, and intention. These capabilities translate into a heightened sense of being together and a more realistic experience.
WaveFour: Social Analytics Platform for Businesses
Getting insights into what customers are saying about a product, and into the people who are saying it, is important for companies big and small. This project presents an analytics platform atop a real-time social network that can mine user behavior, automatically identify segments where unusual patterns emerge, and provide a possible explanation for each pattern.
When Urban Air Quality Meets Big Data
Urban air quality (the concentration of PM2.5) is of great importance in protecting human health. While a city has only a limited number of air-quality monitoring stations, air quality varies significantly by location and is influenced by multiple complex factors, such as traffic flow and land use. Consequently, people cannot know the air quality of a location without a monitoring station. This project infers real-time, fine-grained air-quality information throughout a city, based on air-quality data reported by existing monitoring stations and a variety of data sources observed in the city, such as meteorology, traffic flow, human mobility, the structure of road networks, and points of interest. This fine-grained air-quality information could help people figure out when and where to go jogging, or when they should shut the window or put on a face mask in locations where air quality is already a daily issue. This could lead to long-term solutions in predicting forthcoming air quality and identifying the root cause of air pollution.