Microsoft Research
https://www.microsoft.com/en-us/research

AI with creative eyes amplifies the artistic sense of everyone
https://www.microsoft.com/en-us/research/blog/ai-with-creative-eyes-amplifies-the-artistic-sense-of-everyone/
Thu, 27 Jul 2017

By Gang Hua, Principal Researcher, Research Manager

Recent advances in the branch of artificial intelligence (AI) known as machine learning are helping everyone, including artistically challenged people such as myself, transform images and videos into creative and shareable works of art.

AI-powered computer vision techniques pioneered by researchers from Microsoft’s Redmond and Beijing research labs, for example, provide new ways for people to transfer artistic styles to their photographs and videos as well as swap the visual style of two images, such as the face of a character from the movie Avatar and Mona Lisa.

The style transfer technique for photographs, known as StyleBank, shipped this June in an update to Microsoft Pix, a smartphone application that uses intelligent algorithms published in more than 20 research papers from Microsoft Research to help users get great photos with every tap of the shutter button.

The field of style transfer research explores ways to transfer an artistic style from one image to another, such as the style of post-impressionism onto a picture of your flower garden. For applications such as Microsoft Pix, a challenge is to offer users multiple styles to choose from and the ability to transfer styles to their images quickly and efficiently.

Our solution, StyleBank, explicitly represents visual styles as a set of convolutional filter banks, with each bank representing one style. To transfer an image to a specific style, an auto-encoder decomposes the input image into multi-layer feature maps that are independent of any style. The corresponding filter bank for the chosen style is convolved with the feature maps, and the result then passes through a decoder to render the image in that style.
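
To make the architecture concrete, here is a minimal sketch in PyTorch. The class name (StyleBankNet), layer sizes and layer counts are illustrative assumptions rather than the configuration from our paper; the point is that the encoder and decoder are shared across all styles, while each style owns only a small convolutional filter bank.

```python
import torch
import torch.nn as nn

class StyleBankNet(nn.Module):
    """Minimal sketch: shared encoder/decoder, one filter bank per style.

    Class name and layer sizes are illustrative, not the paper's setup.
    """
    def __init__(self, num_styles, channels=128):
        super().__init__()
        # Encoder: image -> style-agnostic feature maps.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # One convolutional filter bank per style; this is the only
        # style-specific part of the whole network.
        self.style_banks = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_styles)]
        )
        # Decoder: feature maps -> image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, image, style_id=None):
        features = self.encoder(image)
        if style_id is not None:  # stylization branch
            features = self.style_banks[style_id](features)
        # style_id=None is the plain auto-encoder (reconstruction) branch
        return self.decoder(features)

model = StyleBankNet(num_styles=5)
stylized = model(torch.randn(1, 3, 256, 256), style_id=2)
print(stylized.shape)  # torch.Size([1, 3, 256, 256])
```

Because only the filter bank differs between styles, adding a new style amounts to training one new bank while the shared encoder and decoder stay fixed, which is part of what makes training new styles cheap.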

The network completely decouples styles from content. Because of this explicit representation, we can both train new styles and render stylized images more efficiently than existing offerings in this space.

The StyleBank research is a collaboration between Beijing lab researchers Lu Yuan and Jing Liao, intern Dongdong Chen and me. We collaborated closely with the broader Microsoft Pix team within Microsoft’s research organization to integrate the style transfer feature with the smartphone application. Our team presented the work at the 2017 Conference on Computer Vision and Pattern Recognition July 21-26 in Honolulu, Hawaii.

We are also extending the StyleBank technology to render stable stylized videos in an online fashion. Our technique is described in a paper to be presented at the 2017 International Conference on Computer Vision in Venice, Italy, October 22-29.

Our approach leverages temporal information about feature correspondences between consecutive frames to achieve consistent and stable stylized video sequences in near real time. The technique adaptively blends feature maps from the previous frame and the current frame to avoid ghosting artifacts, which are prevalent in techniques that render videos frame-by-frame.
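
As a toy illustration of that blending step, the sketch below assumes the previous frame’s stylized feature maps have already been warped into the current frame’s coordinates (in practice this relies on estimated optical flow between frames); the exponential weighting is a made-up choice for illustration, not the scheme from our paper.

```python
import numpy as np

def blend_features(curr_feat, prev_feat_warped, sharpness=5.0):
    """Toy per-pixel adaptive blend of two (C, H, W) feature maps.

    Where the warped previous features agree with the current ones
    (static, well-tracked regions), keep mostly the previous frame for
    temporal stability; where they disagree (occlusions, new content),
    fall back to the current frame to avoid ghosting.
    """
    diff = np.linalg.norm(curr_feat - prev_feat_warped, axis=0)
    diff = diff / (diff.max() + 1e-8)          # normalize to [0, 1]
    alpha = np.exp(-sharpness * diff)          # 1 = reuse previous, 0 = take current
    return alpha * prev_feat_warped + (1.0 - alpha) * curr_feat

curr = np.random.rand(64, 32, 32).astype(np.float32)
prev = curr + 0.05 * np.random.randn(64, 32, 32).astype(np.float32)
print(blend_features(curr, prev).shape)  # (64, 32, 32)
```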

A third paper, which I co-authored with Jing Liao, Lu Yuan and my Redmond colleague Sing Bing Kang for presentation at SIGGRAPH 2017, July 30 to August 2 in Los Angeles, describes a technique for visual attribute transfer across images with distinct appearances but perceptually similar semantic structure; that is, the images contain similar visual content.

For example, the technique can put the face of a character from the movie Avatar onto an image of Leonardo da Vinci’s famous painting of Mona Lisa and the face of Mona Lisa onto the character from Avatar. We call our technique deep image analogy. It works by finding dense semantic correspondences between two input images.

We look forward to sharing more details about these techniques to transform images and videos into creative and shareable works of art at the premier computer vision conferences this summer and fall.

Transfer learning for machine reading comprehension
https://www.microsoft.com/en-us/research/blog/transfer-learning-machine-reading-comprehension/
Wed, 26 Jul 2017

By Xiaodong He, Principal Researcher, Microsoft Research

For human beings, reading comprehension is a basic task, performed daily. As early as elementary school, we can read an article and answer questions about its key ideas and details.

For AI, however, full reading comprehension is still an elusive goal, but a necessary one if we’re going to measure and achieve artificial general intelligence. In practice, reading comprehension is necessary for many real-world scenarios, including customer support, recommendations, question answering, dialog and customer relationship management. It has incredible potential for situations such as helping a doctor quickly find important information amid thousands of documents, saving their time for higher-value and potentially life-saving work.

Therefore, building machines that are able to perform machine reading comprehension (MRC) is of great interest. In search applications, machine comprehension will give a precise answer rather than a URL that contains the answer somewhere within a lengthy web page. Moreover, machine comprehension models can understand specific knowledge embedded in articles that usually cover narrow and specific domains, where the search data that algorithms depend upon is sparse.

Microsoft is focused on machine reading and is currently leading a competition in the field. Multiple projects at Microsoft, including Deep Learning for Machine Comprehension, have also set their sights on MRC. Despite great progress, a key problem was overlooked until recently: how to build an MRC system for a new domain.

Recently, several researchers from Microsoft Research AI, including Po-Sen Huang and Xiaodong He, along with intern David Golub from Stanford University, developed a transfer learning algorithm for MRC to attack this problem. Their work will be presented at EMNLP 2017, a top natural language processing conference. This is a key step towards developing a scalable solution that extends MRC to a wider range of domains.

It is an example of the progress we are making toward a broader goal we have at Microsoft: creating technology with more sophisticated and nuanced capabilities. “We’re not just going to build a bunch of algorithms to solve theoretical problems. We’re using them to solve real problems and testing them on real data,” said Rangan Majumder in the machine reading blog.

Currently, most state-of-the-art machine reading systems are built on supervised training data: they are trained end-to-end on examples containing not only the articles but also manually labeled questions about the articles and the corresponding answers. With these examples, the deep learning-based MRC model learns to understand the questions and infer the answers from the article, which involves multiple steps of reasoning and inference.

However, for many domains and verticals, this supervised training data does not exist. For example, if we need to build a new machine reading system to help doctors find important information about a new disease, there may be many documents available, but there is a lack of manually labeled questions about the articles and the corresponding answers. The challenge is magnified both by the need to build a separate MRC system for each disease and by the fact that the volume of literature is increasing rapidly. It is therefore of crucial importance to figure out how to transfer an MRC system to a new domain where no manually labeled questions and answers are available, but where a body of documents exists.

Microsoft researchers developed a novel model called “two stage synthesis network,” or SynNet, to address this critical need. In this approach, based on the supervised data available in one domain, the SynNet first learns a general pattern of identifying potential “interestingness” in an article. These are key knowledge points, named entities, or semantic concepts that are usually answers that people may ask for. Then, in the second stage, the model learns to form natural language questions around these potential answers, within the context of the article. Once trained, the SynNet can be applied to a new domain, read the documents in the new domain and then generate pseudo questions and answers against these documents. Then, it forms the necessary training data to train an MRC system for that new domain, which could be a new disease, an employee handbook of a new company, or a new product manual.

The idea of generating synthetic data to augment insufficient training data has been explored before. For example, for the target task of translation, Rico Sennrich and colleagues present a method in their paper to generate synthetic translations given real sentences to refine an existing machine translation system. However, unlike machine translation, for tasks like MRC, we need to synthesize both questions and answers for an article. Moreover, while the question is a syntactically fluent natural language sentence, the answer is mostly a salient semantic concept in the paragraph, such as a named entity, an action, or a number. Since the answer has a different linguistic structure than the question, it may be more appropriate to view answers and questions as two different types of data.

In our approach, we decompose the process of generating question-answer pairs into two steps: The answer generation conditioned on the paragraph and the question generation conditioned on the paragraph and the answer. We generate the answer first because answers are usually key semantic concepts, while questions can be viewed as a full sentence composed to inquire about the concept.

The SynNet is trained to synthesize the answer and the question of a given paragraph. The first stage of the model, an answer synthesis module, uses a bi-directional long short-term memory (LSTM) to predict inside-outside beginning (IOB) tags on the input paragraph, which mark out key semantic concepts that are likely answers. The second stage, a question synthesis module, uses a uni-directional LSTM to generate the question, while attending on embeddings of the words in the paragraph and IOB IDs. Although multiple spans in the paragraph could be identified as potential answers, we pick one span when generating the question.
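
The sketch below shows the shape of this two-stage pipeline in PyTorch. All module names and sizes are illustrative assumptions, and the question synthesis module is reduced to a plain encoder-decoder with the attention mechanism omitted for brevity, so this is a simplified stand-in for the architecture in our EMNLP paper.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID, NUM_IOB = 10_000, 128, 256, 3  # sizes are illustrative

class AnswerSynthesis(nn.Module):
    """Stage 1: tag likely answer spans in the paragraph with IOB labels."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.bilstm = nn.LSTM(EMB, HID, batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * HID, NUM_IOB)  # O / B-answer / I-answer

    def forward(self, paragraph_ids):
        h, _ = self.bilstm(self.embed(paragraph_ids))
        return self.tagger(h)  # (batch, para_len, 3) tag logits per token

class QuestionSynthesis(nn.Module):
    """Stage 2: generate a question conditioned on paragraph + chosen span."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.iob_embed = nn.Embedding(NUM_IOB, EMB)  # marks the chosen answer
        self.encoder = nn.LSTM(2 * EMB, HID, batch_first=True)
        self.decoder = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, paragraph_ids, iob_ids, question_ids):
        # Concatenate word embeddings with IOB embeddings so the model
        # knows which span it is asking about.
        ctx = torch.cat([self.embed(paragraph_ids), self.iob_embed(iob_ids)], -1)
        _, state = self.encoder(ctx)               # summarize paragraph + answer
        h, _ = self.decoder(self.embed(question_ids), state)
        return self.out(h)                         # next-token logits

para = torch.randint(0, VOCAB, (2, 50))
iob = torch.randint(0, NUM_IOB, (2, 50))
q_prefix = torch.randint(0, VOCAB, (2, 12))
tags = AnswerSynthesis()(para)
logits = QuestionSynthesis()(para, iob, q_prefix)
print(tags.shape, logits.shape)  # (2, 50, 3) and (2, 12, 10000)
```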

[Figure: Two examples of questions and answers generated from articles.]

Using the SynNet, we were able to get more accurate results on a new domain without any additional training data, approaching the performance of a fully supervised MRC system.

SynNet, trained on SQuAD (Wikipedia articles), performs almost as well on the NewsQA domain (news articles) as a system fully trained on NewsQA.

The SynNet is like a teacher, who, based on her experience in previous domains, creates questions and answers from articles in the new domain, and uses these materials to teach her students to perform reading comprehension in the new domain. Accordingly, Microsoft researchers also developed a set of neural machine reading models, including the recently developed ReasoNet that has shown a lot of promise, which are like the students who learn from the teaching materials to answer questions based on the article.

To our knowledge, this is the first attempt at domain transfer for MRC. We look forward to developing scalable solutions that rapidly expand the capability of MRC and unlock the game-changing potential of machine reading!

Researchers build nanoscale computational circuit boards with DNA
https://www.microsoft.com/en-us/research/blog/researchers-build-nanoscale-computational-circuit-boards-dna/
Mon, 24 Jul 2017

By Microsoft Research

Human-engineered systems, from ancient irrigation networks to modern semiconductor circuitry, rely on spatial organization to guide the flow of materials and information. Living cells also use spatial organization to control and accelerate the transmission of molecular signals, for example by co-localizing the components of enzyme cascades and signaling networks. In a new paper published today by the journal Nature Nanotechnology, scientists at the University of Washington and Microsoft Research describe a method that uses spatial organization to build nanoscale computational circuits made of synthetic DNA. Called “DNA domino” circuits, they consist of DNA “domino” molecules that are positioned at regular intervals on a DNA surface. Information is transmitted when DNA dominoes interact with their immediate neighbors in a cascade.

For decades, scientists in the field of molecular programming have been studying how to use DNA molecules to compute. This includes developing algorithms that operate effectively at the molecular scale, and identifying fundamental principles of molecular computation. The components of these molecular devices are typically made from strands of synthetic DNA, where the sequence of the strands determines how they interact. Real-world applications of these devices could in the future include in vitro diagnostics of pathogens, biomanufacturing of materials, smart therapeutics and high-precision methods for imaging and probing biological experiments. So far, however, most of these devices have been designed to operate in a chemical soup, where billions of DNA molecules rely on the relatively slow process of random diffusion to bump into each other and execute a computational step. This limits the speed of the computation and the number of different components that can effectively be used. This is because the freely diffusing DNA molecules can collide with each other at random, so they must be carefully designed to avoid unintended computations when these random collisions occur.

DNA domino circuits represent an important advance. They were developed through a collaboration between Georg Seelig’s lab at the University of Washington in Seattle and Andrew Phillips’s Biological Computation group at Microsoft Research. Since DNA dominoes are positioned close to each other on a surface, they can quickly interact with their immediate neighbors without relying on random diffusion for each computational step. This can lead to an order of magnitude increase in speed compared to circuits where all the components are freely diffusing. In addition, DNA dominoes can be re-used in multiple locations with almost no interference since their physical location, in addition to their chemical specificity, determines what interactions can and cannot take place.

The scaffold that secures the DNA dominoes is assembled from hundreds of DNA strands using a technique called DNA origami that was first described in 2006. A long single strand of DNA, called the scaffold, is pinned into a rectangular shape by shorter DNA strands called staples. To build a nanoscale computational circuit on a DNA origami surface, individual DNA dominoes are incorporated into the origami during the folding process using special types of elongated staples. Each of these staples is precisely positioned on the same side of the origami scaffold and folds over into a hairpin shape to form a DNA domino.

The researchers used this precise positioning to lay out the DNA dominoes into signal transmission lines, similar to lines of real dominoes, and into elementary Boolean logic gates that compute the logical AND and OR of two inputs. By linking these elementary gates together, the researchers created more complex circuits such as a two-input dual-rail XNOR circuit, which can in principle be used as a building block for a molecular computer. Freely diffusing DNA strands act as inputs to the circuits, while a single type of DNA fuel strand powers the transmission of signals between neighboring DNA dominoes. The researchers constructed detailed computational models of their designs and used extensive experimental measurements to identify the model parameters and quantify their uncertainty. This modeling allowed the researchers to accurately predict the behavior of more complex circuits, speeding up the design process.
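
For readers who want a feel for the dual-rail construction, here is a purely logical toy model in Python. It abstracts away all of the molecular machinery (origami geometry, fuel strands, hybridization kinetics) and keeps only the Boolean structure: every signal travels on a "true" rail and a "false" rail, so a non-monotone gate such as XNOR can be assembled from AND and OR gates alone.

```python
# Toy logical model of a dual-rail circuit built from AND/OR gates.
# Each wire is split into a "true" rail and a "false" rail; monotone
# AND/OR gates on the rails then suffice to express any Boolean gate.

def AND(a, b):
    return a and b

def OR(a, b):
    return a or b

def dual_rail_xnor(x, y):
    """XNOR(x, y) computed over dual-rail signals using only AND/OR."""
    xt, xf = x, not x      # true / false rails for x
    yt, yf = y, not y      # true / false rails for y
    # Output true rail fires when both inputs are true OR both are false.
    out_t = OR(AND(xt, yt), AND(xf, yf))
    # Output false rail fires when exactly one input is true.
    out_f = OR(AND(xt, yf), AND(xf, yt))
    assert out_t != out_f  # a valid dual-rail signal asserts exactly one rail
    return out_t

for x in (False, True):
    for y in (False, True):
        print(x, y, "->", dual_rail_xnor(x, y))
# prints True when x == y, False otherwise
```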

This new approach lays the groundwork for using spatial constraints in molecular engineering more broadly and could help bring embedded molecular control circuits closer to practical applications in biosensing, nanomaterial assembly and therapeutic DNA robots.

The research was funded by the National Science Foundation, Office of Naval Research and Microsoft Research.

The paper, “A spatially localized architecture for fast and modular DNA computing,” was published on Nature Nanotechnology’s website on July 24, 2017, and will appear at a later date in the print issue of the journal. In addition to Phillips and Seelig, co-authors are Gourab Chatterjee, a doctoral student at the University of Washington, Richard Muscat, formerly a postdoctoral associate at the University of Washington and currently at Cancer Research UK, and Neil Dalchau of Microsoft Research.

 

Second version of HoloLens HPU will incorporate AI coprocessor for implementing DNNs
https://www.microsoft.com/en-us/research/blog/second-version-hololens-hpu-will-incorporate-ai-coprocessor-implementing-dnns/
Sun, 23 Jul 2017

By Marc Pollefeys, Director of Science, HoloLens

It is not an exaggeration to say that deep learning has taken the world of computer vision, and many other recognition tasks, by storm. Many of the most difficult recognition problems have seen astonishing gains over the past few years.

Although we have seen large improvements in the accuracy of recognition as a result of Deep Neural Networks (DNNs), deep learning approaches have two well-known challenges: they require large amounts of labelled data for training, and they require a type of compute that is not amenable to current general purpose processor/memory architectures. Some companies have responded with architectures designed to address the particular type of massively parallel compute required for DNNs, including our own use of FPGAs, for example, but to date these approaches have primarily enhanced existing cloud computing fabrics.

But I work on HoloLens, and in HoloLens, we’re in the business of making untethered mixed reality devices. We put the battery on your head, in addition to the compute, the sensors, and the display. Any compute we want to run locally for low latency (which you need for things like hand-tracking) has to run off the same battery that powers everything else. So what do you do?

You create custom silicon to do it.

 

First, a bit of background. HoloLens contains a custom multiprocessor called the Holographic Processing Unit, or HPU. It is responsible for processing the information coming from all of the on-board sensors, including Microsoft’s custom time-of-flight depth sensor, head-tracking cameras, the inertial measurement unit (IMU), and the infrared camera. The HPU is part of what makes HoloLens the world’s first–and still only–fully self-contained holographic computer.

Today, Harry Shum, executive vice president of our Artificial Intelligence and Research Group, announced in a keynote speech at CVPR 2017 that the second version of the HPU, currently under development, will incorporate an AI coprocessor to natively and flexibly implement DNNs. The chip supports a wide variety of layer types, fully programmable by us. Harry showed an early spin of the second version of the HPU running live code implementing hand segmentation.

The AI coprocessor is designed to work in the next version of HoloLens, running continuously, off the HoloLens battery. This is just one example of the new capabilities we are developing for HoloLens, and is the kind of thing you can do when you have the willingness and capacity to invest for the long term, as Microsoft has done throughout its history. And this is the kind of thinking you need if you’re going to develop mixed reality devices that are themselves intelligent. Mixed reality and artificial intelligence represent the future of computing, and we’re excited to be advancing this frontier.

Faculty Summit ’17 sessions available on-demand
https://www.microsoft.com/en-us/research/blog/faculty-summit-17-on-demand/
Fri, 21 Jul 2017

Faculty Summit 2017 wrap-up: Reflections from the edge

[Photo: Ivan Tarapov, Senior Software Engineer, talks to attendees about Project InnerEye – Assistive AI for Cancer Treatment. Credit: Doug Ogle, Filmateria Digital]

By Roy Zimmermann, Director, Microsoft Research

[Photo: PsiBot from the Mobile Directions Robot demo at the Technology Showcase]

The theme of this year’s Faculty Summit 2017, which occurred earlier this week, was The Edge of AI. The meeting on Microsoft’s sun-splashed Redmond campus brought together more than 500 prominent academic and Microsoft AI researchers, who added depth and context to the theme with thought-provoking presentations and demos of leading-edge research. We heard from luminaries in collaborative AI, deep learning, machine comprehension, deep neural nets and more. We saw demos of AI applications and services showing that some aspects of AI are moving to the center of our digital lives. We also heard from some of our keynote speakers that while much progress has been made, much work remains if our AI systems are to become better at sensing, learning, reasoning and understanding natural language. And we were challenged to continue to seek out errors – not just solutions – on the path toward a more general artificial intelligence.

If you attended the Summit but missed some of the sessions, or if you didn’t attend but would like to explore some of the mind-expanding content we covered in two days of keynotes and break-out sessions, please enjoy our on-demand Faculty Summit video coverage.

Faculty Summit 2017 focuses on technical breakthroughs and societal influences
https://www.microsoft.com/en-us/research/blog/faculty-summit-2017-technical-breakthroughs-societal-influences/
Sun, 16 Jul 2017

[Photo: Eric Horvitz at Faculty Summit 2017. Credit: Doug Ogle, Filmateria Digital]

By Eric Horvitz, Technical Fellow and Managing Director, Microsoft

We’re at an inflection point for AI technologies. Rising capabilities and possibilities have been catalyzed by jumps in the availability of data and computational power. Increasing competencies in such areas as face recognition, speech recognition, translation among languages, and semi-autonomous vehicles have been met with enthusiasm by people and organizations. However, there have also been rising discussions about the uses of the systems, especially in high-stakes areas like transportation and criminal justice, and on the broader influences of AI advances on people and society.

The Microsoft Research Faculty Summit 2017, The Edge of AI, kicked off today. We are hosting talks, panel discussions, and demos on recent developments. The program committee for the summit did a fabulous job bringing together an excellent and diverse group of folks. Participants include many long-term colleagues, as well as researchers just entering the field.

Presentations and discussions will cover a rich spectrum of topics that include advances in deep learning, reinforcement learning, and probabilistic graphical models—and approaches to intelligence that draw jointly on these and other methodologies. We’ll also be examining the state of the art and future of human-AI collaboration, machine reading, and models of integrative intelligence that jointly leverage speech recognition, conversational dialog, vision, and planning. Beyond technical methods, we’ll be discussing the ethical, legal, and societal issues around the influences of AI, including the importance of developing fair and accountable machine learning and classification. We’ll also explore directions where AI promises to have deep and beneficial impact, such as applications in agriculture, sustainability, accessibility, and biomedicine.

This morning, I kicked off the Faculty Summit with a talk on the challenges and opportunities with fielding AI advances in the open world. Before diving into my main presentation, I paused to describe our newly launched organization named Microsoft Research AI (MSR AI). We publicly announced MSR AI at an event in London last week.

MSR AI brings together about one hundred folks representing top talent across multiple important subdisciplines of AI. We’re focusing together on several aspirational pursuits, including tackling difficult and persistent AI challenges. Aspirations for MSR AI include developing a deeper understanding of methods that could support more general artificial intelligence and advancing methods aimed at augmenting human cognition and amplifying human ingenuity. We believe that working together with more coordination on our shared aspirations will be valuable in making progress.

In my talk this morning, I ended with a consideration of issues around responsibility with AI, people, and society. It’s important that computer scientists and other experts, including social scientists, psychologists, ethicists, lawyers, and economists, collaborate closely to understand, track, and provide guidance on the best paths forward for AI technologies. We need to seize the opportunity to harness these evolving technologies to enhance the quality of life and to empower people and organizations in new ways across the world. We have to be mindful about rough edges and adverse outcomes as we go, and be on alert for inadvertent effects of AI systems—even from those systems and methods we might be most optimistic about.

Beyond our research on these issues, Microsoft recently announced the formation of the Aether advisory panel. Aether is an acronym for AI and ethics in engineering and research. The Aether panel includes representatives from every division of the company and works to advise Satya Nadella and the senior leadership team of the company. The goal of the panel is to work across the company on best practices around research, engineering, and fielding of AI technologies, and to work to spot issues and potential abuses of AI before they start.

A number of efforts around the world are focused on leveraging AI advances to address important social and societal challenges. Many of our researchers have been deeply motivated by these possibilities and have developed some magical and promising approaches in numerous realms. We’ve recently rolled out several efforts. Last week, we made Seeing AI freely available. The application provides the sight-impaired with AI eyes that can help them to interpret scenes, recognize people and their emotions, and read signs, documents—and menus. We’ve also just announced the AI for Earth program, aiming the power of AI at feeding people, fighting climate change and maintaining biodiversity. In the realm of healthcare, we will be showcasing Project InnerEye, which brings an assistive tool to oncologists for fighting cancer.

A primary take-away from our Faculty Summit event: We’re continuing to build upon our 25 years of research and innovation in this area. We are working hard to address foundational AI problems, rapidly pursuing the application of innovations in real-world systems and services that can empower people in new ways, and investing in developing a deeper understanding of the influences of AI on people and society.

We hope you are able to join us for our live stream (www.microsoftfacultysummit.com) today and tomorrow.

Path Guide: A New Approach to Indoor Navigation
https://www.microsoft.com/en-us/research/blog/path-guide-new-approach-indoor-navigation/
Fri, 14 Jul 2017


[Photo: Montreal’s Underground City. Credit: Tourisme Montréal]

By Yuanchao Shu, Associate Researcher, and Börje Karlsson, Sr. Research Dev Lead, Microsoft Research

Mobile outdoor GPS navigation apps have proven to be lifesavers to countless people. With a smartphone in hand, it is easy to find your way to a destination, even in an unfamiliar city. However, it is still easy to get lost indoors, where GPS satellite signals are not accurately traceable for navigation applications.

How often have you had a hard time locating that new café inside a bustling shopping center where you were supposed to meet your friends? How often have you wandered the hallways of some office building trying desperately to find that meeting room you were supposed to be in? How convenient would it be to have an app, similar to the ones used outdoors, that works indoors? We at Microsoft Research Asia’s Cloud & Mobile Research group recently launched Path Guide, a research-based application that provides low-cost, plug-and-play indoor navigation services. Users can easily find the correct path to their destinations by simply following traces created by a “leader,” or user who has been to the location before.

[Image: Microsoft Path Guide app for Android devices]

Indoor Navigation Anywhere: Is It Beyond Reach?

Let’s start by examining why GPS cannot be relied upon for indoor navigation. The first step in GPS navigation is positioning. Essentially, the receiver chip of the GPS system running on a handheld device picks up positioning signals from satellites and calculates the coordinates of the receiving device. Signals from a GPS satellite have poor penetration and are often blocked by building walls. Moreover, even with accurate positioning results, map information that most navigation applications depend on is not widely available for indoor scenarios. Real-time GPS-based indoor navigation is therefore out of the question.

So, what are the current applicable ideas for indoor navigation? A relatively well-known navigation approach is based on Bluetooth beacon positioning. Taking Apple’s iBeacon as an example, a smartphone app can roughly work out the device’s location on a map via signals from one or more iBeacons. Based on this information, the app then can calculate a route and navigate the user to her destination. However, this solution only works in buildings where iBeacons exist and the limited Bluetooth transmission range results in high costs for deployment and maintenance in large-sized indoor environments (shopping malls and office buildings, for instance).

Another popular indoor navigation approach is built upon Wi-Fi-based positioning. Wi-Fi signals are more commonly found in indoor environments than Bluetooth beacons. Similar to the Bluetooth method, this type of solution determines the approximate position of mobile devices through radio frequency (RF) signal characteristics and triangulation processes. Different positioning systems rely on signal strength, signal phase, transmission time, RF angle of arrival, channel state information, etc., but in general, they all leverage the differences and correlations between various Wi-Fi signals to determine position.

These systems can also use signal propagation models and learning algorithms to build a fingerprint map of indoor areas, and then train the system with, let’s say, radio signal strength information for positioning. However, due to the complexity of indoor environments, Wi-Fi signals are easily affected by interference and can fluctuate widely. Keeping Wi-Fi signal data up to date can lead to high maintenance costs. Additionally, positioning accuracy is limited by other factors such as the deployment density of Wi-Fi routers, how often the indoor environment changes, and the effort required to train and calibrate the system.
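
To make the fingerprinting idea concrete, here is a minimal, generic illustration (not Path Guide’s approach, which deliberately avoids positioning): a k-nearest-neighbor match of a live received-signal-strength (RSSI) scan against a surveyed fingerprint map. The coordinates and signal values are invented.

```python
import numpy as np

# Offline phase: record signal strengths from known access points at
# surveyed spots. Online phase: match a live scan against the map.
fingerprint_db = {
    # (x, y) in meters -> RSSI in dBm from 3 access points (made up)
    (0.0, 0.0): [-40, -70, -80],
    (5.0, 0.0): [-55, -60, -75],
    (0.0, 5.0): [-50, -72, -65],
    (5.0, 5.0): [-65, -58, -60],
}

def locate(scan, k=2):
    """Estimate position as the average of the k closest fingerprints."""
    spots = list(fingerprint_db)
    dists = [np.linalg.norm(np.array(scan) - np.array(fingerprint_db[s]))
             for s in spots]
    nearest = sorted(zip(dists, spots))[:k]
    return tuple(np.mean([s for _, s in nearest], axis=0))

print(locate([-52, -63, -72]))  # an (x, y) estimate between the surveyed spots
```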

There are also solutions based on dedicated equipment within various indoor locations, which require the deployment of a certain number of special-purpose sensing devices, including cameras, visible light communication systems, RFID, Ultra-Wideband (UWB), infrared, ultrasound, or even laser-based instruments. These solutions can greatly improve system accuracy, but widespread deployments are heavily constrained by high hardware and labor costs.

Finally, indoor navigation generally relies on indoor maps, but map collection, data representation and data manipulation in large-sized indoor spaces are outstanding and costly issues, placing a huge question mark over the universal application of indoor navigation technologies. And for smaller buildings, owners may not have the means to collect and expose the necessary data.

So, how can we achieve low-cost, plug-and-play, scalable indoor navigation?

Path Guide: A Flexible Solution to Indoor Navigation

Taking the above knowledge into account, we turned our eyes to the smartphones that everyone uses. Could we rely only on what’s already available there?

After generations of upgrades, mobile phones today have an increasing variety of sensors, such as accelerometers, gyroscopes, electronic compasses, barometers, etc. Based on our previous research on making the most of such smartphone sensors and on using sensor data for indoor navigation, we decided to switch approaches. Instead of doing positioning first, why not focus only on navigation, as that’s our goal? Experiments have shown that the indoor geomagnetic field is distorted by building structures in location-specific ways, and that these distortions are relatively stable over time. This gave us the idea of creating an indoor navigation system based on the magnetic sensor data gathered from different locations, while leveraging the other phone sensors to support real-time navigation instructions.

Combining the teams’ knowledge and expertise in mobile computing, pervasive computing, and intelligent sensing, we developed Path Guide. The Path Guide Android-based mobile app is user-friendly and can be installed directly onto a user’s smartphone, without the need for indoor maps or for the building to have any special pre-installed hardware (including Wi-Fi routers).

To make the system work for anyone, we developed a peer-to-peer leader/follower model. Once a user goes to an indoor location using the app to record sensor data along a path, any other user can follow that path and get there. As more users collect data, different paths can be combined that make the system even more useful. This approach has two main benefits: First, the system is completely plug-and-play. Any two users in any building can use indoor navigation from scratch. Second, by combining data from multiple users, we can amplify the benefits of every single collected path, providing more navigation opportunities to more people with improved user experience.
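
As a simplified illustration of the follower side, the sketch below matches a follower’s recent magnetometer readings against a leader’s recorded trace using a sliding-window comparison to estimate progress along the path. This is a toy stand-in under strong assumptions (1-D magnetic magnitudes, roughly constant walking speed); Path Guide’s actual algorithms also fuse the other phone sensors.

```python
import numpy as np

def match_progress(leader_trace, follower_window):
    """Estimate where along the leader's trace the follower currently is.

    leader_trace: 1-D array of magnetic field magnitudes recorded by the
    leader along the path. follower_window: the follower's most recent
    magnitude readings. Returns the index into leader_trace whose
    neighborhood best matches the window (sliding L2 comparison).
    """
    w = len(follower_window)
    costs = [np.linalg.norm(leader_trace[i:i + w] - follower_window)
             for i in range(len(leader_trace) - w + 1)]
    return int(np.argmin(costs)) + w - 1  # index of the current position

# Synthetic example: the leader's trace is a wavy field profile; the
# follower re-observes a chunk of it with added sensor noise.
rng = np.random.default_rng(0)
leader = np.sin(np.linspace(0, 8 * np.pi, 400)) + 0.1 * rng.standard_normal(400)
follower = leader[150:180] + 0.05 * rng.standard_normal(30)
print(match_progress(leader, follower))  # close to 179
```
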
The Path Guide app can be used in many scenarios. For example, if you are going to a large office building for the first time to attend a client meeting, a colleague who knows where the meeting room is can act as a “guide leader” and, using Path Guide, record a trace from the entrance of the building to the meeting room.

After arriving at the meeting room, the “leader” clicks the “finish recording” button to upload the path data trace to Path Guide’s backend in the cloud. Anyone who subsequently enters the building for the meeting and is using Path Guide can then follow the shared trace step-by-step to easily locate the correct room.

In more open areas, like shopping malls, any person or even shop owners can act as “guide leader” and share path traces from multiple locations (e.g., different entrances) to get to a certain destination, such as a relatively hidden restaurant or a clothing store.

Users of Path Guide can also record a trace and follow it backwards to its starting point. For instance, in an unfamiliar garage, you can record a trace from your parking spot to the elevator, and later follow it in reverse to find your car.

Another Path Guide feature is its support of annotations during trace recording. Text, audio, and photos can be added along a path, providing more information and interactivity.

Moreover, all traces that are uploaded to the cloud can be viewed from a web browser and shared with others using a unique trace ID. This way, shop owners can post wayfinding instructions on their own websites, and meeting coordinators can attach a route to an email meeting request.

Path Guide is a research project and admittedly still has rough edges. We hope you’ll download the app. Any feedback on improvements to the app UI, its usability, or on dealing with problematic situations is greatly welcomed by the research team. You can either search for Path Guide in Google Play or download it directly from the project’s official website.

Transportation Data Science at Microsoft
https://www.microsoft.com/en-us/research/blog/transportation-data-science-microsoft/
Thu, 13 Jul 2017


[Photo: West Hub Steering Committee Member Professor Kristin Tufte moderates a panel at the launch of the National Transportation Data Challenge in Seattle, May 2017]

By Vani Mandava, Director, Data Science Outreach, Microsoft Research

The National Science Foundation (NSF)-supported Big Data Innovation Hubs launched a National Transportation Data Challenge with a kickoff event in Seattle in May 2017. Microsoft Outreach, through its partnership with the Big Data Hubs, organized an Azure workshop and participated in a panel discussion on ‘How Cloud Computing Can Enable Transportation Data Science.’ The kickoff was the first in a series of events being organized across the US to launch this challenge. It is an activity that spans all four hubs and is expected to reach all 50 states. Several teams across Microsoft contributed ideas on recent or ongoing work in transportation data science. Below is a summary of all the contributions that were part of the event.

  • Microsoft’s engagement with the Challenge builds upon a foundation of prior work in public safety and metro data science. The Challenge launch event highlighted a collaboration between Microsoft’s Civic Technology Engagement (CTE) group within the Corporate, External and Legal Affairs (CELA) team and DataKind, Vision Zero, the New York City Department of Transportation, the Seattle Department of Transportation, and the City of New Orleans’ Office of Performance and Accountability. The project enabled an ecosystem that helped cities assign limited resources to prioritized traffic safety issues. Adam Hecktman and Kevin Wei from the CELA CTE team also built a cool interactive Power BI dashboard that visualizes more than 300 million bike rides in the city of Chicago.
  • Microsoft Research’s Video Analytics Towards Vision Zero was represented on the panel by Franz Loewenherz, City of Bellevue, and was mentioned by both Daniel Morgan (Chief Data Officer, USDOT), and former governor, Chris Gregoire. On June 1st, Bellevue officially launched the Video Analytics Towards Vision Zero crowdsourcing initiative. In a collaboration with organizations across North America, Bellevue, the University of Washington and Microsoft are asking for the public’s help analyzing traffic camera footage to teach computers how to identify and track people using wheelchairs, bikes, and other modes of transportation as they navigate intersections. The more people who go online, the better we can “teach” computers to scan traffic videos and recognize near-collision events (see City of Bellevue Media Release). Microsoft Research scientists leading this effort are Victor Bahl and Ganesh Ananthnarayanan.
  • Wee Hyong Tok, Principal Data Science Manager in the Cloud AI Platform group, built an Azure Machine Learning predictive model for incident severity reporting based on the National Highway Traffic Safety Administration (NHTSA) Fatality Analysis Reporting System (FARS) data. The model has an accuracy of 68% and can serve as a baseline for participants. Additionally, Patrick Baumgartner and the Power BI team built a compelling interactive Power BI visualization based on the dataset that not only demonstrates various analyses and correlations but also dives deeper into visualizing point of impact and seating position.
  • Transportation data science efforts extend beyond the United States. Andrew Bradley, Principal Solution Specialist on the Microsoft UK Enterprise and Partner Group (EPG), shared how the UK team is engaged with the Department for Transport and is actively encouraging innovation in the region by supporting events, hackathons and challenges.
  • Microsoft’s Connected Vehicle Platform recognizes the digital transformation that is reshaping the automotive industry (100% of new cars by 2030 are projected to be connected) and is investing in building extensible, global, and scalable automotive solutions in partnership with organizations such as Nissan, Volvo, and BMW.

We look forward to engaging with the transportation data science community as the National Transportation Data Challenge takes shape over the coming months.

Find out how humans and machines are collaborating at the 2017 Microsoft Research Faculty Summit
https://www.microsoft.com/en-us/research/blog/humans-machines-collaborating-2017-microsoft-research-faculty-summit/
Thu, 13 Jul 2017

By Christopher Bishop, Program Co-Chair of Faculty Summit, Technical Fellow & Laboratory Director, Microsoft Research Cambridge

The development of machine intelligence that amplifies human capabilities and experiences is at the heart of our AI research at Microsoft, which is why I’m delighted by the tremendous lineup of keynotes and panels focused on human-machine collaboration at the 2017 Microsoft Research Faculty Summit – The Edge of AI, July 17-18.

The shift toward building machines that are smart enough to collaborate with people as capable partners and assistants is a recent development in the history of AI, one that’s being pushed by the proliferation of computing devices in every imaginable facet of life. With computers everywhere, it’s important that they’re clever enough to work with us in groups as well as individually, Barbara J. Grosz from Harvard University will explain in her talk on July 17 at 9:10am.

Other human-machine collaboration talks and sessions will highlight how researchers are pushing the boundaries of AI to augment the capabilities of people with sensory disabilities, enabling new and empowering experiences. The AI for Earth initiative illustrates how the embrace of AI can enhance human efforts to mitigate and adapt to environmental and social challenges such as climate change, biodiversity loss, and food and water scarcity.

To help set the framework for the future of human-computer collaboration, a very thoughtful panel of distinguished AI experts will discuss the development and deployment of future AI systems that partner with people on complex and open-ended tasks. Microsoft’s Ece Kamar will chair the panel, which includes Microsoft’s Eric Horvitz along with Subbarao Kambhampati of Arizona State University and Milind Tambe of the University of Southern California. The panel starts at 2:00pm on July 18.

For those of you unable to attend the 2017 Microsoft Research Faculty Summit in person, I encourage you to watch the livestream of keynotes, speakers and Research in Focus interview segments. I’m particularly excited about a July 18 livestreamed talk by Amy Greenwald of Brown University on efforts to build AI agents that make effective decisions in multiagent – part human, part artificial – environments. Her research is currently being applied to renewable energy markets and wireless spectrum auctions.

Visit www.microsoftfacultysummit.com for more information and the full virtual event agenda.

I look forward to seeing you at the event in-person or virtually as we all get together to discuss the collaboration of humans and machines at the Edge of AI.

Related:

Watch the livestream of the 2017 Microsoft Research Faculty Summit—The Edge of AI

What problems will we solve with a quantum computer?
https://www.microsoft.com/en-us/research/blog/problems-will-solve-quantum-computer/
Wed, 05 Jul 2017

New paper suggests quantum computers will address problems that could have substantial scientific and economic impact

[Image: The MoFe protein, left, and the FeMoco, right. Both could be analyzed with a quantum computer to help reveal the complex chemical system behind nitrogen fixation by the enzyme nitrogenase.]

With rapid recent advances in quantum technology, we have drawn ever closer to the threshold of quantum devices whose computational powers can exceed those of classical supercomputers.

But when a useful, scalable general-purpose quantum computer arrives, what problems will it solve?

Much work has already been done towards identifying areas where quantum computing provides a clear improvement over traditional classical approaches. Many suspect that quantum computers will one day revolutionize chemistry and materials science; the likely ability of quantum computers to predict specific properties of molecules and materials fits this outcome nicely.

However, a number of important questions remain. Not the least of these is the question of how exactly to use a quantum computer to solve an important problem in chemistry. The inability to point to a clear use case complete with resource and cost estimates is a major drawback. After all, even an exponential speedup may not lead to a useful algorithm if a typical, practical application requires an amount of time and memory that is beyond the reach of even a quantum computer.

Our paper, published earlier this week in the Proceedings of the National Academy of Sciences, confirms the feasibility of such a practical application, showing that a quantum computer can be employed to reveal reaction mechanisms in complex chemical systems, using the open problem of biological nitrogen fixation in nitrogenase as an example.

Today, we spend approximately 3 percent of the world’s total energy output on making fertilizer. This relies on a process developed in the early 1900s that is extremely energy intensive: the hydrogen it requires is obtained from natural gas, which is consumed in very large amounts. However, we know that tiny anaerobic bacteria in the roots of plants perform this same process every day at very low energy cost using a specific enzyme, nitrogenase.

This enzyme is beyond the ability of our largest supercomputers to analyze, but would be within the reach of a moderate-scale quantum computer. Efficiently capturing carbon (to combat global warming) is in the same class of problem. The search for high-temperature superconductors is another example.

This paper shows that these kinds of necessary computations can be performed in reasonable time on realistic quantum computers—demonstrating that quantum computers will one day tackle important problems in chemistry without requiring exorbitant resources. This paper also gives us further confidence that quantum simulation will be able to provide answers to problems with a tremendous potential for scientific and economic impact.

Editor’s Note: The paper’s authors contributed to this post: Markus Reiher, Nathan Wiebe, Krysta Svore, Dave Wecker and Matthias Troyer.
