Nominations wanted: Microsoft Research PhD Fellowship Program 2018
https://www.microsoft.com/en-us/research/blog/applicants-wanted-microsoft-research-phd-fellowship-program-2018/ (Tue, 05 Sep 2017)

By Sandy Blyth, Managing Director

At Microsoft Research, we are on the lookout for exceptional students to apply for our two-year PhD fellowship program. Our fellowships are for students in computer science, electrical engineering and mathematics, as well as interdisciplinary studies intersecting with those domains such as computational biology, social sciences and economics. We encourage department heads at universities in the United States and Canada to start preparing applications now to nominate outstanding fellows for the 2018-2019 academic year.

Nominated PhD students must be in their second or third year of studies. Our award committee is particularly interested in students who are working on theses related to Systems & Networking or AI, emphasizing the disciplines of machine learning, computer vision and robotics.

This coveted fellowship provides 100 percent of tuition and fees for two consecutive academic years and provides an annual stipend of $28,000 plus $4,000 annually for professional conferences and seminars. All of our fellows are also offered the opportunity to intern with leading Microsoft researchers who are working on cutting-edge projects related to their fields of study.

The Microsoft Research PhD Fellowship Program has supported 122 fellows since the program was established in 2008. Many of our past fellows continue to work with us at Microsoft; others have gone on to perform pioneering research elsewhere within the technology industry or accept faculty appointments at leading universities. A sampling of past fellows includes:

  • Eric Chung, Carnegie Mellon University, 2009-2010. Now a core member of Project Catapult that is deploying Field Programmable Gate Arrays (FPGAs) in Microsoft datacenters worldwide to accelerate efforts in networking, security, cloud services and artificial intelligence.
  • Rashmi Vinayak, University of California, Berkeley, 2013-2014. Now an assistant professor in the computer science department at Carnegie Mellon University where her research interests lie in computer and networked systems with a focus on big data systems.
  • Yoav Artzi, University of Washington, 2014-2015. Now an assistant professor in the computer science department at Cornell University where his research interests lie at the intersection of natural language processing and machine learning.

Applicants to the PhD Fellowship Program must be nominated by the chair of the computer science, electrical engineering or mathematics department at their university. We’ll accept up to three applicants per eligible department, per university, for a maximum of nine applications per university. Applications will be accepted between October 2 and October 16, 2017. Finalists will be invited to Microsoft for in-person interviews. We’ll announce the fellowship awards in January 2018.

Learn more about the Microsoft Research PhD Fellowship Program

Microsoft unveils Project Brainwave for real-time AI
https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/ (Tue, 22 Aug 2017)

By Doug Burger, Distinguished Engineer, Microsoft

Today at Hot Chips 2017, our cross-Microsoft team unveiled a new deep learning acceleration platform, codenamed Project Brainwave.  I’m delighted to share more details in this post, since Project Brainwave achieves a major leap forward in both performance and flexibility for cloud-based serving of deep learning models. We designed the system for real-time AI, which means the system processes requests as fast as it receives them, with ultra-low latency.  Real-time AI is becoming increasingly important as cloud infrastructures process live data streams, whether they be search queries, videos, sensor streams, or interactions with users.

The Project Brainwave system is built with three main layers:

  1. A high-performance, distributed system architecture;
  2. A hardware DNN engine synthesized onto FPGAs; and
  3. A compiler and runtime for low-friction deployment of trained models.

First, Project Brainwave leverages the massive FPGA infrastructure that Microsoft has been deploying over the past few years.  By attaching high-performance FPGAs directly to our datacenter network, we can serve DNNs as hardware microservices, where a DNN can be mapped to a pool of remote FPGAs and called by a server with no software in the loop.  This system architecture both reduces latency, since the CPU does not need to process incoming requests, and allows very high throughput, with the FPGA processing requests as fast as the network can stream them.

Second, Project Brainwave uses a powerful “soft” DNN processing unit (or DPU), synthesized onto commercially available FPGAs.  A number of companies—both large companies and a slew of startups—are building hardened DPUs.  Although some of these chips have high peak performance, they must choose their operators and data types at design time, which limits their flexibility.  Project Brainwave takes a different approach, providing a design that scales across a range of data types, with the desired data type being a synthesis-time decision.  The design combines both the ASIC digital signal processing blocks on the FPGAs and the synthesizable logic to provide a greater and more optimized number of functional units.  This approach exploits the FPGA’s flexibility in two ways.  First, we have defined highly customized, narrow-precision data types that increase performance without real losses in model accuracy.  Second, we can incorporate research innovations into the hardware platform quickly (typically a few weeks), which is essential in this fast-moving space.  As a result, we achieve performance comparable to – or greater than – many of these hard-coded DPU chips but are delivering the promised performance today.

At Hot Chips, Project Brainwave was demonstrated using Intel’s new 14 nm Stratix 10 FPGA.

Third, Project Brainwave incorporates a software stack designed to support the wide range of popular deep learning frameworks.  We already support Microsoft Cognitive Toolkit and Google’s TensorFlow, and plan to support many others.  We have defined a graph-based intermediate representation, to which we convert models trained in the popular frameworks, and then compile down to our high-performance infrastructure.

We architected this system to show high actual performance across a wide range of complex models, with batch-free execution.  Companies and researchers building DNN accelerators often show performance demos using convolutional neural networks (CNNs).  Since CNNs are so compute intensive, it is comparatively simple to achieve high performance numbers.  Those results are often not representative of performance on more complex models from other domains, such as LSTMs or GRUs for natural language processing.  Another technique that DNN processors often use to boost performance is running deep neural networks with high degrees of batching.  While this technique is effective for throughput-based architectures—as well as off-line scenarios such as training—it is less effective for real-time AI.  With large batches, the first query in a batch must wait for all of the many queries in the batch to complete.  Our system, designed for real-time AI, can handle complex, memory-intensive models such as LSTMs, without using batching to juice throughput.

At Hot Chips, Eric Chung and Jeremy Fowers demonstrated the Project Brainwave system ported to Intel’s new 14 nm Stratix 10 FPGA.

Even on early Stratix 10 silicon, the ported Project Brainwave system ran a large GRU model (five times larger than ResNet-50) with no batching, and achieved record-setting performance.  The demo used Microsoft’s custom 8-bit floating point format (“ms-fp8”), which does not suffer accuracy losses (on average) across a range of models.  We showed Stratix 10 sustaining 39.5 teraflops on this large GRU, running each request in under one millisecond.  At that level of performance, the Brainwave architecture sustains execution of over 130,000 compute operations per cycle, driven by one macro-instruction being issued every 10 cycles.  Running on Stratix 10, Project Brainwave thus achieves unprecedented levels of demonstrated real-time AI performance on extremely challenging models.  As we tune the system over the next few quarters, we expect significant further performance improvements.
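
As a rough consistency check on those figures: 39.5 teraflops is 39.5 × 10^12 operations per second, which at roughly 130,000 operations per cycle corresponds to about 3 × 10^8 cycles per second, an effective clock rate on the order of 300 MHz. This back-of-the-envelope arithmetic follows directly from the numbers above and is not a separate figure from the Hot Chips talk.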

We are working to bring this powerful, real-time AI system to users in Azure, so that our customers can benefit from Project Brainwave directly, complementing the indirect access through our services such as Bing.  In the near future, we’ll detail when our Azure customers will be able to run their most complex deep learning models at record-setting performance.  With the Project Brainwave system incorporated at scale and available to our customers, Microsoft Azure will have industry-leading capabilities for real-time AI.

 

Real world interactive learning at cusp of enabling new class of applications
https://www.microsoft.com/en-us/research/blog/real-world-interactive-learning-cusp-enabling-new-class-applications/ (Tue, 22 Aug 2017)

By Alekh Agarwal and John Langford, Microsoft Research New York

Clicks on Microsoft’s news website MSN.com increased 26 percent when a machine-learning system based on contextual-bandit algorithms was deployed in January 2016 to personalize news articles for individual users.

The same real world interactive learning technology is now available as a Microsoft Cognitive Service called the Custom Decision Service as well as via open source on GitHub. The core contextual-bandit algorithms are also available from Vowpal Wabbit, our team’s long-term fast-learning project.

We believe contextual-bandit learning is at the cusp of enabling a new class of applications just as new algorithms and large datasets powered breakthroughs in machine-learning tasks such as object recognition and speech recognition.

Tasks such as object recognition and speech recognition are based on a paradigm called supervised learning that uses tools such as classification and regression to successfully make predictions from large amounts of high-quality data. Crucially, each example is annotated with a label indicating the desired prediction such as the object contained in an image or the utterance corresponding to a speech waveform.

Supervised learning is a good fit for many practical questions, but fails in a far greater number of scenarios.

Consider, for example, a mobile health application that gives exercise recommendations to a user and judges the recommendation’s quality based on whether the recommendation was followed. Such feedback carries less information than a label in supervised learning; knowing that the user took an exercise recommendation does not immediately imply that it was the best one to give, nor do we find out whether a different recommendation would have been better.

A common work-around is to hire a labeler to label a few data points with the best recommendations. The success of such an approach relies on the ability of the labeler to intuit what the user might want, which is a tall order since the right answer might depend on factors such as how well the user slept the previous night or whether the user experienced a stressful commute.

Furthermore, the labeling approach ignores the readily available signal that the user provided by accepting or rejecting the recommendation. In addition, obtaining labels from a labeler can be a significant cost.

Exercise recommendation is a typical example of a scenario where supervised learning fails; many common applications exhibit a similar structure. Other examples include content recommendation for news, movies and products; displaying search results and ads; and building personalized text editors that autocorrect or autosuggest based on the history of a user and the task at hand.

Solving problems such as recommendation, ad and search result display, and personalization falls under the paradigm of interactive machine learning. In this paradigm, an agent first perceives the state of the world, then takes an action and observes the feedback. The feedback is typically interpreted as a reward for this action. The goal of the agent is to maximize its reward.

Reward-based learning is known as reinforcement learning. Recent reinforcement-learning successes include AlphaGo, a computer that beat the top-ranked human player of the ancient game of go, as well as computer agents that mastered a range of Atari video games such as Breakout.

Despite these game-playing breakthroughs, reinforcement learning remains notoriously hard to apply broadly across problems such as recommendation, ad and search result display, and personalization. That’s because reinforcement learning typically requires careful tuning, which limits success to narrow domains such as game playing.

Earlier this month, we presented our Tutorial on Real World Interactive Learning at the 2017 International Conference on Machine Learning in Sydney, Australia. The tutorial describes a paradigm for contextual-bandit learning, which is less finicky than general purpose reinforcement learning and yet significantly more applicable than supervised learning.

Contextual-bandit learning has been a focus of several researchers currently in the New York and Redmond labs of Microsoft Research. This paradigm builds on the observation that the key challenge in reinforcement learning is that an agent needs to optimize long-term rewards to succeed.

For example, a reinforcement-learning agent must make a large number of moves in the game of go before it finds out whether the game is won or lost. Once the outcome is revealed, the agent has little information about the role each individual move played in this outcome, a challenge known as credit assignment.

Contextual bandits avoid the challenge of credit assignment by assuming that all the relevant feedback about the quality of an action taken is summarized in a reward. Crucially, the next observation revealed by the world to the agent is not influenced by the preceding action. This might happen, for instance, in recommendation tasks where the choice presented to one user does not affect the experience of the next user.
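
To make the setting concrete, the sketch below shows a minimal epsilon-greedy contextual bandit in Python. It is only an illustration of the interaction loop, not the algorithm behind the MSN deployment or Vowpal Wabbit; the contexts, actions, and reward rule are made-up placeholders.

import random

class EpsilonGreedyContextualBandit:
    """Minimal contextual bandit: one running reward estimate per (context, action)."""

    def __init__(self, actions, epsilon=0.1):
        self.actions = actions
        self.epsilon = epsilon
        self.totals = {}  # (context, action) -> summed reward
        self.counts = {}  # (context, action) -> number of observations

    def choose(self, context):
        # Explore with probability epsilon; otherwise exploit the action with
        # the highest estimated reward for this context.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self._estimate(context, a))

    def update(self, context, action, reward):
        key = (context, action)
        self.totals[key] = self.totals.get(key, 0.0) + reward
        self.counts[key] = self.counts.get(key, 0) + 1

    def _estimate(self, context, action):
        key = (context, action)
        return self.totals[key] / self.counts[key] if key in self.counts else 0.0

# Interaction loop: recommend an exercise, observe whether it was followed.
bandit = EpsilonGreedyContextualBandit(actions=["walk", "run", "yoga"])
for context in ["slept_well", "slept_poorly", "slept_well"]:
    action = bandit.choose(context)
    reward = 1.0 if context == "slept_well" and action == "run" else 0.0  # stand-in feedback
    bandit.update(context, action, reward)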

Over the past decade, researchers now at Microsoft Research have gained a mature understanding of several foundational concepts key to contextual-bandit learning including a concept we call multiworld testing–the ability to collect the experience of a contextual-bandit-learning agent and predict what would have happened if the agent had acted in a different manner.

The most prevalent method to understand what would have happened is A/B testing, where the agent acts according to a pre-determined alternative behavior B some fraction of the time, such as recommending exercises to some users using a different rule. In multiworld testing, we can evaluate an alternative B without the agent ever explicitly acting according to it, or even knowing what it is in advance.
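
The standard estimator behind this kind of counterfactual evaluation is inverse propensity scoring: if the deployed agent logs the probability with which it chose each action, the average reward of an alternative policy B can be estimated from that same log. The sketch below is a generic illustration of the idea, not the Custom Decision Service implementation, and the log fields are made-up names.

def ips_estimate(log, policy_b):
    """Inverse-propensity-scoring estimate of policy_b's average reward.

    `log` is a list of records, each holding the context, the action the
    deployed agent actually took, the observed reward, and the probability
    with which that action was chosen. Only events where policy_b agrees
    with the logged action contribute, reweighted by 1 / probability.
    """
    total = 0.0
    for event in log:
        if policy_b(event["context"]) == event["action"]:
            total += event["reward"] / event["probability"]
    return total / len(log)

# Example: evaluate an "always recommend a run" policy against an exploration log.
log = [
    {"context": "slept_well", "action": "run", "reward": 1.0, "probability": 0.5},
    {"context": "slept_poorly", "action": "yoga", "reward": 0.0, "probability": 0.25},
    {"context": "slept_well", "action": "walk", "reward": 0.0, "probability": 0.25},
]
print(ips_estimate(log, policy_b=lambda context: "run"))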

The science behind multiworld testing is well-established and now publicly available as a Microsoft Cognitive Service called the Custom Decision Service. The service applies to several of the scenarios discussed here including news recommendation, ad and search result display, and personalized text editors.

The Custom Decision Service is relatively easy to use. A few lines of JavaScript code suffice to embed the service into an application, along with registering the application on a portal and specifying, in the simplest case, an RSS feed for the content to be personalized.

The Custom Decision Service uses other Microsoft Cognitive Services to extract features from the content automatically as well as information about a user such as location and browser. The contextual-bandit algorithms come up with a recommendation based on the user information. Real-time online learning algorithms update the Custom Decision Service’s internal state with feedback on the decision.

In addition, an application developer has access to all the data they collect via an Azure account they specify on the sign-up portal, which allows them to leverage the multiworld testing capabilities of contextual bandits to do things such as discover better features.

An open-source version of the Custom Decision Service is available on GitHub, and the core contextual-bandit algorithms are available in Vowpal Wabbit.

Overall, contextual bandits fit many applications reasonably well and the techniques are mature enough that production-grade systems can be built on top of them for serving a wide array of applications. Just like the emergence of large datasets for supervised learning led to some practical applications, we believe the maturing of this area might be at the cusp of enabling a whole new class of applications.

Microsoft Researchers working on contextual bandit learning include John Langford, Rob Schapire, Miro Dudik, Alekh Agarwal and Alex Slivkins in the New York lab, and Sebastien Bubeck, Lihong Li and Adith Swaminathan in the Redmond lab.

Microsoft researchers achieve new conversational speech recognition milestone
https://www.microsoft.com/en-us/research/blog/microsoft-researchers-achieve-new-conversational-speech-recognition-milestone/ (Mon, 21 Aug 2017)

By Xuedong Huang, Technical Fellow, Microsoft

Last year, Microsoft’s speech and dialog research group announced a milestone in reaching human parity on the Switchboard conversational speech recognition task, meaning we had created technology that recognized words in a conversation as well as professional human transcribers.

After our transcription system reached the 5.9 percent word error rate that we had measured for humans, other researchers conducted their own study, employing a more involved multi-transcriber process, which yielded a 5.1 percent human parity word error rate. This was consistent with prior research that showed that humans achieve higher levels of agreement on the precise words spoken as they expend more care and effort. Today, I’m excited to announce that our research team reached that 5.1 percent error rate with our speech recognition system, a new industry milestone, substantially surpassing the accuracy we achieved last year. A technical report published this weekend documents the details of our system.

Switchboard is a corpus of recorded telephone conversations that the speech research community has used for more than 20 years to benchmark speech recognition systems. The task involves transcribing conversations between strangers discussing topics such as sports and politics.

We reduced our error rate by about 12 percent compared to last year’s accuracy level, using a series of improvements to our neural net-based acoustic and language models. We introduced an additional CNN-BLSTM (convolutional neural network combined with bidirectional long-short-term memory) model for improved acoustic modeling. Additionally, our approach to combine predictions from multiple acoustic models now does so at both the frame/senone and word levels.
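
As an illustration of what frame-level combination can look like, the sketch below averages per-frame senone posteriors from several acoustic models with numpy. The weights, shapes, and the simple weighted-average rule are placeholders for exposition, not the combination scheme documented in the technical report.

import numpy as np

def combine_senone_posteriors(posteriors, weights=None):
    """Combine per-frame senone posteriors from several acoustic models.

    `posteriors` is a list of arrays, one per model, each of shape
    (num_frames, num_senones). A weighted average is one simple frame-level
    combination rule; the combined scores are renormalized per frame.
    """
    stacked = np.stack(posteriors)                     # (num_models, T, S)
    if weights is None:
        weights = np.full(len(posteriors), 1.0 / len(posteriors))
    combined = np.tensordot(weights, stacked, axes=1)  # (T, S)
    return combined / combined.sum(axis=1, keepdims=True)

# Example: two acoustic models scoring 3 frames over 4 senones.
model_a = np.random.dirichlet(np.ones(4), size=3)
model_b = np.random.dirichlet(np.ones(4), size=3)
frame_scores = combine_senone_posteriors([model_a, model_b], weights=np.array([0.6, 0.4]))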

Moreover, we strengthened the recognizer’s language model by using the entire history of a dialog session to predict what is likely to come next, effectively allowing the model to adapt to the topic and local context of a conversation.

Our team also has benefited greatly from using the most scalable deep learning software available, Microsoft Cognitive Toolkit 2.1 (CNTK), for exploring model architectures and optimizing the hyper-parameters of our models. Additionally, Microsoft’s investment in cloud compute infrastructure, specifically Azure GPUs, helped to improve the effectiveness and speed by which we could train our models and test new ideas.

Reaching human parity, an accuracy on par with human transcribers, has been a research goal for the last 25 years. Microsoft’s willingness to invest in long-term research is now paying dividends for our customers in products and services such as Cortana, Presentation Translator, and Microsoft Cognitive Services. It’s deeply gratifying to our research teams to see our work used by millions of people each day.

Advances in speech recognition have created services such as Presentation Translator, which can translate presentations in real time for multilingual audiences.

Many research groups in industry and academia are doing great work in speech recognition, and our own work has greatly benefitted from the community’s overall progress. While achieving a 5.1 percent word error rate on the Switchboard speech recognition task is a significant achievement, the speech research community still has many challenges to address, such as achieving human levels of recognition in noisy environments with distant microphones, recognizing accented speech, and handling speaking styles and languages for which only limited training data is available. Moreover, we have much work to do in teaching computers not just to transcribe the words spoken, but also to understand their meaning and intent. Moving from recognizing to understanding speech is the next major frontier for speech technology.

Program that repairs programs: how to achieve 78.3 percent precision in automated program repair
https://www.microsoft.com/en-us/research/blog/program-repairs-programs-achieve-78-3-percent-precision-automated-program-repair/ (Fri, 04 Aug 2017)

By Lily Sun, Research Program Manager of Microsoft Research Asia

In February 2017, Microsoft and Cambridge University announced a DeepCoder algorithm that produces programs from problem inputs/outputs. DeepCoder, which operates on a novel yet greatly simplified programming language, cannot handle complex problems—general programming languages are still too hard for DeepCoder to master. So, currently, programmers don’t have to worry about being replaced by machines.

But programmers have plenty of other worries, including programming bugs. Could machines assist programmers by taking over the task of bug fixes?

To test this idea, researchers from Peking University, Microsoft Research Asia (MSRA) and University of Electronic Science and Technology of China (UESTC) have developed a new approach, called Accurate Condition System (ACS), to automatically repair defects in software systems without human intervention.

For example, consider the following piece of code from Apache Math, which is used to calculate the least common multiple of two numbers. This piece of code uses Math.abs to ensure the return value is positive. However, because of a defect, it may still return a negative result for some input values.

int lcm = Math.abs(mulAndCheck(a / gcd(a, b), b));
return lcm;

The root cause of this error is that there is one more negative number than there are positive numbers in the range of signed integers. As a result, when the value passed to Math.abs is Integer.MIN_VALUE, Math.abs cannot convert the input into a positive number, causing a negative return. A correct implementation should throw an ArithmeticException in such cases.

We could create a test to capture this fault. The input of the test is a=Integer.MIN_VALUE and b=1, and the expected output is to throw ArithmeticException. Obviously, the program fails the test because no exception will be thrown.

But when we pass this program and the corresponding tests to ACS, the following patch is generated, which precisely repairs the defect:

int lcm = Math.abs(mulAndCheck(a / gcd(a, b), b));
+ if (lcm == Integer.MIN_VALUE) {
+     throw new ArithmeticException();
+ }
return lcm;

This latest approach builds on a long line of earlier program repair work. Since GenProg in 2009, many different approaches to repairing programs have been proposed. However, these techniques share a significant problem: typically only a very small portion of the generated patches is correct; that is, patch generation has low precision. This is because traditional program repair approaches aim at passing all the tests, yet in real-world software systems there are only a limited number of tests, and passing them does not mean that the program is correct.

For example, current approaches may generate the following patch for the above program:

int lcm = Math.abs(mulAndCheck(a / gcd(a, b), b));
+ if (b == 1) {
+     throw new ArithmeticException();
+ }
return lcm;

Or even the following patch:

int lcm = Math.abs(mulAndCheck(a / gcd(a, b), b));
- return lcm;
+ throw new ArithmeticException();

All these patches can pass the test, but are far from being correct. In fact, we can easily construct hundreds of patches that pass the test in this example, yet few of them would be correct.

Based on the findings of Prof. Martin Rinard’s group at MIT, revealed in an ISSTA 2015 paper, the precision of mainstream program repair approaches is less than 10 percent. Though some improved approaches have been proposed, such as Prophet and Angelix, the precision of these approaches remains less than 40 percent. In other words, the patches generated by these approaches are mostly incorrect, rendering them difficult to use in practice.

The precision of ACS is a significant improvement over previous approaches. Based on the results on the Defects4J benchmark, ACS produced 23 patches, of which 18 (18/23, or about 78.3 percent) are correct, a significantly better result than existing approaches. In addition, the number of defects correctly repaired by ACS is also the highest among the approaches evaluated on Defects4J.

The key to ACS’s high precision is the use of multiple information sources, especially the “big code” existing on the Internet. Compared with existing techniques, ACS uses three new types of information sources.

  • First, the researchers noticed that the principle of locality holds for variable uses, and they apply this information to rank the variables to be used in the patches.
  • Second, ACS uses natural language analysis techniques to analyze Javadoc, and then uses the information in the Javadoc to filter out incorrect patches.
  • Last, and most importantly, ACS performs statistical analysis on open-source programs on the Internet, discovers the conditional probabilities of the operations over the variables, and uses them to generate correct patches.

In the above example, ACS first learns from the failed test that a conditional check throwing ArithmeticException is missing, and then uses the principle of locality to determine that the variable lcm should be used in the conditional check. Lastly, based on the conditional probability, it determines that “== Integer.MIN_VALUE” should be applied to lcm, generating the whole patch.
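
To make the third information source concrete, here is a toy Python sketch of ranking candidate guard conditions by how often they appear with a given predicate in a mined corpus. The counts and condition templates are invented for illustration; ACS’s actual mining, fault localization, and condition synthesis pipeline is described in the ICSE 2017 paper.

from collections import Counter

# Toy "big code" statistics: how often each guard predicate appears in mined
# open-source code when the guarded statement throws ArithmeticException.
# The counts below are made up for illustration.
guard_counts_given_throw = Counter({
    "x == Integer.MIN_VALUE": 40,
    "x == 0": 25,
    "x < 0": 20,
    "x == 1": 2,
})

def rank_candidate_guards(variable, counts):
    """Rank candidate conditions for `variable` by mined conditional frequency."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(cond.replace("x", variable), count / total) for cond, count in ranked]

# For the lcm example, the top-ranked condition is "lcm == Integer.MIN_VALUE".
for condition, probability in rank_candidate_guards("lcm", guard_counts_given_throw):
    print(f"{condition}: {probability:.2f}")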

Left: Shi Han, lead researcher, Microsoft Research Asia; Middle: Lily Sun, Research Program Manager, Microsoft Research Asia; Right: Prof. Yingfei Xiong, Peking University.

The paper describing ACS, “Precise Condition Synthesis for Program Repair,” was published at ICSE 2017. The authors include Yingfei Xiong, Jie Wang, Guang Huang and Lu Zhang from Peking University, Runfa Yan from UESTC and Shi Han from Microsoft Research Asia.

Creating intelligent water systems to unlock the potential of Smart Cities
https://www.microsoft.com/en-us/research/blog/creating-intelligent-water-systems-unlock-potential-smart-cities/ (Mon, 31 Jul 2017)

By Satish Sangameswaran, Principal Program Manager, and Vani Mandava, Director, Data Science

The newspaper headlines about “Bangalore’s looming water crisis” have been ominous, with one urban planning expert proclaiming that Bangalore will become “unlivable” in a few years because of water scarcity. This is a critical issue that threatens the future of one of India’s fastest-growing cities. In fact, water availability is a cause for worry for the entire country. According to an estimate by the Asian Development Bank, India will have a water deficit of 50 percent by the year 2030.

The primary source of water for Bangalore is the Cauvery river, which is located some 100 kilometers (about 62 miles) from the city. Because the monsoon does not always bring dependable rain, planners must maximize the efficiency of distribution of water from the source, and avoid depletion of the ground water table level. A key challenge in the area of water management in Bangalore is tracking consumption. Currently, a staggering one-third of the water pumped is unaccounted for, in terms of usage. This is the problem that a team of scientists at the Indian Institute of Science (IISc) is looking to solve. IISc is India’s oldest and among the most revered academic institutions, with a sprawling green campus in the heart of Bangalore. Under the aegis of a national initiative called Smart Cities, Professor Yogesh Simmhan and his fellow researchers have deployed an Internet of Things (IoT)-based network of sensors in the IISc campus to efficiently monitor the flow of water from source to consumption.

Using services from Microsoft Azure, the project harnesses the power of the cloud to collect and process data from the network of IoT sensors. An important facet of this effort is monitoring each node of this network to generate alerts and glean insights from the data. Azure Event Hubs enable functions such as receiving water quality incident alerts from specific locations via a smartphone app. Leveraging the advanced analytics capabilities available on Azure, the team can make decisions to ensure that available water is efficiently pumped to every building on campus. “We are looking at the Internet of Things as a technology platform for enabling smart cities…we use Microsoft Azure Cloud for collecting data, hosting it and processing it in the cloud,” says Simmhan, principal investigator for the project and an assistant professor of computational and data sciences at IISc.
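
As a sketch of how a gateway might publish such a water-quality alert, the snippet below uses the azure-eventhub Python package’s current client API; the connection string, event hub name, and payload fields are placeholders, and the project’s actual ingestion code is not shown in this post.

import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholder connection settings; a real deployment would read these from
# secure configuration rather than hard-coding them.
CONNECTION_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."
EVENTHUB_NAME = "water-quality-alerts"

def send_alert(location, metric, value):
    """Publish a single water-quality reading as an event."""
    producer = EventHubProducerClient.from_connection_string(
        conn_str=CONNECTION_STR, eventhub_name=EVENTHUB_NAME
    )
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"location": location, "metric": metric, "value": value})))
    producer.send_batch(batch)
    producer.close()

send_alert(location="hostel-block-3", metric="turbidity_ntu", value=7.2)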

 

The system consists of three components: physical sensors, networking components and the cloud. The sensing is done through microcontrollers that communicate wirelessly to a gateway that then sends the data to the cloud. The analytics are enabled through an Apache Storm-based service on Azure. All the edge devices are visible in the Azure resource directory, helping the team monitor hardware malfunctions or failures.  Rashmi B., a project assistant on the team, is responsible for the execution and deployment of the network. She aspires to ultimately bring this product to the end user, who should be able to log in from anywhere, monitor the health of the water resource network and intervene to prevent water leakage and avert crises.

The government of India envisions designating 100 cities in the country as ‘Smart Cities’. The idea is to deliver citizen services in these cities comparable to the best global standards. Providing citizens with clean and safe water in the most efficient manner that meets the needs of the population will be an important part of this mission. The water management project at the IISc campus can serve as a model showcase of water services provisioning, which can be scaled out to these 100 cities, across the whole country. It offers a beacon of hope towards the goal of managing and conserving precious water resources and makes a lasting contribution in improving the lives of a billion people.

Summer Institute unpacks the future of IoT
https://www.microsoft.com/en-us/research/blog/summer-institute-unpacks-future-iot/ (Mon, 31 Jul 2017)

By John Roach, Writer, Microsoft Research

Within the next 5 to 10 years, tens of billions of things will be connected to the internet. They’ll monitor rainfall in rain forests and engine performance in airplanes, guide robotic teachers around classrooms and robotic aides around nursing homes, and keep tabs on milk in refrigerators and fans at baseball games, along with millions of other scenarios yet to be imagined for this future Internet of Things.

Leading researchers from across industry and academia are meeting this week in Snoqualmie, Washington, for the UW Allen School MSR Summer Institute 2017: Unpacking the Future of IoT to hash out their vision for this interconnected world and a plan to turn this vision into reality.

“The way you create the future is you envision what that future is and then once you envision it you identify these hard, technical challenges that will get you there,” said Victor Bahl, distinguished scientist and director of mobile and networking research at Microsoft’s research lab in Redmond, who is a co-organizer of the institute. “Then you start working on them.”

For example, a key challenge for IoT is energy supply, noted Bahl. With billions of devices, batteries that require regular replacement are a nonstarter. Several research groups are exploring workarounds such as energy scavenging from the environment, allowing battery-less computation and communication. “We have some prototypes, but more has to be done,” said Bahl.

Other challenges include connectivity – how and when do these devices connect to the internet outside the range of Wi-Fi and without the overhead of cellular networks – device security, and ensuring the security and privacy of data. Another discussion is around models for the intelligent edge – putting smarts in IoT devices on the fringes of networks, which eases bandwidth constraints and latency issues.

At the core of the institute, added Bahl, is a belief that IoT is a collaborative field. Social activities are planned to encourage networking and cross-institute collaborations. All presentations from the institute will also be made available to the broader IoT community.

“We are starting to work on these problems as a community and in the process,” noted Bahl, “we’ll end up helping define the future as opposed to the future defining us.”

The UW Allen School MSR Summer Institute 2017: Unpacking the Future of IoT is organized by Shyam Gollakota, Yoshi Kohno, Shwetak Patel and Joshua R. Smith from the Paul G. Allen School of Computer Science and Engineering at the University of Washington and Victor Bahl at Microsoft Research.

AI with creative eyes amplifies the artistic sense of everyone
https://www.microsoft.com/en-us/research/blog/ai-with-creative-eyes-amplifies-the-artistic-sense-of-everyone/ (Thu, 27 Jul 2017)

By Gang Hua, Principal Researcher, Research Manager

Recent advances in the branch of artificial intelligence (AI) known as machine learning are helping everyone, including artistically challenged people such as myself, transform images and videos into creative and shareable works of art.

AI-powered computer vision techniques pioneered by researchers from Microsoft’s Redmond and Beijing research labs, for example, provide new ways for people to transfer artistic styles to their photographs and videos as well as swap the visual style of two images, such as the face of a character from the movie Avatar and Mona Lisa.

The style transfer technique for photographs, known as StyleBank, shipped this June in an update to Microsoft Pix, a smartphone application that uses intelligent algorithms published in more than 20 research papers from Microsoft Research to help users get great photos with every tap of the shutter button.

The field of style transfer research explores ways to transfer an artistic style from one image to another, such as the style of post-impressionism onto a picture of your flower garden. For applications such as Microsoft Pix, a challenge is to offer users multiple styles to choose from and the ability to transfer styles to their images quickly and efficiently.

Our solution, StyleBank, explicitly represents visual styles as a set of convolutional filter banks, with each bank representing one style. To transfer an image to a specific style, an auto-encoder decomposes the input image into multi-layer feature maps that are independent of any style. The corresponding filter bank for the chosen style is convolved with the feature maps, and the result then goes through a decoder to render the image in the chosen style.

The network completely decouples styles from the content. Because of this explicit representation, we can both train new styles and render stylized images more efficiently compared to existing offerings in this space.
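
For readers who want a picture of this structure, here is a minimal PyTorch sketch of the encoder / per-style filter bank / decoder arrangement described above. The layer depths, channel counts, and kernel sizes are illustrative placeholders rather than the published StyleBank architecture, and the training losses are omitted.

import torch
import torch.nn as nn

class StyleBankSketch(nn.Module):
    def __init__(self, num_styles, channels=128):
        super().__init__()
        # Encoder: image -> style-agnostic feature maps.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4), nn.ReLU(),
            nn.Conv2d(64, channels, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # One convolutional filter bank per style, applied to the shared features.
        self.style_banks = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3, padding=1) for _ in range(num_styles)]
        )
        # Decoder: feature maps -> image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, 64, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, kernel_size=9, stride=1, padding=4),
        )

    def forward(self, image, style_id=None):
        features = self.encoder(image)
        if style_id is not None:              # stylization branch
            features = self.style_banks[style_id](features)
        return self.decoder(features)         # style_id=None: plain auto-encoder branch

# Example: stylize a random 256x256 image with style number 2.
model = StyleBankSketch(num_styles=4)
stylized = model(torch.randn(1, 3, 256, 256), style_id=2)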

The StyleBank research is a collaboration between Beijing lab researchers Lu Yuan and Jing Liao, intern Dongdong Chen and me. We collaborated closely with the broader Microsoft Pix team within Microsoft’s research organization to integrate the style transfer feature with the smartphone application. Our team presented the work at the 2017 Conference on Computer Vision and Pattern Recognition July 21-26 in Honolulu, Hawaii.

We are also extending the StyleBank technology to render stable stylized videos in an online fashion. Our technique is described in a paper to be presented at the 2017 International Conference on Computer Vision in Venice, Italy, October 22-29.

Our approach leverages temporal information about feature correspondences between consecutive frames to achieve consistent and stable stylized video sequences in near real time. The technique adaptively blends feature maps from the previous frame and the current frame to avoid ghosting artifacts, which are prevalent in techniques that render videos frame-by-frame.

A third paper, which I co-authored with Jing Liao and Lu Yuan along with my Redmond colleague Sing Bing Kang for presentation at SIGGRAPH 2017, July 30 to August 2 in Los Angeles, describes a technique for visual attribute transfer across images with distinct appearances but perceptually similar semantic structure; that is, the images contain similar visual content.

For example, the technique can put the face of a character from the movie Avatar onto an image of Leonardo da Vinci’s famous painting of Mona Lisa and the face of Mona Lisa onto the character from Avatar. We call our technique deep image analogy. It works by finding dense semantic correspondences between two input images.

We look forward to sharing more details about these techniques to transform images and videos into creative and shareable works of art at the premier computer vision conferences this summer and fall.

Transfer learning for machine reading comprehension
https://www.microsoft.com/en-us/research/blog/transfer-learning-machine-reading-comprehension/ (Wed, 26 Jul 2017)

By Xiaodong He, Principal Researcher, Microsoft Research

For human beings, reading comprehension is a basic task, performed daily. As early as in elementary school, we can read an article, and answer questions about its key ideas and details.

But for AI, full reading comprehension is still an elusive goal, and a necessary one if we’re going to measure and achieve generally intelligent AI.  In practice, reading comprehension is necessary for many real-world scenarios, including customer support, recommendations, question answering, dialog and customer relationship management. It has incredible potential for situations such as helping a doctor quickly find important information amid thousands of documents, saving their time for higher-value and potentially life-saving work.

Therefore, building machines that are able to perform machine reading comprehension (MRC) is of great interest. In search applications, machine comprehension will give a precise answer rather than a URL that contains the answer somewhere within a lengthy web page. Moreover, machine comprehension models can understand specific knowledge embedded in articles that usually cover narrow and specific domains, where the search data that algorithms depend upon is sparse.

Microsoft is focused on machine reading and is currently leading a competition in the field. Multiple projects at Microsoft, including Deep Learning for Machine Comprehension, have also set their sights on MRC. Despite great progress, a key problem has been overlooked until recently: how to build an MRC system for a new domain.

Recently, several researchers from Microsoft Research AI, including Po-Sen Huang,  Xiaodong He and intern David Golub, from Stanford University, developed a transfer learning algorithm for MRC to attack this problem. Their work is going to be presented at EMNLP 2017, a top natural language processing conference. This is a key step towards developing a scalable solution to extend MRC to a wider range of domains.

It is an example of the progress we are making toward a broader goal we have at Microsoft: creating technology with more sophisticated and nuanced capabilities. “We’re not just going to build a bunch of algorithms to solve theoretical problems. We’re using them to solve real problems and testing them on real data,” said Rangan Majumder in the machine reading blog.

Currently, most state-of-the-art machine reading systems are built on supervised training data–trained end-to-end on data examples, containing not only the articles but also manually labeled questions about articles and corresponding answers. With these examples, the deep learning-based MRC model learns to understand the questions and infer the answers from the article, which involves multiple steps of reasoning and inference.

However, for many domains or verticals, this supervised training data does not exist. For example, if we need to build a new machine reading system to help doctors find important information about a new disease, there could be many documents available, but there is a lack of manually labeled questions about the articles and the corresponding answers. This challenge is magnified both by the need to build a separate MRC system for each different disease and by the rapidly increasing volume of literature. Therefore, it is of crucial importance to figure out how to transfer an MRC system to a new domain where no manually labeled questions and answers are available, but where there is a body of documents.

Microsoft researchers developed a novel model called “two stage synthesis network,” or SynNet, to address this critical need. In this approach, based on the supervised data available in one domain, the SynNet first learns a general pattern of identifying potential “interestingness” in an article. These are key knowledge points, named entities, or semantic concepts that are usually answers that people may ask for. Then, in the second stage, the model learns to form natural language questions around these potential answers, within the context of the article. Once trained, the SynNet can be applied to a new domain, read the documents in the new domain and then generate pseudo questions and answers against these documents. Then, it forms the necessary training data to train an MRC system for that new domain, which could be a new disease, an employee handbook of a new company, or a new product manual.

The idea of generating synthetic data to augment insufficient training data has been explored before. For example, for the target task of translation, Rico Sennrich and colleagues present a method in their paper to generate synthetic translations given real sentences to refine an existing machine translation system. However, unlike machine translation, for tasks like MRC, we need to synthesize both questions and answers for an article. Moreover, while the question is a syntactically fluent natural language sentence, the answer is mostly a salient semantic concept in the paragraph, such as a named entity, an action, or a number. Since the answer has a different linguistic structure than the question, it may be more appropriate to view answers and questions as two different types of data.

In our approach, we decompose the process of generating question-answer pairs into two steps: The answer generation conditioned on the paragraph and the question generation conditioned on the paragraph and the answer. We generate the answer first because answers are usually key semantic concepts, while questions can be viewed as a full sentence composed to inquire about the concept.
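
In probabilistic terms, this decomposition factors the joint distribution over question-answer pairs for a paragraph p as P(q, a | p) = P(a | p) · P(q | p, a): the answer a is generated first, conditioned on the paragraph, and the question q is then generated conditioned on both the paragraph and that answer. This notation is an expository gloss on the description above; the paper’s formal presentation may differ in detail.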

The SynNet is trained to synthesize the answer and the question of a given paragraph. The first stage of the model, an answer synthesis module, uses a bi-directional long short-term memory (LSTM) to predict inside-outside beginning (IOB) tags on the input paragraph, which mark out key semantic concepts that are likely answers. The second stage, a question synthesis module, uses a uni-directional LSTM to generate the question, while attending on embeddings of the words in the paragraph and IOB IDs. Although multiple spans in the paragraph could be identified as potential answers, we pick one span when generating the question.

Two examples of questions and answers generated by the SynNet from articles.

Using the SynNet, we were able to get more accurate results on a new domain without any additional training data, approaching the performance of a fully supervised MRC system.

The SynNet, trained on SQuAD (Wikipedia articles), performs almost as well on the NewsQA domain (news articles) as a system fully trained on NewsQA.

The SynNet is like a teacher, who, based on her experience in previous domains, creates questions and answers from articles in the new domain, and uses these materials to teach her students to perform reading comprehension in the new domain. Accordingly, Microsoft researchers also developed a set of neural machine reading models, including the recently developed ReasoNet that has shown a lot of promise, which are like the students who learn from the teaching materials to answer questions based on the article.

To our knowledge, this is the first attempt to apply MRC domain transferring. We are looking forward to developing scalable solutions that rapidly expand the capability of MRC to release the game-changing potential of machine reading!

Researchers build nanoscale computational circuit boards with DNA
https://www.microsoft.com/en-us/research/blog/researchers-build-nanoscale-computational-circuit-boards-dna/ (Mon, 24 Jul 2017)

By Microsoft Research

Human-engineered systems, from ancient irrigation networks to modern semiconductor circuitry, rely on spatial organization to guide the flow of materials and information. Living cells also use spatial organization to control and accelerate the transmission of molecular signals, for example by co-localizing the components of enzyme cascades and signaling networks. In a new paper published today by the journal Nature Nanotechnology, scientists at the University of Washington and Microsoft Research describe a method that uses spatial organization to build nanoscale computational circuits made of synthetic DNA. Called “DNA domino” circuits, they consist of DNA “domino” molecules that are positioned at regular intervals on a DNA surface. Information is transmitted when DNA dominoes interact with their immediate neighbors in a cascade.

For decades, scientists in the field of molecular programming have been studying how to use DNA molecules to compute. This includes developing algorithms that operate effectively at the molecular scale, and identifying fundamental principles of molecular computation. The components of these molecular devices are typically made from strands of synthetic DNA, where the sequence of the strands determines how they interact. Real-world applications of these devices could in the future include in vitro diagnostics of pathogens, biomanufacturing of materials, smart therapeutics and high-precision methods for imaging and probing biological experiments. So far, however, most of these devices have been designed to operate in a chemical soup, where billions of DNA molecules rely on the relatively slow process of random diffusion to bump into each other and execute a computational step. This limits the speed of the computation and the number of different components that can effectively be used. This is because the freely diffusing DNA molecules can collide with each other at random, so they must be carefully designed to avoid unintended computations when these random collisions occur.

DNA domino circuits represent an important advance. They were developed through a collaboration between Georg Seelig’s lab at the University of Washington in Seattle and Andrew Phillips’s Biological Computation group at Microsoft Research. Since DNA dominoes are positioned close to each other on a surface, they can quickly interact with their immediate neighbors without relying on random diffusion for each computational step. This can lead to an order of magnitude increase in speed compared to circuits where all the components are freely diffusing. In addition, DNA dominoes can be re-used in multiple locations with almost no interference since their physical location, in addition to their chemical specificity, determines what interactions can and cannot take place.

The scaffold that secures the DNA dominoes is assembled from hundreds of DNA strands using a technique called DNA origami that was first described in 2006. A long single strand of DNA, called the scaffold, is pinned into a rectangular shape by shorter DNA strands called staples. To build a nanoscale computational circuit on a DNA origami surface, individual DNA dominoes are incorporated into the origami during the folding process using special types of elongated staples. Each of these staples is precisely positioned on the same side of the origami scaffold, and folds over into a hairpin shape to form a DNA domino (see figure).

The researchers used this precise positioning to lay out the DNA dominoes into signal transmission lines, similar to lines of real dominoes, and elementary Boolean logic gates that compute the logical AND and OR of two inputs. By linking these elementary gates together, the researchers created more complex circuits such as a two-input dual-rail XNOR circuit (see figure), which can in principle be used as the building block for a molecular computer. Freely diffusing DNA strands act as inputs to the circuits, while a single type of DNA fuel strand powers the transmission of signals between neighboring DNA dominoes. The researchers constructed detailed computational models of their designs and used extensive experimental measurements to identify the model parameters and quantify their uncertainty. This modeling allowed the researchers to accurately predict the behavior of more complex circuits, speeding up the design process.
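
To see how an XNOR can be composed from AND and OR gates alone when every signal is carried on two rails, here is a small Python sketch of the standard dual-rail construction. It illustrates the logic only; the geometric layout of the corresponding DNA dominoes on the origami surface is described in the paper.

def AND(x, y):
    return x and y

def OR(x, y):
    return x or y

def dual_rail_xnor(a0, a1, b0, b1):
    """Dual-rail XNOR built only from AND/OR gates.

    Each logical input x is carried on two rails: x1 is asserted when x is
    true and x0 when x is false, so no explicit NOT gate is needed.
    Returns the (false_rail, true_rail) pair for XNOR(a, b).
    """
    true_rail = OR(AND(a1, b1), AND(a0, b0))   # asserted when a == b
    false_rail = OR(AND(a1, b0), AND(a0, b1))  # asserted when a != b
    return false_rail, true_rail

# Truth-table check: XNOR is true exactly when the two inputs agree.
for a in (False, True):
    for b in (False, True):
        assert dual_rail_xnor(not a, a, not b, b)[1] == (a == b)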

This new approach lays the groundwork for using spatial constraints in molecular engineering more broadly and could help bring embedded molecular control circuits closer to practical applications in biosensing, nanomaterial assembly and therapeutic DNA robots.

The research was funded by the National Science Foundation, Office of Naval Research and Microsoft Research.

The paper, “A spatially localized architecture for fast and modular DNA computing,” was published on Nature Nanotechnology’s website on July 24, 2017, and will appear at a later date in the print issue of the journal. In addition to Phillips and Seelig, co-authors are Gourab Chatterjee, a doctoral student at the University of Washington, Richard Muscat, formerly a postdoctoral associate at the University of Washington and currently at Cancer Research UK, and Neil Dalchau of Microsoft Research.

 
