Microsoft Research Blog

Our open source commitment: The proof is in the projects

July 27, 2016 | By Microsoft blog editor

By Miran Lee, Principal Research Program Manager & Winnie Cui, Senior Research Program Manager, Microsoft Research Asia

Openness allows innovation to evolve in unforeseen, novel and exciting ways, and sometimes even provides solutions that no one ever imagined were possible.

Getting more done with crowdsourcing

One such innovation is GeoMission (geo-location-based mission), a crowdsourcing platform developed by Microsoft Research Asia (MSRA) and a team of researchers from the Hong Kong University of Science and Technology (HKUST). GeoMission lets users share and accept tasks based on where they are located.

Users submit location-based requests via GeoMission apps, which then push questions to other users near the target location (as long as they meet any additional criteria in the request).
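To make the matching step concrete, here is a minimal sketch of how a request might be routed to nearby users. It is an illustration only, not GeoMission’s actual implementation: the `User` class, the `tags` field standing in for "additional criteria," the `eligible_users` function and the use of a simple radius around the target location are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class User:
    # Hypothetical user record; field names are illustrative only.
    name: str
    lat: float
    lon: float
    tags: set = field(default_factory=set)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def eligible_users(users, target_lat, target_lon, radius_km, required_tags=frozenset()):
    """Return users within radius_km of the target who meet every required tag."""
    return [
        u
        for u in users
        if haversine_km(u.lat, u.lon, target_lat, target_lon) <= radius_km
        and set(required_tags) <= u.tags
    ]


# Example: a request near the HKUST campus that needs someone who can take a photo.
users = [
    User("a", 22.3370, 114.2660, {"photo"}),  # nearby, meets the criteria
    User("b", 22.2800, 114.1600, {"photo"}),  # meets the criteria, but too far away
    User("c", 22.3365, 114.2650, set()),      # nearby, but lacks the required tag
]
matches = eligible_users(users, 22.3364, 114.2655, radius_km=1.0, required_tags={"photo"})
print([u.name for u in matches])
```

A real deployment would likely use a spatial index rather than scanning every user, but the filter above captures the idea: location first, then any extra criteria attached to the request.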

Project owner Professor Lei Chen of HKUST introduces GeoMission to an audience

Developed for iOS and Android clients, the GeoMission server platform allows users to initiate requests by audio, video, photo or plain old texting.

All of GeoMission’s source code is hosted on GitHub, which brings some critical benefits for a research-based project, like more people! Researchers can study in detail how users interact with the platform, and users can contribute directly to help make it better. Of course, making it open source extends the tools to the greatest possible number of spatial crowdsourcing researchers. Most importantly, we believe opening the source code helps us innovate faster and provide more ways to collaborate with other developers or just about anyone else who’s interested in the project. You can find more details about the project at HKUST’s website.

Improving datacenter efficiency with Vortex

In the same spirit of openness, we’ve worked with Professor Byung-Gon Chun from Seoul National University (SNU) to develop Vortex in an effort to address the problem of wasted resources at datacenters. Tapping these sometimes vast computing resources, which remain largely unused outside of peak usage, represents a huge opportunity to improve datacenter efficiency and save energy.

Although current resource managers like Google’s Borg system and Apache Mesos attempt to reclaim idle resources for other tasks, they largely fall short when reclaimed resources are inevitably preempted by latency-critical tasks. The more aggressively the resources are reclaimed, the more frequently they’re preempted due to conflict, which makes them transient: available only until a higher-priority task takes them back. The upshot of all this is that current data processing systems that rely on transient resources cannot efficiently complete jobs.

Vortex, on the other hand, maintains high performance despite frequent preemptions. It was developed by SNU grad students Yunseong Lee and Youngseok Yang during their internship at MSRA, and the pair are continuing to work on Vortex after returning to school. Joining the project is SNU undergraduate student Geon-Woo Kim along with contributors from other institutions and Microsoft.

Vortex team at SNU (from left to right): Geon-Woo Kim, Youngseok Yang, Byung-Gon Chun, and Yunseong Lee

Experimental evaluations have been conducted on Microsoft Azure to measure the Vortex system’s effectiveness. The results show that Vortex can scale out much better with frequently preempted transient resources than Apache Spark. In certain cases, Apache Spark failed to complete jobs.

Hosted on GitHub, Vortex has been developed as an application of Apache REEF, an open source library for big data applications, in what has since proved to be a mutually beneficial project. Vortex has succeeded in leveraging the Apache methods of growing open source projects: development issues were openly discussed and pull requests were thoroughly reviewed. Meanwhile, the Apache REEF community was able to closely observe how Vortex uses Apache REEF as well as learn about the overall Vortex requirements.

Vortex and GeoMission, as well as other projects like them, clearly have the potential to succeed in the marketplace. However, we believe that releasing them as open source opens the way to greater long-term value for the global community of researchers and developers whose collaborative efforts can sometimes trigger unimaginable breakthroughs. At Microsoft Research Asia, we see a future that includes many more opportunities to collaborate with the open source community, to the benefit of all.
