Microsoft Research Blog

Big-Data Analytics, from Theory to Systems

June 19, 2013 | Posted by Rob Knies

Computing today is generating and capturing a wealth of data previously unimaginable. Such information has great promise for unlocking some of society’s most elusive secrets, but how can those secrets be unearthed and identified?

That pursuit provided the impetus behind Big Data Analytics 2013, a first-ever workshop held at Microsoft Research Cambridge on May 23-24. More than 130 participants from academia and industry (including a strong contingent from the hosting lab as well as from Microsoft Research Redmond, Microsoft Research Silicon Valley, and Advanced Technology Labs Europe) gathered to discuss and identify the most important and challenging directions for the evolution of algorithms and systems for big data.

“The organization of the workshop was prompted by a surge of interest and activity in the area of big-data analytics,” says Milan Vojnovic, co-organizer of the event and senior researcher in the Cambridge Systems and Networking group, “including platforms for various kinds of processing, such as batch processing and querying of massive data sets, real-time analytics, streaming computations, and analytics on special data structures such as graphical data.

“The organization was also prompted by the rising activity in the big-data-analytics space across diverse communities, such as the theory of computation, working on the foundations of algorithms, and the systems community, working on the design of new platforms and infrastructures.”

The workshop was co-organized by Artur Czumaj, head of the Department of Computer Science at the University of Warwick, just outside of Coventry, U.K., and Jingren Zhou, partner development manager for the Bing Search Infrastructure team. They helped attract experts with varied backgrounds to discuss interesting challenges in big data.

“One of the goals was to bring together experts working in the area of big-data analytics to discuss the state-of-the-art research and the most important challenges for future research,” Vojnovic says, “bringing in one place those working on the theory side with those on the systems side who usually do not often meet.

“I think that this mix of profiles, which is rather unusual at standard conference venues, worked rather well and everybody appreciated and learned something new.”

The event featured three keynotes:

  • Surajit Chaudhuri, Microsoft distinguished scientist and managing director of Microsoft Research’s eXtreme Computing Group. His talk, titled Big Data and Enterprise Analytics, discussed key trends that characterize the field of big data with respect to enterprise requirements.
  • Sanjeev Khanna, Henry Salvatori Professor of Computer and Information Science at the University of Pennsylvania, who spoke on Fast Algorithms for Perfect Matchings in Regular Bipartite Graphs. Khanna explained how a sequence of improvements over the years has culminated in a linear-time algorithm for finding a perfect matching in a regular bipartite graph.
  • Gerhard Weikum, research director at the Max Planck Institute for Informatics, located in Saarbrücken, Germany. In a keynote called From Text to Entities and from Entities to Insight: a Perspective on Unstructured Big Data, Weikum addressed the huge amounts of valuable content in the form of speech and text produced by news, social media, websites, and enterprise sources.

“Another goal,” Vojnovic says, “was to serve as a summit for researchers across Microsoft Research’s worldwide labs working in this area, with a strong participation from Microsoft and universities’ computer-science and other departments.”

That goal certainly seems to have been met. In addition to the keynotes, the workshop featured 17 presentations, ranging from big-data analytics in life sciences to foundations of algorithms for large-scale graph analysis. Posters were on display, and attendees got an opportunity to browse through a set of technical demonstrations. A highlight of the second day was a panel discussion called Big-Data Analytics: A Happy Marriage of Systems and Theory?, moderated by Graham Cormode of the University of Warwick and featuring Chaudhuri, Sudipto Guha of the University of Pennsylvania, Sergei Vassilvitskii of Google, and Zhou.

“The top-level takeaway for attendees was that big-data analytics is an area where important innovations can happen by a joint effort of the theory and systems community,” Vojnovic says. “It was appreciated that there is a need for developing suitable abstractions both in analyzing important theoretical problems, as well on the side of computation and programming.

“The event reconfirmed my belief that impactful research and innovation would result from a marriage of systems and theory. The event turned out to be a great success, and I am looking forward to new editions.”