Big-Data Analytics, from Theory to Systems

Published June 19, 2013

Posted by Rob Knies

Computing today is generating and capturing a wealth of data previously unimaginable. Such information has great promise for unlocking some of society’s most elusive secrets, but how can those secrets be unearthed and identified?

That pursuit provided the impetus behind Big Data Analytics 2013, a first-ever workshop held at Microsoft Research Cambridge on May 23-24. More than 130 participants from academia and industry (including a strong contingent from the hosting lab, Microsoft Research Redmond, Microsoft Research Silicon Valley, and Advanced Technology Labs Europe) gathered to discuss and identify the most important and challenging directions for the evolution of algorithms and systems for big data.

“The organization of the workshop was prompted by a surge of interest and activity in the area of big-data analytics,” says Milan Vojnovic, co-organizer of the event and senior researcher in the Cambridge Systems and Networking group, “including platforms for various kinds of processing, such as batch processing and querying of massive data sets, real-time analytics, streaming computations, and analytics on special data structures such as graphical data.

“The organization was also prompted by the rising activity in the big-data-analytics space across diverse communities, such as the theory of computation, working on the foundations of algorithms, and the systems community, working on the design of new platforms and infrastructures.”

The workshop was co-organized by Artur Czumaj, head of the Department of Computer Science at the University of Warwick, just outside of Coventry, U.K., and Jingren Zhou, partner development manager for the Bing Search Infrastructure team. They helped attract experts with varied backgrounds to discuss interesting challenges in big data.

“One of the goals was to bring together experts working in the area of big-data analytics to discuss the state-of-the-art research and the most important challenges for future research,” Vojnovic says, “bringing in one place those working on the theory side with those on the systems side who usually do not often meet.

“I think that this mix of profiles, which is rather unusual at standard conference venues, worked rather well and everybody appreciated and learned something new.”

The event featured three keynotes, from:

Surajit Chaudhuri, Microsoft distinguished scientist and managing director of Microsoft Research’s eXtreme Computing Group. His talk, titled Big Data and Enterprise Analytics, discussed key trends that characterize the field of big data with respect to enterprise requirements.
Sanjeev Khanna, Henry Salvatori Professor of Computer and Information Science at the University of Pennsylvania, who spoke on Fast Algorithms for Perfect Matchings in Regular Bipartite Graphs. Khanna explained how a sequence of improvements over the years has culminated in a linear-time algorithm for finding a perfect matching in a regular bipartite graph.
Gerhard Weikum, research director at the Max Planck Institute for Informatics, located in Saarbrücken, Germany. In a keynote called From Text to Entities and from Entities to Insight: a Perspective on Unstructured Big Data, Weikum addressed the huge amounts of valuable content in the form of speech and text produced by news, social media, websites, and enterprise sources.

“Another goal,” Vojnovic says, “was to serve as a summit for researchers across Microsoft Research’s worldwide labs working in this area, with a strong participation from Microsoft and universities’ computer-science and other departments.”

That goal certainly seems to have been met. In addition to the keynotes, the workshop featured 17 presentations, ranging from big-data analytics in life sciences to foundations of algorithms for large-scale graph analysis. Posters were on display, and attendees got an opportunity to browse through a set of technical demonstrations. A highlight of the second day was a panel discussion called Big-Data Analytics: A Happy Marriage of Systems and Theory?, moderated by Graham Cormode of the University of Warwick and featuring Chaudhuri, Sudipto Guha of the University of Pennsylvania, Sergei Vassilvitskii of Google, and Zhou.

“The top-level takeaway for attendees was that big-data analytics is an area where important innovations can happen by a joint effort of the theory and systems community,” Vojnovic says. “It was appreciated that there is a need for developing suitable abstractions both in analyzing important theoretical problems, as well on the side of computation and programming.

“The event reconfirmed my belief that impactful research and innovation would result from a marriage of systems and theory. The event turned out to be a great success, and I am looking forward to new editions.”
