Big Data with Stratosphere

Date

May 23, 2013

Speaker

Volker Markl

Affiliation

Technische Universität Berlin

Overview

The talk will present a programming model for big data analytics, with a particular focus on our research on a massively parallel data processor in the Stratosphere project. We will present a new flavor of data processor that goes beyond the popular map/reduce paradigm. We propose a programming model based on second-order functions that describe what we call parallelization contracts (PACTs). PACTs generalize the map/reduce programming model, extending it with additional higher-order functions and with output contracts that give guarantees about the behavior of a function. A PACT program is transformed into a data flow for a massively parallel execution engine, which executes its sequential building blocks in parallel and provides communication, synchronization, and fault tolerance. The concept of PACTs allows the system to separate parallelization from the specification of the data flow and thus enables several types of optimizations on the data flow. The system as a whole is as generic as map/reduce systems, but can provide higher performance through optimization and through adaptation to changes in the execution environment. Moreover, it enables the execution of tasks such as joins, time-series analysis, or data mining operations, which traditional map/reduce systems cannot execute without mixing data flow program specification and parallelization. We will present our research vision and the research results we have achieved over the past year. We will also highlight our research agenda for the upcoming year.
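To make the idea of second-order functions more concrete, the following is a minimal single-machine sketch in Python. It is not the Stratosphere API: the names map_pact, reduce_pact, and match_pact, the (key, value) record layout, and the sample data are all assumptions made for illustration. Each second-order function takes a user-supplied first-order function and decides how the input is grouped before that function is called, which is the information a parallel engine could use to partition the work.

```python
from collections import defaultdict

# Hypothetical second-order functions; names and record layout are
# illustrative assumptions, not the Stratosphere API.

def map_pact(user_fn, records):
    """Call user_fn independently on every (key, value) record."""
    out = []
    for key, value in records:
        out.extend(user_fn(key, value))
    return out

def reduce_pact(user_fn, records):
    """Call user_fn once per key with all values that share that key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    out = []
    for key, values in groups.items():
        out.extend(user_fn(key, values))
    return out

def match_pact(user_fn, left, right):
    """Call user_fn once per pair of records with equal keys (an equi-join)."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    out = []
    for key, left_value in left:
        for right_value in right_by_key.get(key, []):
            out.extend(user_fn(key, left_value, right_value))
    return out

# Made-up sample data: (customer_id, amount) and (customer_id, name).
orders = [(1, 30.0), (2, 12.5), (1, 7.5)]
customers = [(1, "Ada"), (2, "Grace")]

# Sum the order amounts per customer, then join the totals with the names.
totals = reduce_pact(lambda cid, amounts: [(cid, sum(amounts))], orders)
joined = match_pact(lambda cid, total, name: [(name, total)], totals, customers)
print(sorted(joined))
```

In this sketch the join is expressed by a dedicated second-order function instead of being encoded by hand inside map and reduce steps, which is the kind of separation between data flow specification and parallelization that the abstract describes.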

Speakers

Volker Markl

Dr. Markl is currently working for the Bavarian Research Center for Knowledge-Based Systems (FORWISS) in Munich. He is heading an international research effort to investigate, together with industry partners, the application of multidimensional access methods to relational database systems. The participating partners are SAP AG, NEC, Hitachi, Teijin Systems Technology, TransAction Software, GfK, the European Union, and Microsoft. The FORWISS team includes four researchers and an average of ten master's students of the Munich University of Technology.

Volker Markl is a graduate of the Munich University of Technology. He completed his Ph.D. thesis in Computer Science in March 1999 under the supervision of Rudolf Bayer. His dissertation was titled “Relational Queries Processing Using a Multidimensional Access Technique.” He earned a degree in Business Administration from the University of Hagen, Germany, in 1995. His research interests focus on physical data modeling and query optimization, but also include data warehousing, electronic commerce, and web-based systems.

Dr. Markl’s professional experience includes work as a software engineer for a virology laboratory as part of his military service, as an instructor for a computer school, and as a consultant for a forwarding agency. He was awarded a sponsorship by Siemens AG, Munich, and also worked as an international intern with Benefit Panel Services, Los Angeles.