Machine Learning Workshop – Session 1 – Carlos Guestrin – “GraphLab: Large-scale Machine Learning on Natural Graphs”
- Carlos Guestrin | University of Washington
GraphLab: Large-scale Machine Learning on Natural Graphs
Today, machine learning (ML) methods play a central role in industry and science. The growth of the Web and improvements in sensor data collection technology have been rapidly increasing the magnitude and complexity of the ML tasks we must solve. This growth is driving the need for scalable, parallel ML algorithms that can handle “Big Data.” In this talk, I’ll first present some recent advances in large-scale algorithms for tackling such huge problems.
Unfortunately, implementing efficient parallel ML algorithms is challenging. Existing high-level parallel abstractions such as MapReduce and Pregel are insufficiently expressive to achieve the desired performance, while low-level tools such as MPI are difficult to use, leaving ML experts repeatedly solving the same design challenges. I will also describe the GraphLab framework, which naturally expresses the asynchronous, dynamic graph computations that are key to state-of-the-art ML algorithms. When these algorithms are expressed in our higher-level abstraction, GraphLab effectively addresses many of the underlying parallelism challenges, including data distribution, optimized communication, and guaranteed sequential consistency, a property that is surprisingly important for many ML algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x performance improvements over Hadoop. In recent months, GraphLab has received thousands of downloads and is being actively used by a number of startups, companies, research labs and universities.
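To make "asynchronous, dynamic graph computation" concrete, here is a minimal single-threaded sketch in Python. It is not GraphLab's API and not the speaker's implementation: the function name `dynamic_pagerank`, the adjacency-dict input format, and the choice of PageRank as the example workload are all assumptions made for illustration. It shows the two ideas named in the abstract: each vertex update reads the most recent values of its neighbors (asynchronous), and a vertex is rescheduled only when one of its inputs changed noticeably (dynamic).

```python
from collections import deque


def dynamic_pagerank(out_edges, damping=0.85, tol=1e-6):
    """Toy vertex-program loop (hypothetical helper, not a GraphLab call).

    out_edges maps every vertex to the list of vertices it links to.
    Every vertex must appear as a key, even if its list is empty.
    """
    vertices = list(out_edges)

    # Build the reverse adjacency so each vertex can gather from in-neighbors.
    in_edges = {v: [] for v in vertices}
    for u, targets in out_edges.items():
        for v in targets:
            in_edges[v].append(u)

    rank = {v: 1.0 for v in vertices}
    queue = deque(vertices)      # scheduler: vertices waiting for an update
    scheduled = set(vertices)    # avoids queueing the same vertex twice
    updates = 0

    while queue:
        v = queue.popleft()
        scheduled.discard(v)

        # Gather: read the latest ranks of in-neighbors, with no global barrier.
        total = sum(rank[u] / len(out_edges[u]) for u in in_edges[v])
        new_rank = (1.0 - damping) + damping * total
        change = abs(new_rank - rank[v])
        rank[v] = new_rank
        updates += 1

        # Dynamic scheduling: wake out-neighbors only if this vertex moved.
        if change > tol:
            for w in out_edges[v]:
                if w not in scheduled:
                    scheduled.add(w)
                    queue.append(w)

    return rank, updates


# Small hypothetical graph: a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, updates = dynamic_pagerank(graph)
print(ranks, updates)
```

In a sequential loop like this one, consistency is trivial because only one update runs at a time. The harder problem the abstract refers to is running many such updates in parallel while producing a result equivalent to some sequential order, which this sketch does not attempt.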
This talk represents joint work with Yucheng Low, Joey Gonzalez, Aapo Kyrola, Jay Gu, Danny Bickson, and Joseph Bradley.
Speaker Details
Carlos Guestrin’s current research spans the areas of planning, reasoning and learning in uncertain dynamic environments, focusing on applications in sensor networks. He is an assistant professor in the Machine Learning and Computer Science Departments at Carnegie Mellon University. Previously, he was a senior researcher at the Intel Research Lab in Berkeley. Carlos received his MSc and PhD in Computer Science from Stanford University in 2000 and 2003, respectively, and a Mechatronics Engineer degree from the Polytechnic School of the University of Sao Paulo, Brazil, in 1998. He received best paper awards at the Knowledge Discovery and Data Mining (KDD-2007), Information Processing in Sensor Networks (IPSN-2005 and IPSN-2006), Very Large Data Bases (VLDB-2004), and Neural Information Processing Systems (NIPS-2003) conferences, runner-up best paper awards at the Uncertainty in Artificial Intelligence (UAI-2005) and Machine Learning (ICML-2005) conferences, and the 2007 IJCAI-JAIR Best Paper Prize from the Journal of Artificial Intelligence Research (JAIR). He is also a recipient of the NSF Career Award, the Alfred P. Sloan Fellowship, the IBM Faculty Fellowship, the Siebel Scholarship and the Stanford Centennial Teaching Assistant Award.
Jeff Running