Hadoop Architecture and its usage at Facebook
- Dhruba Borthakur | Facebook
This talk introduces the origin of the Hadoop project and gives an overview of the Hadoop File System architecture. It then describes the Hadoop environment at Facebook: the hardware and software configuration of our Hadoop cluster, the size and volume of our datasets, the characteristics of our jobs, and the processes we have built on top of Hadoop to keep the data pipeline alive and active.
Speaker Details
Dhruba Borthakur is the Project Lead for the open source Apache Hadoop Distributed File System. He has been associated with Hadoop almost since its inception, when he was working for Yahoo. He currently works for Facebook in Palo Alto, California. Earlier, he was a Senior Lead Engineer at Veritas Software (Symantec), where he was responsible for the design and development of software for the Veritas SAN File System. He was the Team Lead for developing the Mendocino Continuous Data Protection Software Appliance at a startup named Mendocino Software. Prior to Mendocino Software, he was the Chief Architect at Oreceipt.com, an e-commerce startup based in Sunnyvale, California. Earlier, he was a Senior Engineer at IBM-Transarc Labs, where he was responsible for the development of the Andrew File System (AFS), which is part of IBM’s e-commerce initiative WebSphere. Prior to his experience in the United States, Dhruba developed call processing software for Digital Switching Systems at C-DOT Delhi. Dhruba has an M.S. in Computer Science from the University of Wisconsin, Madison and a B.S. in Computer Science from the Birla Institute of Technology and Science (BITS), Pilani, India. He has 10 issued patents and 11 patents pending.