A new understanding of the world through grassroots Data Science education at UC Berkeley
By Vani Mandava, Director, Data Science, Microsoft Research
While some may regard data science as an easy passport to a job for the tech savvy, Luis Macias has different ideas. The fourth-year undergraduate student, who is majoring in American Studies at the University of California, Berkeley (UC Berkeley), wants to turn the hype of data science into hope for low-income communities like the one he grew up in.
Luis was among the first students to take UC Berkeley’s innovative Foundations of Data Science, an introductory data science course designed for freshman and sophomore students of all majors. The course is a key component of a multi-year university effort to forge a broader, more diverse, and inclusive scope for the emerging discipline of data science.
Recalling an assignment to gauge how water consumption data might relate to socioeconomic conditions, Luis explained how having the power to get and analyze data about his own ZIP Code ignited an understanding that data science can yield new insights capable of solving some of society’s most complex problems.
“The income level was around $25K, a number that was powerful for me in particular, because I think it explained a lot of the social issues and problems my community had,” he said.
Berkeley’s Data Science Education Program aims to make data science an integral feature of a liberal education and a core interdisciplinary capacity available to all Berkeley undergraduates. This is a bold experiment that will equip thousands of Berkeley students across campus with a fundamental education in data-driven thinking empowered by advanced statistical and computational techniques.
“The team of educators see their role as making it possible for students to bring to bear data science in all the ways they wish to use it in the world,” notes History professor Cathryn Carson. Carson is one of the faculty members leading the effort to build a diverse curriculum that includes advanced classes as well as connector courses that provide a bridge between familiar academic subjects and newly available data science techniques. “The energy and enthusiasm of students in the courses clearly demonstrate that the initiative will put data science to work in a breadth of domains that serve society, and UC Berkeley will play a particularly powerful role as a public university in this new data rich era.”
This year, the program is enabling more than a thousand students across 56 different undergraduate majors to learn critical computational and analytical skills demanded by the projected half-million jobs in data science by 2018. That’s a lot of potentially unfilled jobs, an opportunity highlighted in numerous media accounts over the past couple of years. In 2015, Forbes wrote about the urgent need for qualified data science workers who bring different skills, expertise, and experiences to the discipline. Some of them will no doubt emerge from Berkeley’s unique program of connector courses, which bring together students from a diverse range of skills and disciplines.
To succeed, the program had to be accessible to students beyond the realm of computer science. One way the program does this is through a flexible and scalable technology infrastructure that enables students to quickly set up labs for hands-on practice, so they don’t have to spend time installing programs or learning the nuances of complicated applications.
“By hosting it in Azure, we can control the environment,” said Ryan Lovett, Systems Manager for the Department of Statistics at UC Berkeley. “Students just log in and they’re ready to go.”
David Culler, professor of Electrical Engineering and Computer Sciences at UC Berkeley, believes the program can extend computational thinking to benefit more disciplines. He anticipates the program will equip students with the ability to extract their own insights from the world’s information and build tools that benefit people in society. He likens the ability to understand an increasingly complex world to a new form of perception — combining mathematical thinking and the arts with computational tools for new forms of expression.
Data science projects in the program span classic and new problems, including music genre classification, text analysis of famous literary works, identifying insights from bike sharing data in San Francisco, and analyzing jury selection in Alameda County.
Berkeley prides itself on being a place where the world’s brightest minds explore, ask questions, and improve the world. Thanks to the Data Science Education Program, thousands of Berkeley students are better critical thinkers.
Note: Microsoft partners closely with UC Berkeley in support of its Data Science Education Program. Since 2015, Microsoft Research, through its Azure for Research program, has provided $235K in Azure credits to enable the Foundations of Data Science course, along with $260K in research credits and $5K in training credits. Microsoft has also provided $75K in unrestricted gift funding towards UC Berkeley’s Data Program.