Microsoft Research Blog

The Microsoft Research blog provides in-depth views and perspectives from our researchers, scientists and engineers, plus information about noteworthy events and conferences, scholarships, and fellowships designed for academic and scientific communities.

A new understanding of the world through grassroots Data Science education at UC Berkeley

March 9, 2017 | By Microsoft blog editor

By Vani Mandava, Director, Data Science, Microsoft Research

Students at UC Berkeley Foundations of Data Science Program

While some may regard data science as an easy passport to a job for the tech savvy, Luis Macias has different ideas. The fourth-year undergraduate, who is majoring in American Studies at the University of California, Berkeley (UC Berkeley), wants to turn the hype of data science into hope for low-income communities like the one he grew up in.

Luis was among the first students to take UC Berkeley’s innovative Foundations of Data Science, an introductory data science course designed for freshman and sophomore students of all majors. The course is a key component of a multi-year university effort to forge a broader, more diverse, and inclusive scope for the emerging discipline of data science.

Recalling an assignment to gauge how water consumption data might relate to socioeconomic conditions, Luis explained how having the power to get and analyze data about his own ZIP Code ignited an understanding that data science can yield new insights capable of solving some of society’s most complex problems.

“The income level was around $25K, a number that was powerful for me in particular, because I think it explained a lot of the social issues and problems my community had,” he said.

Berkeley’s Data Science Education Program aims to make data science an integral feature of a liberal education and a core interdisciplinary capacity available to all Berkeley undergraduates. This is a bold experiment that will equip thousands of Berkeley students across campus with a fundamental education in data-driven thinking empowered by advanced statistical and computational techniques.

“The team of educators see their role as making it possible for students to bring to bear data science in all the ways they wish to use it in the world,” notes History professor Cathryn Carson. Carson is one of the faculty members leading the effort to build a diverse curriculum that includes advanced classes as well as connector courses that provide a bridge between familiar academic subjects and newly available data science techniques. “The energy and enthusiasm of students in the courses clearly demonstrate that the initiative will put data science to work in a breadth of domains that serve society, and UC Berkeley will play a particularly powerful role as a public university in this new data rich era.”

This year, the program is enabling more than a thousand students across 56 different undergraduate majors to learn critical computational and analytical skills demanded by the projected half million jobs in data science by 2018. That’s a lot of potentially unfilled jobs, an opportunity highlighted in numerous media accounts over the past couple of years. In 2015, Forbes wrote about the urgent need for qualified data science workers who bring different skills, expertise, and experiences to the discipline. Some of them will no doubt emerge from Berkeley’s unique program of connector courses, which bring together students with a diverse range of skills and disciplines.

To succeed, the program had to be accessible to students beyond the realm of computer science. One way the program does this is through a flexible and scalable technology infrastructure that enables students to quickly set up labs for hands-on practice—they don’t have to spend time installing programs or learning nuances of complicated applications.

“By hosting it in Azure, we can control the environment,” said Ryan Lovett, Systems Manager for the Department of Statistics at UC Berkeley. “Students just log in and they’re ready to go.”

David Culler, professor of Electrical Engineering and Computer Sciences at UC Berkeley, believes the program can extend computational thinking to benefit more disciplines. He anticipates the program will equip students with the ability to extract their own insights from the world’s information and build tools that benefit people in society. He likens the ability to understand an increasingly complex world to a new form of perception — combining mathematical thinking and the arts with computational tools for new forms of expression.

Such data science projects include classic and new problems like music genre classification, text analysis of famous literary works, identifying insights from bike sharing data in San Francisco, or analyzing jury selection in Alameda County.

Berkeley prides itself on being a place where the world’s brightest minds explore, ask questions, and improve the world. Thanks to the Data Science Education Program, thousands of Berkeley students are better critical thinkers.

Note: Microsoft partners closely with UC Berkeley in support of its Data Science Education Program. Since 2015, Microsoft Research, through its Azure for Research program, has provided $235K in Azure credits to enable the Foundations of Data Science course, along with $260K in research credits and $5K in training credits. Microsoft also provided $75K in unrestricted gift funding towards UC Berkeley’s Data Program.
