Microsoft Research Blog

A new understanding of the world through grassroots Data Science education at UC Berkeley

March 9, 2017 | By Microsoft blog editor

By Vani Mandava, Director, Data Science, Microsoft Research

Students at UC Berkeley Foundations of Data Science Program

While some may regard data science as an easy passport to a job for the tech savvy, Luis Macias has different ideas. The fourth-year undergraduate student, who is majoring in American Studies at the University of California, Berkeley (UC Berkeley), wants to turn the hype of data science into hope for low-income communities like the one he grew up in.

Luis was among the first students to take UC Berkeley’s innovative Foundations of Data Science, an introductory data science course designed for freshman and sophomore students of all majors. The course is a key component of a multi-year university effort to forge a broader, more diverse, and inclusive scope for the emerging discipline of data science.

Recalling an assignment to gauge how water consumption data might relate to socioeconomic conditions, Luis explained how having the power to get and analyze data about his own ZIP Code ignited an understanding that data science can yield new insights capable of solving some of society’s most complex problems.

“The income level was around $25K, a number that was powerful for me in particular, because I think it explained a lot of the social issues and problems my community had,” he said.

Berkeley’s Data Science Education Program aims to make data science an integral feature of a liberal education and a core interdisciplinary capacity available to all Berkeley undergraduates. This is a bold experiment that will equip thousands of Berkeley students across campus with a fundamental education in data-driven thinking empowered by advanced statistical and computational techniques.

“The team of educators see their role as making it possible for students to bring to bear data science in all the ways they wish to use it in the world,” notes History professor Cathryn Carson. Carson is one of the faculty members leading the effort to build a diverse curriculum that includes advanced classes as well as connector courses that provide a bridge between familiar academic subjects and newly available data science techniques. “The energy and enthusiasm of students in the courses clearly demonstrate that the initiative will put data science to work in a breadth of domains that serve society, and UC Berkeley will play a particularly powerful role as a public university in this new data rich era.”

This year, the program is enabling more than a thousand students across 56 different undergraduate majors to learn the critical computational and analytical skills demanded by the half million data science jobs projected by 2018. That’s a lot of potentially unfilled jobs, an opportunity highlighted in numerous media accounts over the past couple of years. In 2015, Forbes wrote about the urgent need for qualified data science workers who bring different skills, expertise, and experiences to the discipline. Some of them will no doubt emerge from Berkeley’s unique program of connector courses, which draw students from a diverse range of skills and disciplines.

To succeed, the program had to be accessible to students beyond the realm of computer science. One way the program does this is through a flexible and scalable technology infrastructure that enables students to quickly set up labs for hands-on practice—they don’t have to spend time installing programs or learning nuances of complicated applications.

“By hosting it in Azure, we can control the environment,” said Ryan Lovett, Systems Manager for the Department of Statistics at UC Berkeley. “Students just log in and they’re ready to go.”

David Culler, professor of Electrical Engineering and Computer Sciences at UC Berkeley, believes the program can extend computational thinking to benefit more disciplines. He anticipates the program will equip students with the ability to extract their own insights from the world’s information and build tools that benefit people in society. He likens the ability to understand an increasingly complex world to a new form of perception — combining mathematical thinking and the arts with computational tools for new forms of expression.

Data science projects in the program span classic and new problems, such as music genre classification, text analysis of famous literary works, identifying insights from bike sharing data in San Francisco, and analyzing jury selection in Alameda County.

Berkeley prides itself on being a place where the world’s brightest minds explore, ask questions, and improve the world. Thanks to the Data Science Education Program, thousands of Berkeley students are better critical thinkers.

Note: Microsoft partners closely with UC Berkeley in support of its Data Science Education Program. Since 2015, Microsoft Research, through its Azure for Research program, has provided $235K in Azure credits to enable the Foundations of Data Science course, along with $260K in research credits and $5K in training credits. Microsoft has also provided $75K in unrestricted gift funding toward UC Berkeley’s Data Program.
