Microsoft Research Blog

A new understanding of the world through grassroots Data Science education at UC Berkeley

March 9, 2017 | By Microsoft blog editor

By Vani Mandava, Director, Data Science, Microsoft Research

Students at UC Berkeley Foundations of Data Science Program

While some may regard data science as an easy passport to a job for the tech-savvy, Luis Macias has different ideas. The fourth-year undergraduate student, who is majoring in American Studies at the University of California, Berkeley (UC Berkeley), wants to turn the hype of data science into hope for low-income communities like the one he grew up in.

Luis was among the first students to take UC Berkeley’s innovative Foundations of Data Science, an introductory data science course designed for freshman and sophomore students of all majors. The course is a key component of a multi-year university effort to forge a broader, more diverse, and inclusive scope for the emerging discipline of data science.

Recalling an assignment to gauge how water consumption data might relate to socioeconomic conditions, Luis explained how having the power to get and analyze data about his own ZIP Code ignited an understanding that data science can yield new insights capable of solving some of society’s most complex problems.

“The income level was around $25K, a number that was powerful for me in particular, because I think it explained a lot of the social issues and problems my community had,” he said.

Berkeley’s Data Science Education Program aims to make data science an integral feature of a liberal education and a core interdisciplinary capacity available to all Berkeley undergraduates. This is a bold experiment that will equip thousands of Berkeley students across campus with a fundamental education in data-driven thinking empowered by advanced statistical and computational techniques.

“The team of educators see their role as making it possible for students to bring to bear data science in all the ways they wish to use it in the world,” notes History professor Cathryn Carson. Carson is one of the faculty members leading the effort to build a diverse curriculum that includes advanced classes as well as connector courses that provide a bridge between familiar academic subjects and newly available data science techniques. “The energy and enthusiasm of students in the courses clearly demonstrate that the initiative will put data science to work in a breadth of domains that serve society, and UC Berkeley will play a particularly powerful role as a public university in this new data rich era.”

This year, the program is enabling more than a thousand students across 56 different undergraduate majors to learn the critical computational and analytical skills demanded by the half million data science jobs projected by 2018. That is a lot of potentially unfilled jobs, and the opportunity has been highlighted in numerous media accounts over the past couple of years. In 2015, Forbes wrote about the urgent need for qualified data science workers who bring different skills, expertise, and experiences to the discipline. Some of them will no doubt emerge from Berkeley’s unique program of connector courses, which draw students from a diverse range of skills and disciplines.

To succeed, the program had to be accessible to students beyond the realm of computer science. One way the program does this is through a flexible and scalable technology infrastructure that enables students to quickly set up labs for hands-on practice. They don’t have to spend time installing programs or learning the nuances of complicated applications.

“By hosting it in Azure, we can control the environment,” said Ryan Lovett, Systems Manager for the Department of Statistics at UC Berkeley. “Students just log in and they’re ready to go.”

David Culler, professor of Electrical Engineering and Computer Sciences at UC Berkeley, believes the program can extend computational thinking to benefit more disciplines. He anticipates the program will equip students with the ability to extract their own insights from the world’s information and build tools that benefit people in society. He likens the ability to understand an increasingly complex world to a new form of perception, one that combines mathematical thinking and the arts with computational tools for new forms of expression.

Student data science projects in the program cover classic and new problems such as music genre classification, text analysis of famous literary works, identifying insights from bike sharing data in San Francisco, and analyzing jury selection in Alameda County.

Berkeley prides itself on being a place where the world’s brightest minds explore, ask questions, and improve the world. Thanks to the Data Science Education Program, thousands of Berkeley students are better critical thinkers.

Note: Microsoft partners closely with UC Berkeley in support of its Data Science Education Program. Since 2015, Microsoft Research, through its Azure for Research program, has provided $235K in Azure credits to enable the Foundations of Data Science course, along with $260K in research credits and $5K in training credits. Microsoft also provided $75K in unrestricted gift funding towards UC Berkeley’s Data Program.
