Data Driven Student Feedback for MOOCs: Global Scale Education for the 21st Century


March 19, 2014


In recent years, an increasing number of students have turned to online resources, such as massive open online courses (MOOCs), for learning. But while these online courses give teachers more coverage, student-teacher ratios can often be ten thousand to one or worse. With such ratios, students no longer get the type of feedback they need to really understand the material. Codewebs is a system I have been developing that addresses the problem of scalability in providing student feedback for programming-intensive online courses. Codewebs analyzes a massive corpus of historical student code submissions and uses it to provide instant, useful, and detailed feedback to tens of thousands of students in the same course. Because the approach is statistical, the quality of feedback increases as the system sees more data, and the feedback is automatically tailored to each assignment. I will present a novel data-driven technique for discovering shared “parts” among multiple student submissions, a problem that is complicated by the fact that there are always many ways to accomplish the same functionality in code. Throughout, I will demonstrate results on Coursera’s Machine Learning course, which received over 1 million code submissions in its first run.

Finally, I will highlight the emerging issues of scalability and sustainability in education, explain why these issues require insight from computer scientists, and discuss specific problems in this domain that my future research program will address.


Jonathan Huang

Jonathan Huang is an NSF Computing Innovation (CI) postdoctoral fellow in the Geometric Computing Group at Stanford University. He completed his Ph.D. in 2011 with the School of Computer Science at Carnegie Mellon University, where he also received a Master's degree in 2008. He received his B.S. degree in Mathematics from Stanford University in 2005. His research interests lie primarily in statistical machine learning and reasoning with combinatorially structured data, with applications such as analyzing real-world education data. His research has resulted in a number of publications in premier machine learning conferences and journals, including a paper award at NIPS 2007 for his work on applying group-theoretic Fourier analysis to probabilistic reasoning with permutations.