July 15, 2016

Software Engineering Mix Volume 2: Large-scale Data Analysis of Software Repositories

8:30 AM – 3:30 PM

Location: Redmond, WA, USA

Microsoft Conference Centre, Hood

Software Engineering Mix was part of the Microsoft Research Faculty Summit 2016.

Software Engineering Mix (SE-MIX) provided a forum for our colleagues from academia to interact directly with Microsoft engineers. The program featured two kinds of talks from academics: highlights of published research highly relevant to Microsoft, and blue-sky talks summarizing emerging research areas. In addition, practitioners presented the theoretical and pragmatic engineering challenges they face and solicited help from academia. A coffee round-table setting facilitated discussion. The session built on the success of SEIF Days, which provided a discussion forum about the future of software engineering.

The topic of this year’s SE-MIX was the large-scale data analysis of software repositories such as GitHub. Many teams use GitHub for their OSS projects and would like richer insight into that activity. Projects such as GHTorrent and GitHub Archive exist, and some insights are available for analyzing a single project, but everyone touching this topic still sees enormous potential in the data. The SE-MIX was intended to jump-start connections between academia and Microsoft around the vast opportunities in leveraging data from GitHub and other software repositories to develop software more efficiently.