In this paper, we present the results of two surveys on data science applied to software engineering. The first survey solicited questions that software engineers would like data scientists to investigate about software, software processes and practices, and software engineers. Our analysis resulted in a list of 145 questions grouped into 12 categories. The second survey asked a different pool of software engineers to rate the 145 questions and identify the most important ones to work on first. Respondents favored questions that focus on how customers typically use their applications. We also saw opposition to questions that assess the performance of individual employees or compare them to one another. Our categorization and catalog of 145 questions will help researchers, practitioners, and educators focus their efforts more easily on topics that are important to the software industry.
This technical report was published at the ICSE 2014 conference. For the definitive version, please refer to the conference proceedings:
The data appendix for this paper is here: