Microsoft Research Blog


Stopping Bugs Before They Sneak into Software

July 15, 2014 | Posted by Microsoft Research Blog

Posted by Rob Knies

Software development is not for the faint-hearted. Programmers often work long hours, typing code while staring at computer monitors. Computer software can include millions of lines of code, so given the nature and the volume of the work involved, mistakes are unavoidable.

Those mistakes, known in tech circles as “bugs,” can have serious consequences for customers. Eliminating coding bugs is well-nigh impossible, but for software companies, reducing their numbers by any reasonable means is a high priority.

Now, Microsoft researcher Andrew Begel and a few academic and industrial colleagues are trying a novel approach to reduce coding errors: tracking the eye movements and other mental and physical characteristics of developers as they work. He will be discussing his work on July 15 during the second day of the 15th annual Microsoft Research Faculty Summit, in an afternoon session called "And how does that make you feel?"

Existing work to analyze the causes of bugs has focused on detecting correlations between bug fixes and code after the bugs are detected. But in the paper Using Psycho-Physiological Measures to Assess Task Difficulty in Software Development, which Begel wrote with Thomas Fritz, Sebastian C. Müller, and Manuela Züger of the University of Zurich, and Serap Yigit-Elliott of the engineering and scientific consulting firm Exponent, the researchers suggest a new approach: detect when developers are struggling as they work, so that bugs can be prevented before they are introduced.

Classifying Difficulty

The research, presented in Hyderabad, India, on June 5 during the 36th International Conference on Software Engineering, aims to classify the difficulty of code tasks using data from psycho-physiological sensors.

Begel, who also served as the co-program chair of the related International Conference on Program Comprehension, explains.

“A research field called Mining Software Repositories,” he says, “looks for correlations between software-process metrics and bugs. For example, code that is edited often is more likely to have a bug in it than infrequently edited code. But this kind of result isn’t actionable—if you stop editing code, you will stop causing bugs—and have no software to ship!

“My idea is that if the software developers are writing the code and causing the bugs, we should measure attributes of the developers themselves. If we can figure out what cognitive or emotional issues lead to buggy code or lowered productivity, we can try to intervene and stop them from causing developers to make mistakes in the first place.”

The paper poses three questions about whether psycho-physiological measurements can indicate if a code-comprehension task is perceived as easy or difficult:

  • Can we acquire psycho-physiological measures from eye-tracking technology, electrodermal-activity [EDA] sensors, and electroencephalogram [EEG] sensors to make an accurate prediction of whether a task is difficult or easy?
  • Which combination of psycho-physiological sensors and associated features best predicts task difficulty?
  • Can we use psycho-physiological measures to predict task difficulty as the developer is working?

EDA measures changes in the skin’s ability to conduct electricity, while EEG evaluates electrical activity in the brain.

The researchers conducted a study of 15 professional developers to see how well this approach can predict whether developers will find a task difficult. The results were encouraging: For new developers, meaning developers whose data the classifier had not been trained on, task difficulty could be predicted with a precision of nearly 65 percent. For new tasks, the number was even higher: almost 85 percent.
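To make the "new developer" figure concrete, here is a minimal Python sketch of how such a number can be computed. It is not the researchers' classifier or data: the `eda_peaks` feature, the threshold rule, and the six sample rows are all invented for illustration. It shows only the two ideas behind the result. Precision is the share of tasks flagged as difficult that really were difficult, and a "new developer" evaluation holds out all of one developer's data while training on everyone else's.

```python
from collections import defaultdict

# Hypothetical, made-up measurements: one sensor feature per task.
SAMPLES = [
    {"developer": "A", "eda_peaks": 2, "label": "easy"},
    {"developer": "A", "eda_peaks": 9, "label": "difficult"},
    {"developer": "B", "eda_peaks": 3, "label": "easy"},
    {"developer": "B", "eda_peaks": 8, "label": "difficult"},
    {"developer": "C", "eda_peaks": 6, "label": "easy"},
    {"developer": "C", "eda_peaks": 10, "label": "difficult"},
]


def precision(y_true, y_pred, positive="difficult"):
    """Fraction of tasks predicted `positive` that really were `positive`."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not flagged:
        return 0.0
    return sum(1 for t in flagged if t == positive) / len(flagged)


def fit_threshold(train):
    """Toy classifier (an assumption, not the paper's model): use the
    midpoint between the mean feature value of easy and difficult tasks."""
    easy = [s["eda_peaks"] for s in train if s["label"] == "easy"]
    hard = [s["eda_peaks"] for s in train if s["label"] == "difficult"]
    return (sum(easy) / len(easy) + sum(hard) / len(hard)) / 2


def evaluate(samples):
    """Leave-one-developer-out: each developer is tested with a threshold
    learned only from the other developers' data."""
    by_dev = defaultdict(list)
    for s in samples:
        by_dev[s["developer"]].append(s)

    y_true, y_pred = [], []
    for held_out in sorted(by_dev):
        train = [s for d, rows in by_dev.items() if d != held_out for s in rows]
        threshold = fit_threshold(train)
        for s in by_dev[held_out]:
            y_true.append(s["label"])
            y_pred.append("difficult" if s["eda_peaks"] > threshold else "easy")
    return precision(y_true, y_pred)


if __name__ == "__main__":
    print(f"precision on unseen developers: {evaluate(SAMPLES):.2f}")
```

In this toy data, developer C's "easy" task looks difficult by the standards of developers A and B, which is the kind of person-to-person variation that makes predicting for an unseen developer harder than predicting for an unseen task.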

What isn’t known yet is how developers will react if their actions are approaching bug-potential levels and an intervention is deemed necessary.

“We haven’t tried any interventions yet,” Begel says, “but one I’ve thought about can help absent-minded developers, such as those who just came back from lunch and aren’t paying much attention to their code. If we reduce the contrast on the display and make the fonts harder to read, the developer will be forced to apply more brainpower to read and understand the code and will be less likely to slip up as a result.”

Significant work remains before such techniques can be deployed on a broad scale, but the promising results the researchers outline in their paper bring the community closer to a reliable way to measure the difficulty of software-engineering tasks, and that could help drive the next generation of tools to support overtaxed programmers.

“We’re still at the experimental stage, learning to understand what all these sensors are telling us about the software developer,” Begel says. “If we can successfully learn a pattern that produces appropriate interventions at the right times, then the proof will be in the utility of the resulting tool.”