The Uniqueness of Changes: Characteristics and Applications

MSR-TR-2014-149

Published by Microsoft

Changes in software development come in many forms. Some
changes are frequent, idiomatic, or repetitive (e.g., adding null checks or
logging important values), while others are unique. We hypothesize that
unique changes differ from the more common, similar (non-unique)
changes in important ways; they may require more expertise or represent code
that is more complex or prone to mistakes. As such, these changes are worthy of
study. In this paper, we present a definition of unique changes and provide a
method for identifying them in software project history. Based on the results
of applying our technique to the Linux kernel and two large projects at
Microsoft, we present an empirical study of unique changes. We explore how
prevalent unique changes are and characterize where they occur within the
architecture of the project. We further investigate how developers contribute
to the uniqueness of changes. We also describe potential applications that
leverage the uniqueness of changes and implement two of them:
evaluating the risk of changes based on uniqueness and providing change
recommendations for non-unique changes.