Test suites for software products vary in size from hundreds of test cases for small programs to millions for a large product such as Microsoft Windows. One recurring problem in the product life cycle is selecting a diverse set of test cases and eliminating duplicates. This becomes critical during maintenance, when changes to an already deployed product must be re-tested thoroughly but typically cannot be validated against the full test suite because of resource and time constraints. The problem is compounded when the product has evolved over many years and the test suite still contains poorly understood legacy tests from earlier versions. In addition, maintenance may be carried out by a different group of people than the original development team. Together, these factors create an environment in which the body of tests is too large, and the understanding of individual tests too incomplete, for redundant test cases to be eliminated with reasonable accuracy by manual analysis.

Our work aims to identify redundant test cases automatically using statistical test suite clustering. To achieve this goal, we introduce a set of carefully chosen test case distance metrics that quantitatively describe the similarity of any pair of test cases in an existing test suite and allow test cases to be clustered into similarity buckets. We combine program profiles collected from test execution with static analysis of executables, and apply statistical techniques to describe all important aspects of test execution. We show that automatic test redundancy detection is achievable with an acceptable risk of missing future defects.
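To make the idea of distance-based similarity buckets concrete, the following is a minimal sketch and not the metrics or clustering method used in this work. It assumes that each test's execution profile is reduced to the set of functions it covers, uses Jaccard distance as a stand-in metric, and groups tests with single-link clustering under a fixed threshold. The test names, function names, and the `jaccard_distance` and `cluster_tests` helpers are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty profiles count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_tests(profiles, threshold):
    """Group tests into buckets with single-link clustering (union-find).

    Two tests end up in the same bucket if they are connected by a chain
    of pairs whose distance is at most `threshold`.
    """
    parent = {name: name for name in profiles}

    def find(x):
        # Follow parent links to the bucket representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for t1, t2 in combinations(profiles, 2):
        if jaccard_distance(profiles[t1], profiles[t2]) <= threshold:
            parent[find(t1)] = find(t2)

    buckets = {}
    for name in profiles:
        buckets.setdefault(find(name), []).append(name)
    return sorted(sorted(bucket) for bucket in buckets.values())


# Hypothetical coverage profiles: test name -> set of covered functions.
profiles = {
    "test_login_ok": {"auth.check", "db.read", "ui.render"},
    "test_login_ok_copy": {"auth.check", "db.read", "ui.render"},
    "test_login_bad_pw": {"auth.check", "db.read", "ui.error"},
    "test_export_csv": {"export.run", "fs.write"},
}

strict_buckets = cluster_tests(profiles, threshold=0.25)
loose_buckets = cluster_tests(profiles, threshold=0.5)
```

With the strict threshold only the exact duplicate pair shares a bucket, while the looser threshold also merges the failed-login test into it. This illustrates the trade-off between removing more tests and the risk of missing future defects. A single coverage-based metric is only one ingredient; the approach described above combines several metrics drawn from execution profiles and static analysis.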