Abstract

We show in an empirical study of 3241 Red Hat packages that software vulnerabilities correlate with dependencies between packages. With formal concept analysis and statistical hypothesis testing, we identify dependencies that decrease the risk of vulnerabilities (beauties) or increase the risk (beasts). Using support vector machines on dependency data, our prediction models successfully and consistently catch about two thirds of vulnerable packages (median recall of 0.65). When our models predict a package as vulnerable, it is correct more than eight out of ten times (median precision of 0.83). Our findings help developers to choose new dependencies wisely and make them aware of risky dependencies.