Data Science Summer School 2019 – Replicating “An Empirical Analysis of Racial Differences in Police Use of Force”


January 30, 2020


Emeka Mbazor, Roymil Terrero, Adnan Hoq, Harpreet Gaur, Etta Rapp, Cindy Muso, Naomi Moreira, Brenda Fried


Lehman College, St. Joseph’s College, St. Joseph’s College, CUNY City Tech, Yeshiva University, St. John’s University, St. Joseph’s College, Brooklyn College


This project replicates and extends a recent paper on racial bias in police use of force. We selected this paper because it is widely read and an ideal candidate for a data analysis replication: it uses relatively simple methodology that seems straightforward to implement and check, relies on two publicly available datasets, and is documented in more than 100 pages of main text and appendices. Despite this nearly ideal setting, the replication turned out to be much more complicated than expected and took several weeks, mainly because of how the original data were cleaned and featurized.

These challenges arose despite the extensive documentation in the paper and its appendix, but working through them also uncovered insights that might not have been obvious from simply reading the paper. We extended the paper’s results by adding map and census information and by running predictive checks on the paper’s underlying models.

In this talk we discuss the various challenges we faced in replicating the results and the insights that the replication revealed.