Data Science Summer School 2014 – An Empirical Analysis of Stop-and-Frisk in New York City


August 12, 2014


Siobhan Wilmot-Dunbar, Derek Sanz, Md. Afzal Hossain, and Khanna Pugach


Pace University, CUNY Brooklyn College, New York City College of Technology, Bernard M. Baruch College


Between 2006 and 2012, the New York City Police Department made roughly four million stops as part of the city’s controversial stop-and-frisk program. We empirically study two aspects of the program by analyzing a large public dataset, released by the police department, that records all documented stops in the city. First, by comparing these records with block-level census data, we estimate stop rates for various demographic subgroups of the population. In particular, we find, somewhat remarkably, that the average annual number of stops of young black men exceeds the number of such individuals in the general population. This disparity is even more pronounced when we account for geography: in certain neighborhoods, the number of stops of young black men is several times greater than their number in the local population. Second, we statistically analyze the reasons that officers record for making each stop (e.g., “furtive movements” or “sights and sounds of criminal activity”). By comparing which stated reasons best predict whether a suspect is ultimately arrested, we develop simple heuristics to aid officers in making better stop decisions. We believe our results will help both the general population and the police department better understand the burden of stop-and-frisk on certain subgroups of the population, and that the guidelines we have developed will help improve stop-and-frisk programs in New York City and across the country.
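As a rough illustration of the two calculations described above, the following Python sketch computes an average annual stop rate per capita for each subgroup and an arrest rate for each stated reason. It is only a minimal sketch under stated assumptions, not the analysis actually performed: the records, group labels, and population counts are invented toy values (not drawn from the NYPD or census data), and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only:
# (year, demographic group, stated reason, arrested?)
stops = [
    (2011, "group_a", "furtive movements", False),
    (2011, "group_a", "furtive movements", False),
    (2011, "group_a", "sights and sounds of criminal activity", True),
    (2012, "group_a", "furtive movements", False),
    (2011, "group_b", "sights and sounds of criminal activity", True),
    (2012, "group_b", "furtive movements", False),
]

# Hypothetical census counts for the same groups (also invented).
population = {"group_a": 2, "group_b": 10}


def annual_stop_rate(stops, population):
    """Average stops per year divided by group population."""
    years = {year for year, *_ in stops}
    counts = Counter(group for _, group, _, _ in stops)
    return {g: counts[g] / len(years) / population[g] for g in population}


def arrest_rate_by_reason(stops):
    """Fraction of stops with each stated reason that ended in arrest."""
    total = Counter()
    arrests = Counter()
    for _, _, reason, arrested in stops:
        total[reason] += 1
        arrests[reason] += int(arrested)
    return {reason: arrests[reason] / total[reason] for reason in total}


stop_rates = annual_stop_rate(stops, population)
arrest_rates = arrest_rate_by_reason(stops)
```

A stop rate of 1.0 or more for a group means that its average annual number of stops is at least as large as its population, which is the kind of comparison the first finding rests on. The per-reason arrest rate is only the simplest possible measure of how well a stated reason predicts arrest.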



Siobhan Wilmot-Dunbar is a junior at Pace University studying Computer Science and minoring in Digital Design. She is also pursuing a five-year program through which she intends to graduate with a master's degree in Information Systems as well. Meanwhile, Siobhan is part of Seidenberg Creative Labs, a web development and research group at Pace, and has coded in Java, HTML, and CSS. She is looking forward to learning as much as she can about Python and processing large data sets during her time at the Microsoft DS3 program. She also plays piano, acoustic guitar, and steel drums, and has high hopes of one day combining her ability in computing with her love for music.

Derek Sanz. I was born in 1993 to Dominican parents in Brooklyn, New York. In 2011 I entered Brooklyn College, took the introductory computer science course, and have been in a non-stop frenzy of hard work and love for learning ever since. 2013 was a year of fun: 9 computer science courses, a one-month study abroad trip to China, one internship, and one fellowship. 2014 will have a lot to do with how the early years of my computer science career pan out. Overloading on courses in hopes of graduating early in December, along with this opportunity at MSR, will keep me busy while I make up my mind as to whether I will enter industry or continue my studies in graduate school.

Md. Afzal Hossain is a junior at New York City College of Technology. He has an associate degree in Computer Science and is now studying Applied Mathematics in Finance. His goal is to study data science in graduate school.

My name is Khanna Pugach. I’m a junior at Baruch College majoring in Computer Information Systems with a Math minor. I am also an international student from Russia, and this is my third year in the US. I like Nora Ephron’s books and tennis 🙂