Likelihood of Arrests for Violent Crime Incidents in America: An exploratory study using logistic regression and random forest methods
- Master Thesis 
The use of policing algorithms to predict arrest is rising in America. However, research indicates that these algorithms may be biased against certain populations. Perceptions of who commits these crimes, and of who is affected by them, are also skewed by the media. Hence, it is important to understand which demographic and situational characteristics of a violent crime incident affect the likelihood of arrest. In this thesis, I predict arrest in incidents of violent crime as reported in the 2014 National Incident-Based Reporting System (NIBRS). The outcome of arrest was predicted using two classification methods: logistic regression and random forest. The models built for the aggregate of all violent crime, as well as for the subsets of offense types, had good predictive power, with an accuracy greater than 50%. Additionally, adjusted models were built to address class imbalance, and these leveraged cross-validation methods. The likelihood of arrest was ascertained using odds ratios from the logistic regression results and variable importance plots from the random forest. The results indicate that the likelihood of arrest generally increases when the offender is white, when the victim is white, when the offender is female (for aggravated assault incidents), and when the victim is female. Generally, the likelihood of arrest decreases as the age of the offender increases, and increases as the age of the victim increases. The likelihood of arrest decreases for incidents where the offender is armed with a deadly weapon, and where the offender and victim are strangers. Additionally, the likelihood of arrest increases for all violent crimes if the incident takes place at night rather than during the day, and in incidents where the offender is using substances.
The results show that media perceptions and predictive policing algorithms are skewed. These typically represent black individuals as more dangerous and more likely to be incarcerated than white individuals. However, the results from this thesis show the opposite relationship. Additionally, this thesis shows that variables such as time of day, substance use, and the ages of the victim and offender can be leveraged to make more powerful predictions of the likelihood of arrest.
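
For readers unfamiliar with how an odds ratio is read off a logistic regression, the following is a minimal sketch and not the thesis's actual code or data. It assumes a single hypothetical binary predictor (for example, "incident occurred at night") and uses made-up counts rather than NIBRS records. It fits a one-variable logistic regression by plain gradient ascent and checks that the exponentiated coefficient matches the odds ratio computed directly from the 2x2 table of counts.

```python
import math

def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by batch gradient ascent
    on the average log-likelihood. Returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient w.r.t. the intercept
            g1 += (y - p) * x    # gradient w.r.t. the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Synthetic counts for illustration only (NOT NIBRS data):
# key = (predictor value, arrest outcome), value = number of incidents.
counts = {(1, 1): 30, (1, 0): 20, (0, 1): 20, (0, 0): 30}

# Expand the table into one row per incident.
xs, ys = [], []
for (x, y), c in counts.items():
    xs += [x] * c
    ys += [y] * c

b0, b1 = fit_logistic(xs, ys)

# Odds ratio from the model: exp of the predictor's coefficient.
odds_ratio = math.exp(b1)

# Odds ratio straight from the table: (a*d) / (b*c).
table_or = (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])

print(f"model odds ratio: {odds_ratio:.4f}")
print(f"table odds ratio: {table_or:.4f}")
```

An odds ratio above 1 means the odds of arrest are higher when the predictor is present, and a value below 1 means they are lower. With several predictors, as in the thesis, each exponentiated coefficient is read the same way but holds the other variables fixed, so it no longer equals a simple table ratio.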