Correcting Witness Reports through Machine Learning : An empirical study of machine learning applied to incident reports
- Master Thesis 
In this thesis we investigate the possibility of using machine learning models to correct witness testimony. Using data from the National Incident-Based Reporting System, we train a model on arrest data to predict the race of offenders and compare the model's predictions with witness guesses in non-arrest incidents. We find that witness reports are erroneous in 16.17% of incidents, and that the error in witness reports leads to an expected yearly police cost of $8.2 million for the crimes of burglary, robbery, assault, rape, and homicide. We suggest several ways the machine learning model can be used to correct witness reports. First, the model predictions can be used directly to correct reports. For instance, values can be imputed for unknown offenders, and labels on which the model and the witness disagree can be replaced with model predictions. We find that witness error can be reduced to 8.77% if all labels are replaced with model predictions, saving $4.5 million in yearly police cost. An alternative to consider is combining witness guesses with model predictions to improve predictive accuracy. The model predictions can also be used indirectly, as an alarm tool that identifies reports with a possible error. Reports labelled as likely to be erroneous can then be investigated by humans. Finally, the model can be used to correct the confidence of the eyewitness identification, either by i) comparing the eyewitness prediction to a continuous prediction made by an accurate model, or ii) quantifying the amount of expected error in the testimony.
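The direct and indirect uses described in the abstract (imputing unknown labels, replacing labels on disagreement, and raising an alarm for human review) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Report` class, the `correct_report` function, the placeholder categories "A" and "B", and the 0.8 alarm threshold are all hypothetical names and values chosen for illustration, and the model's class probabilities are assumed to be given.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Report:
    # Witness guess; None when the offender is recorded as unknown.
    witness_label: Optional[str]
    # Hypothetical model output: category -> predicted probability.
    model_probs: Dict[str, float]


def correct_report(report: Report, alarm_threshold: float = 0.8) -> Tuple[str, str]:
    """Return (label, status) for one report.

    The threshold is an illustrative assumption, not a value from the thesis.
    """
    # The model's most probable category and its probability.
    model_label = max(report.model_probs, key=report.model_probs.get)
    confidence = report.model_probs[model_label]

    if report.witness_label is None:
        # Direct use: impute a value for an unknown offender.
        return model_label, "imputed"
    if report.witness_label == model_label:
        # Witness and model agree; nothing to correct.
        return report.witness_label, "agree"
    if confidence >= alarm_threshold:
        # Indirect use: suggest the model label and flag for human review.
        return model_label, "flagged"
    # Disagreement, but the model is not confident enough to raise an alarm.
    return report.witness_label, "kept"
```

For example, `correct_report(Report("B", {"A": 0.9, "B": 0.1}))` suggests the model's label and flags the report, while a disagreement with a model probability of 0.55 leaves the witness label in place. Replacing every label with the model prediction, the scenario the abstract evaluates, would correspond to always returning `model_label`.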