Machine Learning for Predicting Voluntary Audits: How do loan and risk factors influence small private commercial U.S. banks’ decision to get no audit?
- Master Thesis 
Commercial banks and other financial institutions are essential to the modern economy, and government agencies and regulators strive to identify and counteract risks in banking institutions. With an emphasis on loan and risk-based factors, this thesis explores what influences small private commercial U.S. banks' decision to get no voluntary external audit. Using bank regulatory data spanning 10 years from 2010 to 2020, we predict audit choice using four machine learning algorithms for classification: logistic regression, LASSO, random forest, and LightGBM. The models make use of 16 specially selected independent features. This thesis evaluates the machine learning algorithms on several performance metrics (accuracy, specificity, precision, recall, and F1) and studies the feature importance measured by each model. To verify the results, the thesis uses two methods of feature selection: ANOVA and Mutual Information. Our findings suggest that the proportion of agricultural loans to total loans is an important factor in predicting audit choice. Bank size and asset quality are also important factors in the banks' audit decisions. The tree-based models perform best, with random forest considered the best overall. Random forest predicts with a high level of accuracy, which suggests that the relationship between audit choice and the bank data is nonlinear.
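To make the five performance metrics named in the abstract concrete, the following is a minimal sketch of how they can be computed from binary predictions. It is not taken from the thesis: the function names, the toy labels, and the choice of coding "no audit" as the positive class (1) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    tn: int  # predicted negative, actually negative
    fn: int  # predicted negative, actually positive


def count_outcomes(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Accuracy, specificity, precision, recall, and F1 from the counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "specificity": safe_div(c.tn, c.tn + c.fp),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical toy labels (assumption): 1 = bank gets no audit, 0 = bank is audited.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

counts = count_outcomes(y_true, y_pred)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting specificity alongside recall matters when the two classes are unevenly represented, because accuracy alone can look high for a model that rarely predicts the minority class.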