Machine learning in bankruptcy prediction: utilizing machine learning for improved bankruptcy predictions in the Norwegian market with an emphasis on financial, management and sector statements
Abstract
In this thesis, we create a new multi-year model for predicting bankruptcies in the Norwegian
market. Our emphasis is on utilizing all parts of the financial statements and related information,
rather than previously utilized ratios, to predict whether or not companies go bankrupt within
the next three years.
Our analysis is based on a database that stems from a collaboration of previous research from
the Norwegian School of Economics. After thorough cleaning, our final dataset contains 3,327,405
observations with 159 features related to financial, management and sector statements.
We perform our analysis using nine models, each based on a different machine learning
technique. For evaluation, we optimize our models toward the percentage of correct bankruptcy
predictions.
Our best model is Random Forest, which yields both an overall accuracy and a class-independent
accuracy of 78%: the model correctly predicts roughly four out of five bankrupt firms and four
out of five non-bankrupt firms ahead of time. The Neural Network and Mixture Discriminant
Analysis models perform slightly worse, and the remaining models perform worse still.
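To illustrate why both accuracy figures are reported, the following minimal sketch computes them from confusion-matrix counts. It assumes that "class independent accuracy" means the average of the per-class hit rates (often called balanced accuracy); the abstract does not define the term, so this reading is an assumption. The function name and all counts are hypothetical and are not taken from the thesis data.

```python
def accuracy_metrics(tp, fn, tn, fp):
    """Return (overall accuracy, class-independent accuracy).

    tp/fn: bankrupt firms predicted correctly / missed.
    tn/fp: non-bankrupt firms predicted correctly / wrongly flagged.
    Class-independent accuracy is ASSUMED to be the mean of the two
    per-class hit rates (balanced accuracy).
    """
    bankrupt_rate = tp / (tp + fn)        # share of bankrupt firms caught
    non_bankrupt_rate = tn / (tn + fp)    # share of healthy firms cleared
    overall = (tp + tn) / (tp + fn + tn + fp)
    class_independent = (bankrupt_rate + non_bankrupt_rate) / 2
    return overall, class_independent


# Hypothetical, heavily imbalanced sample: 100 bankrupt, 10,000 non-bankrupt.
# A model that gets about four out of five right in each class:
overall, balanced = accuracy_metrics(tp=78, fn=22, tn=7800, fp=2200)

# A trivial model that labels every firm as non-bankrupt:
naive_overall, naive_balanced = accuracy_metrics(tp=0, fn=100, tn=10000, fp=0)

print(f"model: overall={overall:.2f}, class-independent={balanced:.2f}")
print(f"naive: overall={naive_overall:.2f}, class-independent={naive_balanced:.2f}")
```

In this illustrative sample the trivial model reaches a high overall accuracy while identifying no bankruptcies, which the class-independent figure exposes. Reporting both numbers guards against that effect when bankruptcies are rare.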
Our Random Forest model outperforms other models built on a class distribution that is highly
imbalanced. Furthermore, other studies often use ratios as features, and we find that our
model assigns considerable importance to some of the individual components of their ratios,
in particular, components related to liquidity. We also find important features that past
ratio-focused research has neglected, such as cash flows, sector features and board features.