Sailing into the Storm? Utilizing machine learning to predict defaults in the Norwegian shipping sector
Abstract
In this thesis, we develop multi-year models to predict defaults in the Norwegian shipping
industry. Our primary objective is to create a model suited for Norwegian shipping companies
with high predictive accuracy of default. By incorporating shipping-specific and macroeconomic
variables in the model, we aim to better capture the dynamics of this highly volatile and
globally influenced industry. We use two machine learning techniques alongside the more
traditional logit method and investigate the difference in accuracy between them.
To further assess the performance of our models, we compare them with the SEBRA-model,
used by the Norwegian Central Bank to predict defaults of Norwegian companies. We base
our analysis on a dataset retrieved from the Norwegian Corporate Accounts which, after
thorough cleaning, contains 889 shipping companies, of which 19 defaulted.
Our best-performing model is the Random Forest, which yields an AUC of 87% when predicting
defaults one year in advance, a performance comparable to the original SEBRA-model. For
predictions two years in advance, the AUC falls to 76%. While the results from the other two models are
slightly inferior, they are both better than our replicated SEBRA-model. Further findings
indicate that oil price is the most important macro variable in our Random Forest model, a
variable neglected in earlier research. Our prediction model is intended to be used by investors,
banks, and other stakeholders involved in the Norwegian shipping industry. Although the
models yield a high AUC, they are estimated on an imbalanced dataset with few defaults,
a limitation that needs to be considered when using the models.