Portfolio Modelling: Man vs Machine : An empirical study of machine learning efficiency and variable selection in portfolio modelling on the Oslo Stock Exchange
Master thesis
Permanent link
https://hdl.handle.net/11250/3012638
Publication date
2022
Collections
- Master Thesis [4379]
Abstract
We investigated and compared the performance of machine learning methods in the context
of empirical asset pricing. We used seven different algorithms and 83 firm characteristics,
comparing the models’ monthly predictive accuracy and variable importance on Norwegian
stock and accounting data. Additionally, we investigated the models’ ability to generate
excess returns in monthly-rebalanced, long-short and long-only portfolios.
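As a rough illustration of the portfolio construction mentioned above, the following minimal sketch forms one month's long-only and long-short returns from model predictions. The function name, the equal weighting, and the `fraction` parameter are assumptions made for illustration; the abstract does not state how the portfolios were weighted or how many stocks each leg held.

```python
from statistics import mean


def portfolio_returns(predicted, realized, fraction=0.1):
    """Return (long_only, long_short) equal-weighted returns for one month.

    predicted, realized: dicts mapping ticker -> return for the same month.
    fraction: share of stocks held in each leg (an illustrative assumption).
    """
    # Rank stocks from highest to lowest predicted return.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    # Hold at least one stock in each leg.
    n = max(1, int(len(ranked) * fraction))
    long_leg = mean(realized[t] for t in ranked[:n])
    short_leg = mean(realized[t] for t in ranked[-n:])
    # Long-only earns the long leg; long-short earns the spread.
    return long_leg, long_leg - short_leg
```

Repeating this calculation every month with fresh predictions would correspond to monthly rebalancing.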
We found that the XGBoost algorithm achieved the highest prediction accuracy, at 53.16%,
and that it more heavily weights momentum variables. Furthermore, we found excess
risk-adjusted returns when constructing portfolios free of market frictions. A long-only
portfolio with predictions from the XGBoost model outperformed the index, on average,
by around 0.5% each month in the out-of-sample period. When accounting for the market
frictions an institutional investor might encounter, the returns diminish to the point
of significantly underperforming the index. For a strategy that a retail investor could
implement, we found excess returns. The XGBoost model’s net returns outperformed the
index by 0.16% and 0.67% over the period, after excluding the largest 25% and 50% of firms,
respectively. Upon investigating the explanation for this possible market inefficiency, we
found that the returns are largely driven by highly illiquid stocks. We suggest that these
returns are likely unattainable because of the high degree of illiquidity, and therefore
could be impossible to arbitrage away in the way we would expect the market to do when
it discovers an inefficiency. We call this phenomenon "rainbow-returns", as they are likely
observable but unattainable.
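The effect of market frictions on returns can be illustrated with a simple sketch that deducts a proportional trading cost from a gross monthly return. The cost model, the function names, and the default `cost_rate` are hypothetical and chosen only for illustration; the abstract does not specify how frictions were modelled.

```python
def turnover(old_weights, new_weights):
    """Sum of absolute weight changes between two rebalancing dates."""
    tickers = set(old_weights) | set(new_weights)
    return sum(
        abs(new_weights.get(t, 0.0) - old_weights.get(t, 0.0)) for t in tickers
    )


def net_return(gross_return, old_weights, new_weights, cost_rate=0.002):
    """Gross return minus a proportional cost on traded weight.

    cost_rate is a hypothetical cost per unit of turnover, not a figure
    taken from the thesis.
    """
    return gross_return - cost_rate * turnover(old_weights, new_weights)
```

Under this kind of model, a strategy with high monthly turnover in illiquid stocks, where trading costs are typically higher, can lose most or all of its gross excess return.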
Our findings support the efficient market hypothesis, in that one cannot beat the market
using publicly available information, and add to the existing literature in the emerging field of
empirical asset pricing through machine learning.