Portfolio Modelling: Man vs Machine : An empirical study of machine learning efficiency and variable selection in portfolio modelling on the Oslo Stock Exchange
- Master Thesis 
We investigated and compared the performance of machine learning methods in the context of empirical asset pricing. We used seven different algorithms and 83 firm characteristics, comparing the models’ monthly predictive accuracy and variable importance on Norwegian stock and accounting data. Additionally, we investigated the models’ ability to generate excess returns in monthly-rebalanced long-short and long-only portfolios. We found that the XGBoost algorithm has the highest prediction accuracy, at 53.16%, and that it weights momentum variables more heavily. Furthermore, we found excess risk-adjusted returns when constructing portfolios free of market frictions. A long-only portfolio built on predictions from the XGBoost model outperformed the index by around 0.5% per month, on average, in the out-of-sample period. When accounting for the market frictions an institutional investor might encounter, the returns diminish to the point of significantly underperforming the index. When presenting a strategy that a retail investor could implement, we found excess returns. The XGBoost model’s net returns outperformed the index by 0.16% and 0.67% over the period, after excluding the largest 25% and 50% of firms, respectively. Upon investigating the explanation for this possible market inefficiency, we found that the returns are largely driven by highly illiquid stocks. We suggest that these returns are likely unattainable because of the high degree of illiquidity, and that they therefore cannot be arbitraged away in the way we would expect the market to do when it discovers an inefficiency. We call this phenomenon "rainbow-returns", as the returns are observable but likely unattainable. Our findings support the efficient market hypothesis, in that one cannot beat the market using publicly available information, and add to the existing literature in the emerging field of empirical asset pricing through machine learning.
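The portfolio construction summarised above can be illustrated with a minimal sketch. It assumes that each stock receives a model score for the coming month, that the long leg holds the highest-scored stocks and the short leg the lowest-scored ones, and that both legs are equal-weighted. The function name `portfolio_returns`, the `fraction` parameter, and the numbers are hypothetical and are not taken from the thesis, which may sort, weight, and rebalance differently.

```python
from typing import Dict, Tuple


def portfolio_returns(
    scores: Dict[str, float],
    realized: Dict[str, float],
    fraction: float = 0.2,
) -> Tuple[float, float]:
    """Return (long_only, long_short) equal-weighted returns for one month.

    Assumption: `scores` holds a model's predicted score per ticker and
    `realized` holds that ticker's realized return over the same month.
    """
    # Rank tickers from the highest to the lowest predicted score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Number of stocks in each leg; always hold at least one.
    n = max(1, int(len(ranked) * fraction))
    longs, shorts = ranked[:n], ranked[-n:]
    # Equal-weighted average return of each leg, ignoring market frictions.
    long_ret = sum(realized[t] for t in longs) / n
    short_ret = sum(realized[t] for t in shorts) / n
    return long_ret, long_ret - short_ret


# Made-up numbers for illustration only; not data from the thesis.
scores = {"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.3, "E": 0.1}
realized = {"A": 0.04, "B": 0.01, "C": 0.00, "D": -0.01, "E": -0.02}
long_only, long_short = portfolio_returns(scores, realized, fraction=0.4)
```

Repeating this calculation every month with fresh scores gives a monthly-rebalanced return series. Transaction costs and liquidity constraints, which the abstract identifies as decisive, are deliberately left out of this sketch.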