Man vs. Machine: An applied study comparing a man-made lexicon, a machine learned lexicon, and OpenAI's GPT for sentiment analysis.
- Master Thesis 
Sentiment analysis at scale has become an essential tool in the methodological toolbox of finance. In this thesis, we construct a sentiment lexicon using the supervised machine learning model of Taddy (2013) and compare it to the traditional finance lexicon of Loughran and McDonald (2011). Additionally, we introduce a state-of-the-art natural language processing model from OpenAI's GPT family to challenge both of these classical lexical approaches. Using unbalanced panel data regressions, we compare the three approaches in a "horse race". First, we find that textual sentiment significantly explains stock returns. Second, we find that GPT outperforms both lexical approaches in terms of economic and statistical significance, with an adjusted R² of 3.9% versus 2.5% for the machine-learned lexicon and 2.2% for the Loughran and McDonald lexicon. Third, we find that fine-tuning GPT models to detect sentiment increases performance significantly. Lastly, we find that the best currently available model for financial sentiment analysis in the GPT model library is GPT-3.5-Turbo.
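
For readers unfamiliar with lexical sentiment analysis, the general idea can be sketched as follows. This is a minimal illustration only: the word lists are hypothetical toy examples (not the Loughran and McDonald lists or the lexicon constructed in the thesis), and the net-tone formula is one common convention that the thesis may or may not use.

```python
import re

# Hypothetical toy word lists, for illustration only. They are NOT the
# Loughran and McDonald (2011) lists or the lexicon built in the thesis.
POSITIVE = {"gain", "improve", "strong", "profit"}
NEGATIVE = {"loss", "decline", "weak", "impairment"}


def net_tone(text):
    """Return (positive count - negative count) / total token count."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


score = net_tone("Strong profit growth offset a small loss.")
print(score)
```

A score of this kind, computed per document, is the type of variable that could then enter a return regression as the sentiment measure.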