Man vs. Machine: An applied study comparing a man-made lexicon, a machine learned lexicon, and OpenAI's GPT for sentiment analysis.

2023

Sentiment analysis at scale has become an essential tool in the methodological toolbox of
finance. In this thesis, we construct a sentiment lexicon using the supervised machine learning
model of Taddy (2013) and compare it to the traditional finance lexicon of Loughran and
McDonald (2011). Additionally, we introduce a state-of-the-art natural language processing
model from OpenAI's GPT family to challenge both of these classical lexicon-based approaches.
Using unbalanced panel data regressions, we compare the three approaches in a "horse race".
First, we find that textual sentiment significantly explains stock returns. Second, we find
that GPT outperforms both lexical approaches in terms of economic and statistical
significance, with an adjusted R² of 3.9% versus 2.5% for the machine-learned lexicon and 2.2%
for the Loughran and McDonald lexicon. Third, we find that fine-tuning GPT models to detect
sentiment significantly improves their performance. Finally, we find that the best currently
available GPT model for financial sentiment analysis is GPT-3.5-Turbo.