Dissecting Poisson-based prediction models in association football: A comprehensive look at methodology, assumptions, and accuracy using data from the main European leagues (2011–2022)
Abstract
As access to broader and better data grows, data analytics, statistical modeling,
and data science more generally attract ever-growing interest in sports, including
association football. It is no secret that clubs and even the sport's governing bodies
implement data-driven strategies to gain insights and a competitive
advantage in play. Recognizing the importance of the sport both as fans and as
analysts, we seek to contribute to the current body of literature by
offering a thorough investigation of one of the most elegant approaches to sports
analytics in association football: the Poisson goal model. Based on the simple and
intuitive idea that goals in football are rare, discrete events that follow a Poisson
distribution conditional on team performance, the concept has appealed to
many researchers. Yet although the idea is simple at its core, its application to real-world
data has prompted much discussion regarding underlying assumptions and
methodology. Much of the discussion in the 40 years since the idea was formalized
concerns assumptions such as the applicability of the Poisson distribution,
score interdependence, overdispersion, and parameter stability. In the present work, we
take a step back and reexamine the idea, methodology, and assumptions in the light of
the most recent data from Europe’s major leagues. Furthermore, we examine some novel
concepts, such as the use of xG (expected goals). Overall, some changing dynamics are
revealed, and some of the propositions made for the model do not hold given recent
developments in the sport.
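
As a minimal illustration of the core idea summarized above, the sketch below treats home and away goals as independent Poisson variables and sums scoreline probabilities into win, draw, and loss probabilities. The independence assumption is the simplest form of the model and is precisely one of the assumptions under discussion; the function names, the scoring rates, and the truncation at 10 goals are hypothetical choices made for illustration and are not taken from the study.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow Poisson(rate)."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumes home and away goals are independent Poisson variables.
    Scorelines above max_goals per team are ignored, so the three
    probabilities sum to slightly less than 1.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Illustrative rates only: 1.5 expected home goals, 1.1 expected away goals.
home_win, draw, away_win = outcome_probabilities(1.5, 1.1)
print(f"home {home_win:.3f}, draw {draw:.3f}, away {away_win:.3f}")
```

In practice the two rates would be estimated from data, for example from team attack and defense strengths, rather than fixed by hand as they are here.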