The value of interpretable machine learning in wind power prediction: an empirical study using Shapley additive explanations to interpret a complex wind power prediction model

Tenfjord, Ulrik Slinning; Strand, Thomas Vågø

Master thesis

Permanent link

https://hdl.handle.net/11250/2770073

Publication date

2020

Collections

Master Thesis [4372]

Abstract

The main objective of this thesis is to evaluate whether interpretable machine learning provides valuable insight into TrønderEnergi's wind power prediction models. Interpretable machine learning provides explanations at different levels, so the analysis is divided into three sections based on the scope of the explanations: global, local, and grouped. Global explanations seek to interpret the whole model at once, local explanations aim to explain individual observations, and grouped explanations aim to uncover observations with similar explanation structures. To quantify these explanations, we use Shapley Additive Explanations (SHAP). This approach takes a complex machine learning model and estimates a separate explanation model, from which each feature's marginal contribution to the predicted output is estimated.
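
The idea of a feature's marginal contribution can be illustrated with a minimal sketch that computes exact Shapley values for a two-feature toy function. This is not the thesis's model or the SHAP library: the function toy_power, the input values, and the single-baseline treatment of "absent" features are all assumptions made for illustration. In practice, SHAP implementations approximate these values and typically average over background data rather than using one baseline point.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation x, relative to a baseline point."""
    n = len(x)
    features = list(range(n))

    def value(subset):
        # Features in `subset` take their observed value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                # Shapley weight for a coalition of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                s = set(subset)
                # Marginal contribution of feature i when added to the coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_power(z):
    # Hypothetical toy model: output grows with the cube of wind speed and is
    # scaled by a direction-dependent factor. Not the model used in the thesis.
    wind_speed, direction_factor = z
    return wind_speed ** 3 * direction_factor


x = [2.0, 0.5]         # observation to explain (made-up values)
baseline = [1.0, 1.0]  # reference point (made-up values)
phi = shapley_values(toy_power, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for x and the prediction for the baseline, which is the additive property that gives SHAP its name.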

The global analysis shows that wind speed is the biggest contributor to the prediction, while wind direction contributes to a lesser degree. However, the SHAP dependence plot for wind direction shows why wind direction is an important feature in wind power prediction. When wind direction is included as a feature, the random forest model seems to take speed-up effects and wake effects into account.

In the local explanations, we examine the observation with the highest prediction error and the one with the highest imbalance cost. Inaccurate wind speed forecasts seem to be the cause of the first observation's large prediction error. For the observation with the highest imbalance cost, an underestimation of the real production and a large spread between the spot price and the RK-price seem to be the main contributors.

In the cluster analysis, we see that when the Numerical Weather Prediction (NWP) models predict different wind speeds for the same observation, the model tends to perform worse in terms of RMSE. Observations where the NWP models all predict either high or low wind speeds perform significantly better, with an RMSE less than half as large.

We also discuss how these three explanation frameworks can be used to gain business benefits. We find that there are many potential benefits, of which the most prominent concern legal aspects, control, and trust.