dc.contributor.advisor: Andersson, Jonas
dc.contributor.author: Jensen, Henriette Nøkleberg
dc.contributor.author: Rafn, Mathias William
dc.date.accessioned: 2024-02-08T14:33:11Z
dc.date.available: 2024-02-08T14:33:11Z
dc.date.issued: 2018
dc.identifier.uri: https://hdl.handle.net/11250/3116438
dc.description.abstract: The aim of this thesis is to determine whether the prediction accuracy of a model can be improved by using a data-driven method to bin continuous variables and group the levels of categorical variables. We use data on the policyholders of one of Gjensidige's insurance products to perform our analysis, and specifically aim to improve Gjensidige's Poisson regression model for predicting claim frequency, where the predictors are binned and grouped manually today. We analyze the effect of using a regularization framework that combines the Lasso method and generalizations of the method that have been adapted to nominal and ordinal predictors. These generalizations constrain coefficients and the differences between them, effectively fusing and selecting predictor levels. By optimizing the resulting objective function in R using the newly developed smurf package (Reynkens, Devriendt & Antonio, 2018), we estimate a penalized Poisson regression model. We reestimate a Poisson regression model using the selected and fused predictor levels as input in order to reduce the bias of the estimates. The resulting model is compared with the model Gjensidige currently uses for predicting claim frequency, to determine the effect of using the data-driven approach. We validate the performance of the prediction models using MSE and AIC as performance measures and find that our reestimated model performs slightly better in terms of prediction accuracy, in addition to reducing the number of parameters used in the model. We conclude that regularization can be used as a data-driven method of binning and grouping predictor levels to improve prediction accuracy. [en_US]
dc.language.iso: eng [en_US]
dc.subject: business analytics [en_US]
dc.subject: business analysis [en_US]
dc.subject: performance management [en_US]
dc.title: Method for Fusing Predictor Levels with Application to Insurance Data [en_US]
dc.type: Master thesis [en_US]
dc.description.localcode: nhhmas [en_US]
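
The fusing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code and not the smurf API (smurf is an R package). It assumes a single ordinal predictor with one coefficient per level, a log link with an exposure offset, and a penalty on differences between adjacent levels only. All function names and the tolerance are hypothetical.

```python
import math

def poisson_nll(y, exposure, level, beta):
    """Negative Poisson log-likelihood, dropping the constant log(y!) term.

    y[i] is the claim count, exposure[i] the exposure, and level[i] the
    index of the ordinal level for observation i. beta holds one
    coefficient per level (illustrative parameterization, no intercept).
    """
    total = 0.0
    for yi, ei, li in zip(y, exposure, level):
        mu = ei * math.exp(beta[li])  # expected claim count
        total += mu - yi * math.log(mu)
    return total

def fused_lasso_penalty(beta, lam):
    """Penalize absolute differences between adjacent level coefficients."""
    return lam * sum(abs(b1 - b0) for b0, b1 in zip(beta, beta[1:]))

def objective(y, exposure, level, beta, lam):
    """Penalized objective: negative log-likelihood plus fusion penalty."""
    return poisson_nll(y, exposure, level, beta) + fused_lasso_penalty(beta, lam)

def fuse_levels(beta, tol=1e-8):
    """Group adjacent levels whose coefficients coincide within tol.

    Each group is one bin; a plain Poisson model can then be re-estimated
    on the bins, as the abstract describes.
    """
    groups = [[0]]
    for j in range(1, len(beta)):
        if abs(beta[j] - beta[j - 1]) <= tol:
            groups[-1].append(j)
        else:
            groups.append([j])
    return groups
```

For example, a coefficient vector whose first two and next three entries coincide yields three bins from `fuse_levels`. A nominal predictor would instead penalize differences between all pairs of levels, which this sketch does not cover.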

