Delta Boosting Implementation of Negative Binomial Regression in Actuarial Pricing

This study proposes an effective approach to analyzing over-dispersed insurance frequency data, since insurers need decisive, informative insights to underwrite and price insurance products precisely, retain their existing customer base, and gain an edge in the highly...

Bibliographic Details
Main Author: Simon CK Lee
Format: Article
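Language: English
Published: MDPI AG 2020-02-01
Series: Risks
Subjects: boosting trees; gradient boosting; predictive modeling; insurance; machine learning; negative binomial
Online Access: https://www.mdpi.com/2227-9091/8/1/19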
author Simon CK Lee
collection DOAJ
description This study proposes an effective approach to analyzing over-dispersed insurance frequency data, since insurers need decisive, informative insights to underwrite and price insurance products precisely, retain their existing customer base, and gain an edge in the highly competitive retail insurance market. The delta boosting implementation of negative binomial regression, by both one-parameter estimation and a novel two-parameter estimation, was tested on empirical data. Accurate parameter estimation for negative binomial regression is complicated by incomplete insurance exposures, negative convexity, and collinearity. These issues originate mainly from the unique nature of insurance operations and the adoption of a distribution outside the exponential family. We studied how these issues could significantly affect the quality of estimation. In addition to a novel approach for simultaneously estimating two parameters in regression through boosting, we further enrich the study by proposing an alteration of the base algorithm to address the problems. The algorithm held its own against popular regression methodologies on a real-life dataset. Common diagnostics were applied to compare the performance of the relevant candidates, leading to our conclusion to move from the light-tailed Poisson to the negative binomial for over-dispersed data, from the generalized linear model (GLM) to boosting for non-linear and interaction patterns, and from one-parameter to two-parameter estimation to reflect reality more closely.
first_indexed 2024-12-13T07:03:23Z
format Article
id doaj.art-d3af0d8095374203b167beceaee516b3
institution Directory Open Access Journal
issn 2227-9091
language English
last_indexed 2024-12-13T07:03:23Z
publishDate 2020-02-01
publisher MDPI AG
record_format Article
series Risks
spelling doaj.art-d3af0d8095374203b167beceaee516b3; 2022-12-21T23:55:52Z; eng; MDPI AG; Risks; 2227-9091; 2020-02-01; 8; 1; 19; 10.3390/risks8010019; risks8010019; Delta Boosting Implementation of Negative Binomial Regression in Actuarial Pricing; Simon CK Lee (Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong); https://www.mdpi.com/2227-9091/8/1/19
title Delta Boosting Implementation of Negative Binomial Regression in Actuarial Pricing
topic boosting trees
gradient boosting
predictive modeling
insurance
machine learning
negative binomial
url https://www.mdpi.com/2227-9091/8/1/19