smooth_coef: Smooth coefficients in the model

Description

Apply smoothing to the risk factors used in the model. smooth_coef() must always be followed by update_glm(), which refits the model with the smoothing imposed.

Usage

smooth_coef(
  model,
  x_cut,
  x_org,
  degree = NULL,
  breaks = NULL,
  smoothing = "spline",
  k = NULL,
  weights = NULL
)

Value

Object of class smooth

Arguments

model

object of class glm/smooth

x_cut

name of the column containing the breaks (cut intervals)

x_org

name of the column on which x_cut is based

degree

order of polynomial

breaks

numerical vector with new clusters for x

smoothing

smoothing specification to use, chosen from the options below. All the shape constrained smooth terms (SCOP-splines) are constructed using the B-spline basis proposed by Eilers and Marx (1996), with a discrete penalty on the basis coefficients:

'spline' (default)
'mpi': monotone increasing SCOP-splines
'mpd': monotone decreasing SCOP-splines
'cx': convex SCOP-splines
'cv': concave SCOP-splines
'micx': increasing and convex SCOP-splines
'micv': increasing and concave SCOP-splines
'mdcx': decreasing and convex SCOP-splines
'mdcv': decreasing and concave SCOP-splines
'gam': spline based smooth (thin plate regression spline)

k

number of basis functions to be computed

weights

weights used for smoothing, must be equal to the exposure (defaults to NULL)

Author

Martin Haringa

Details

Although smoothing could be applied either to the frequency or the severity model, it is more appropriate to impose the smoothing on the premium model. This can be achieved by calculating the pure premium for each record (i.e. expected number of claims times the expected claim amount), then fitting an "unrestricted" Gamma GLM to the pure premium, and then imposing the restrictions in a final "restricted" Gamma GLM.
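The first two steps described above can be sketched in base R on a small simulated portfolio. This is only an illustration under stated assumptions: the data are simulated (not MTPL), and the names toy, age_cat, avg_amount, pred_freq and pred_sev are hypothetical and not part of the package. The final "restricted" step would then be carried out with smooth_coef() followed by update_glm().

```r
# Simulated toy portfolio; all column names here are illustrative assumptions
set.seed(1)
n <- 500
toy <- data.frame(
  age_cat  = factor(sample(c("18-30", "31-50", "51+"), n, replace = TRUE)),
  exposure = runif(n, 0.5, 1)
)
toy$nclaims    <- rpois(n, lambda = 0.3 * toy$exposure)
toy$avg_amount <- rgamma(n, shape = 2, rate = 0.002)  # average cost per claim

# Frequency model (Poisson) and severity model (Gamma, fitted on claims only)
freq <- glm(nclaims ~ age_cat + offset(log(exposure)),
            family = poisson(), data = toy)
sev <- glm(avg_amount ~ age_cat, weights = nclaims,
           family = Gamma(link = "log"), data = subset(toy, nclaims > 0))

# Pure premium per record: expected number of claims (per unit of exposure)
# times expected claim amount
toy$pred_freq <- predict(freq, newdata = transform(toy, exposure = 1),
                         type = "response")
toy$pred_sev  <- predict(sev, newdata = toy, type = "response")
toy$premium   <- toy$pred_freq * toy$pred_sev

# "Unrestricted" Gamma GLM fitted to the pure premium
burn_unrestricted <- glm(premium ~ age_cat, weights = exposure,
                         family = Gamma(link = "log"), data = toy)
coef(burn_unrestricted)
```

Because both underlying models use a log link, the pure premium is multiplicative in the risk factors, so the unrestricted Gamma GLM reproduces it; the restrictions imposed afterwards are what change the rating factors.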

Examples

if (FALSE) {
library(insurancerating)
library(dplyr)

# Fit GAM for claim frequency
age_policyholder_frequency <- fit_gam(data = MTPL,
                                      nclaims = nclaims,
                                      x = age_policyholder,
                                      exposure = exposure)

# Determine clusters
clusters_freq <- construct_tariff_classes(age_policyholder_frequency)

# Add clusters to MTPL portfolio
dat <- MTPL |>
  mutate(age_policyholder_freq_cat = clusters_freq$tariff_classes) |>
  mutate(across(where(is.character), as.factor)) |>
  mutate(across(where(is.factor), ~biggest_reference(., exposure)))

# Fit frequency and severity model
freq <- glm(nclaims ~ bm + age_policyholder_freq_cat,
            offset = log(exposure),
            family = poisson(),
            data = dat)
sev <- glm(amount ~ bm + zip,
           weights = nclaims,
           family = Gamma(link = "log"),
           data = dat |> filter(amount > 0))

# Add predictions for freq and sev to data, and calculate premium
premium_df <- dat |>
  add_prediction(freq, sev) |>
  mutate(premium = pred_nclaims_freq * pred_amount_sev)

# Fit unrestricted model
burn_unrestricted <- glm(premium ~ zip + bm + age_policyholder_freq_cat,
                         weights = exposure,
                         family = Gamma(link = "log"),
                         data = premium_df)

# Impose smoothing and create figure
burn_unrestricted |>
  smooth_coef(x_cut = "age_policyholder_freq_cat",
              x_org = "age_policyholder",
              breaks = seq(18, 95, 5)) |>
  autoplot()

# Impose smoothing and refit model
burn_restricted <- burn_unrestricted |>
  smooth_coef(x_cut = "age_policyholder_freq_cat",
              x_org = "age_policyholder",
              breaks = seq(18, 95, 5)) |>
  update_glm()

# Show new rating factors
rating_factors(burn_restricted)
}
