SpliceFitPareto: Splicing of mixed Erlang and Pareto

Description

Fit a spliced distribution consisting of a mixed Erlang (ME) distribution and one or more Pareto distributions. The shape parameter of each Pareto distribution is determined using the Hill estimator.
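As a rough illustration of the Hill estimator mentioned above, the following base R sketch computes it from the k largest observations of a positive sample. The helper name hill_estimate and the choice k = 400 are hypothetical and only serve the example; this is not the package's internal code.

```r
# Hill estimator of the extreme value index gamma, based on the k largest
# observations of a positive sample x (hypothetical helper, base R only).
hill_estimate <- function(x, k) {
  stopifnot(is.numeric(x), all(x > 0), k >= 1, k < length(x))
  xs <- sort(x)
  n <- length(xs)
  # Average log-excess of the k largest points over the (k+1)-th largest
  mean(log(xs[(n - k + 1):n])) - log(xs[n - k])
}

set.seed(1)
# Inverse-transform sample from a Pareto distribution with shape 2, scale 1
x <- runif(1000)^(-1 / 2)

gamma_hat <- hill_estimate(x, k = 400)  # estimate of gamma = 1 / shape
shape_hat <- 1 / gamma_hat              # implied Pareto shape
```

For an exact Pareto sample like this one, gamma_hat should land near 0.5, so shape_hat should be near 2.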

Usage

SpliceFitPareto(X, const = NULL, tsplice = NULL, M = 3, s = 1:10, trunclower = 0, 
                truncupper = Inf, EVTtruncation = FALSE, ncores = NULL, 
                criterium = c("BIC","AIC"), reduceM = TRUE,
                eps = 10^(-3), beta_tol = 10^(-5), maxiter = Inf)
              
SpliceFitHill(X, const = NULL, tsplice = NULL, M = 3, s = 1:10, trunclower = 0, 
              truncupper = Inf, EVTtruncation = FALSE, ncores = NULL,
              criterium = c("BIC","AIC"), reduceM = TRUE,
              eps = 10^(-3), beta_tol = 10^(-5), maxiter = Inf)

Value

A SpliceFit object.

Arguments

X: Data used for fitting the distribution.
const: Vector of length \(l\) containing the probabilities of the quantiles where the distributions will be spliced (splicing points). The ME distribution will be spliced with \(l\) Pareto distributions. Default is NULL, meaning the input from tsplice is used.
tsplice: Vector of length \(l\) containing the splicing points. The ME distribution will be spliced with \(l\) Pareto distributions. Default is NULL, meaning the input from const is used.
M: Initial number of Erlang mixtures, default is 3. This number can change when determining an optimal mixed Erlang fit using an information criterion.
s: Vector of spread factors for the EM algorithm, default is 1:10. We loop over these factors when determining an optimal mixed Erlang fit using an information criterion, see Verbelen et al. (2016).
trunclower: Lower truncation point. Default is 0.
truncupper: Upper truncation point. Default is Inf (no upper truncation). When truncupper=Inf and EVTtruncation=TRUE, the truncation point is estimated using the approach of Beirlant et al. (2016).
EVTtruncation: Logical indicating if the \(l\)-th Pareto distribution is a truncated Pareto distribution. Default is FALSE.
ncores: Number of cores to use when determining an optimal mixed Erlang fit using an information criterion. When NULL (default), max(nc-1,1) cores are used where nc is the number of cores as determined by detectCores.
criterium: Information criterion used to select the number of components of the ME fit and s. One of "AIC" and "BIC" (default).
reduceM: Logical indicating if M should be reduced based on the information criterion, default is TRUE.
eps: Convergence threshold used in the EM algorithm (ME part). Default is 10^(-3).
beta_tol: Threshold for the mixing weights below which the corresponding shape parameter vector is considered negligible (ME part). Default is 10^(-5).
maxiter: Maximum number of iterations in a single EM algorithm execution (ME part). Default is Inf meaning no maximum number of iterations.
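The relation between const and tsplice can be sketched in base R as follows. The sketch assumes that a probability in const is read as an empirical quantile level of X; the exact rule the package uses to turn const into a splicing point may differ, and the variable names are hypothetical.

```r
set.seed(1)
# Stand-in data: inverse-transform sample from a Pareto with shape 2
X <- runif(1000)^(-1 / 2)

const <- 0.6
# Candidate splicing point: the empirical 60% quantile of the data
t_candidate <- quantile(X, probs = const, names = FALSE)

# Share of observations at or below the candidate splicing point
frac_below <- mean(X <= t_candidate)
# Number of observations above it, i.e. those available for the Pareto part
n_tail <- sum(X > t_candidate)
```

Supplying const = 0.6 or supplying tsplice equal to a value like t_candidate are then two ways of describing roughly the same split of the data between the ME part and the Pareto part.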

Author

Tom Reynkens with R code from Roel Verbelen for fitting the mixed Erlang distribution.

Details

See Reynkens et al. (2017), Section 4.3.1 of Albrecher et al. (2017) and Verbelen et al. (2015) for details. The code follows the notation of the latter. Initial values follow from Verbelen et al. (2016).
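To give an idea of what a spliced distribution looks like, here is a minimal base R sketch of a spliced CDF with a single splicing point t. It assumes no lower truncation and no upper truncation, and it uses a single Erlang distribution as a stand-in for the fitted ME body. The function p_splice_sketch and all parameter values are hypothetical; the fitted object should be evaluated with pSplice instead.

```r
# Sketch of a spliced CDF with one splicing point t: the body distribution is
# truncated to (0, t] and carries probability const; a Pareto tail with
# extreme value index gamma carries the remaining 1 - const above t.
p_splice_sketch <- function(x, const, t, gamma, body_cdf) {
  body <- const * body_cdf(x) / body_cdf(t)
  # pmax() keeps the tail formula well defined for x below t
  tail <- const + (1 - const) * (1 - (pmax(x, t) / t)^(-1 / gamma))
  ifelse(x <= t, body, tail)
}

# Stand-in body: a single Erlang (gamma with integer shape), not a fitted mixture
erlang_cdf <- function(x) pgamma(x, shape = 2, scale = 0.5)

x <- seq(0, 20, 0.01)
Fx <- p_splice_sketch(x, const = 0.6, t = 1.5, gamma = 0.5,
                      body_cdf = erlang_cdf)
```

By construction the CDF equals const at the splicing point, so the two parts join continuously there.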

SpliceFitHill is the same function as SpliceFitPareto under a different name, kept for compatibility with older versions of the package.

Use SpliceFiticPareto when censoring is present.

References

Albrecher, H., Beirlant, J. and Teugels, J. (2017). Reinsurance: Actuarial and Statistical Aspects, Wiley, Chichester.

Beirlant, J., Fraga Alves, M.I. and Gomes, M.I. (2016). "Tail fitting for Truncated and Non-truncated Pareto-type Distributions." Extremes, 19, 429--462.

Reynkens, T., Verbelen, R., Beirlant, J. and Antonio, K. (2017). "Modelling Censored Losses Using Splicing: a Global Fit Strategy With Mixed Erlang and Extreme Value Distributions." Insurance: Mathematics and Economics, 77, 65--77.

Verbelen, R., Gong, L., Antonio, K., Badescu, A. and Lin, S. (2015). "Fitting Mixtures of Erlangs to Censored and Truncated Data Using the EM Algorithm." Astin Bulletin, 45, 729--758.

Verbelen, R., Antonio, K. and Claeskens, G. (2016). "Multivariate Mixtures of Erlangs for Density Estimation Under Censoring." Lifetime Data Analysis, 22, 429--455.

Examples

if (FALSE) {

# Load the package that provides rpareto and the splicing functions
library(ReIns)

# Pareto random sample
X <- rpareto(1000, shape = 2)

# Splice ME and Pareto, with the splicing point given as a quantile probability
splicefit <- SpliceFitPareto(X, const = 0.6)



x <- seq(0, 20, 0.01)

# Plot of spliced CDF
plot(x, pSplice(x, splicefit), type="l", xlab="x", ylab="F(x)")

# Plot of spliced PDF
plot(x, dSplice(x, splicefit), type="l", xlab="x", ylab="f(x)")



# Fitted survival function and empirical survival function 
SpliceECDF(x, X, splicefit)

# Log-log plot with empirical survival function and fitted survival function
SpliceLL(x, X, splicefit)

# PP-plot of empirical survival function and fitted survival function
SplicePP(X, splicefit)

# PP-plot of empirical survival function and 
# fitted survival function with log-scales
SplicePP(X, splicefit, log=TRUE)

# Splicing QQ-plot
SpliceQQ(X, splicefit)
}