evmix (version 1.0)

kdengpdcon: Kernel Density Estimation Using Normal Kernel and GPD Tail Extreme Value Mixture Model with Single Continuity Constraint

Description

Density, cumulative distribution function, quantile function and random number generation for the extreme value mixture model with kernel density estimation using a normal kernel for the bulk distribution up to the threshold and a conditional GPD above the threshold, constrained to be continuous at the threshold. The parameters are the bandwidth lambda, threshold u, GPD shape xi and tail fraction phiu.

Usage

dkdengpdcon(x, kerncentres, lambda = NULL,
    u = as.vector(quantile(kerncentres, 0.9)), xi = 0,
    phiu = TRUE, log = FALSE)

pkdengpdcon(q, kerncentres, lambda = NULL,
    u = as.vector(quantile(kerncentres, 0.9)), xi = 0,
    phiu = TRUE, lower.tail = TRUE)

qkdengpdcon(p, kerncentres, lambda = NULL,
    u = as.vector(quantile(kerncentres, 0.9)), xi = 0,
    phiu = TRUE, lower.tail = TRUE)

rkdengpdcon(n = 1, kerncentres, lambda = NULL,
    u = as.vector(quantile(kerncentres, 0.9)), xi = 0,
    phiu = TRUE)

Arguments

x
quantile
kerncentres
kernel centres (typically sample data)
lambda
bandwidth for normal kernel (standard deviation of normal)
u
threshold
xi
shape parameter
phiu
probability of being above threshold, in [0, 1], or TRUE to use the tail fraction implied by the KDE bulk model
log
logical, if TRUE then log density
q
quantile
lower.tail
logical, if FALSE then upper tail probabilities
p
cumulative probability
n
sample size (non-negative integer)

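Value

dkdengpdcon gives the density, pkdengpdcon gives the cumulative distribution function, qkdengpdcon gives the quantile function and rkdengpdcon gives a random sample.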

Details

Extreme value mixture model combining kernel density estimation using a normal kernel for the bulk below the threshold and a GPD for the upper tail, with the density constrained to be continuous at the threshold.

The user can pre-specify phiu, permitting a parameterised value for the tail fraction $\phi_u$. Alternatively, when phiu=TRUE the tail fraction is estimated as the tail fraction from the KDE bulk model.

The cumulative distribution function with tail fraction $\phi_u$ defined by the upper tail fraction of the kernel density estimate (phiu=TRUE), up to the threshold $x \le u$, is given by: $$F(x) = H(x)$$ and above the threshold $x > u$: $$F(x) = H(u) + [1 - H(u)] G(x)$$ where $H(x)$ and $G(x)$ are the kernel and conditional GPD cumulative distribution functions (i.e. mean(pnorm(x, kerncentres, lambda)) and pgpd(x, u, sigmau, xi)).

The cumulative distribution function for pre-specified $\phi_u$, up to the threshold $x \le u$, is given by: $$F(x) = (1 - \phi_u) H(x)/H(u)$$ and above the threshold $x > u$: $$F(x) = (1 - \phi_u) + \phi_u G(x)$$ Notice that these definitions are equivalent when $\phi_u = 1 - H(u)$.

The continuity constraint means that $(1 - \phi_u) h(u)/H(u) = \phi_u g(u)$ where $h(x)$ and $g(x)$ are the KDE and conditional GPD density functions. Since $g(u) = 1/\sigma_u$, the resulting GPD scale parameter is: $$\sigma_u = \frac{\phi_u H(u)}{(1 - \phi_u) h(u)}$$ In the special case where the tail fraction is defined by the bulk model this reduces to $$\sigma_u = [1 - H(u)] / h(u)$$

See gpd for details of the GPD upper tail component and dkden for details of the KDE bulk component.
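The construction can be checked numerically with a short base-R sketch. It is not the evmix implementation: the helper names (H, h, pgpd_sketch, dgpd_sketch, tail_fraction, sigmau_sketch, dmix_sketch, pmix_sketch) are hypothetical, the bandwidth from bw.nrd0 is an assumed choice, and the tail CDF is taken as (1 - phiu) + phiu G(x) so that F(u) = 1 - phiu.

```r
# Base-R sketch of the mixture formulas; NOT the evmix implementation.
set.seed(1)
kerncentres <- rnorm(500)                     # illustrative data
lambda <- bw.nrd0(kerncentres)                # assumed bandwidth choice
u  <- as.vector(quantile(kerncentres, 0.9))   # threshold
xi <- 0.1                                     # GPD shape

# KDE cdf H(x) and density h(x) with a normal kernel
H <- function(x) vapply(x, function(x_) mean(pnorm(x_, kerncentres, lambda)), numeric(1))
h <- function(x) vapply(x, function(x_) mean(dnorm(x_, kerncentres, lambda)), numeric(1))

# Conditional GPD cdf and density, intended for x >= u only
pgpd_sketch <- function(x, u, sigmau, xi) {
  z <- (x - u) / sigmau
  if (xi == 0) 1 - exp(-z) else 1 - pmax(1 + xi * z, 0)^(-1 / xi)
}
dgpd_sketch <- function(x, u, sigmau, xi) {
  z <- (x - u) / sigmau
  if (xi == 0) exp(-z) / sigmau else pmax(1 + xi * z, 0)^(-1 / xi - 1) / sigmau
}

# phiu = TRUE means "take the tail fraction from the KDE bulk", i.e. 1 - H(u)
tail_fraction <- function(u, phiu = TRUE) if (isTRUE(phiu)) 1 - H(u) else phiu

# GPD scale implied by the continuity constraint
sigmau_sketch <- function(u, phiu = TRUE) {
  phi <- tail_fraction(u, phiu)
  phi * H(u) / ((1 - phi) * h(u))
}

# Mixture density: rescaled KDE below u, phi-weighted GPD above u
dmix_sketch <- function(x, u, xi, phiu = TRUE) {
  phi <- tail_fraction(u, phiu)
  sigmau <- sigmau_sketch(u, phiu)
  d <- numeric(length(x))
  below <- x <= u
  d[below]  <- (1 - phi) * h(x[below]) / H(u)
  d[!below] <- phi * dgpd_sketch(x[!below], u, sigmau, xi)
  d
}

# Mixture cdf, same split at the threshold
pmix_sketch <- function(q, u, xi, phiu = TRUE) {
  phi <- tail_fraction(u, phiu)
  sigmau <- sigmau_sketch(u, phiu)
  p <- numeric(length(q))
  below <- q <= u
  p[below]  <- (1 - phi) * H(q[below]) / H(u)
  p[!below] <- (1 - phi) + phi * pgpd_sketch(q[!below], u, sigmau, xi)
  p
}

# Density at u and just above u, for both ways of setting the tail fraction
eps <- 1e-8
print(dmix_sketch(c(u, u + eps), u, xi))               # phiu = TRUE
print(dmix_sketch(c(u, u + eps), u, xi, phiu = 0.2))   # pre-specified phiu

# The bulk-based scale should match [1 - H(u)] / h(u)
print(all.equal(sigmau_sketch(u), (1 - H(u)) / h(u)))
```

In each printed pair the two density values should agree up to the small step eps, which is the continuity constraint in action. Using a single expression for the scale in both cases makes the reduction to [1 - H(u)] / h(u) visible rather than coding it as a separate branch.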

References

http://en.wikipedia.org/wiki/Normal_distribution
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.

See Also

kdengpd, kden, gpd and dnorm

Other kdengpdcon: fkdengpdcon, lkdengpdcon, nlkdengpdcon

Examples

library(evmix)  # provides the *kdengpdcon functions

par(mfrow = c(2, 2))

# Density with the tail fraction taken from the KDE bulk (phiu = TRUE)
kerncentres = rnorm(500, 0, 1)
xx = seq(-4, 4, 0.01)
hist(kerncentres, breaks = 100, freq = FALSE)
lines(xx, dkdengpdcon(xx, kerncentres, u = 1.2, xi = 0.1))

# Cumulative distribution function for three GPD shape parameters
plot(xx, pkdengpdcon(xx, kerncentres), type = "l")
lines(xx, pkdengpdcon(xx, kerncentres, xi = 0.3), col = "red")
lines(xx, pkdengpdcon(xx, kerncentres, xi = -0.3), col = "blue")
legend("topleft", paste("xi =", c(0, 0.3, -0.3)),
      col = c("black", "red", "blue"), lty = 1, cex = 0.5)

# Simulated sample, overlaid with the density it was generated from
kerncentres = rnorm(1000, 0, 1)
x = rkdengpdcon(1000, kerncentres, phiu = 0.1, u = 1.2, xi = 0.1)
xx = seq(-4, 6, 0.01)
hist(x, breaks = 100, freq = FALSE, xlim = c(-4, 6))
lines(xx, dkdengpdcon(xx, kerncentres, phiu = 0.1, u = 1.2, xi = 0.1))

# Density for three shape parameters with a pre-specified tail fraction
plot(xx, dkdengpdcon(xx, kerncentres, xi = 0, phiu = 0.2), type = "l")
lines(xx, dkdengpdcon(xx, kerncentres, xi = -0.2, phiu = 0.2), col = "red")
lines(xx, dkdengpdcon(xx, kerncentres, xi = 0.2, phiu = 0.2), col = "blue")
# Legend labels follow the plotting order: black, red, blue
legend("topleft", c("xi = 0", "xi = -0.2", "xi = 0.2"),
      col = c("black", "red", "blue"), lty = 1)
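The quantile function is not exercised in the examples, so the following base-R sketch shows one way the inversion could work for a pre-specified tail fraction. It is an illustration under stated assumptions, not the code behind qkdengpdcon: the names H, h, pmix_sketch and qmix_sketch are hypothetical, the bandwidth comes from bw.nrd0, the tail CDF is taken as (1 - phiu) + phiu G(x), and probabilities are assumed to lie strictly between 0 and 1.

```r
# Base-R sketch of quantile inversion; NOT the evmix implementation.
set.seed(2)
kerncentres <- rnorm(500)            # illustrative data
lambda <- bw.nrd0(kerncentres)       # assumed bandwidth choice
u <- 1.2; xi <- 0.1; phiu <- 0.1     # threshold, GPD shape, tail fraction

H <- function(x) mean(pnorm(x, kerncentres, lambda))   # KDE cdf, scalar x
h <- function(x) mean(dnorm(x, kerncentres, lambda))   # KDE density, scalar x

# GPD scale implied by continuity of the density at u
sigmau <- phiu * H(u) / ((1 - phiu) * h(u))

# Mixture cdf, used here only to check the inversion
pmix_sketch <- function(q) vapply(q, function(q_) {
  if (q_ <= u) {
    (1 - phiu) * H(q_) / H(u)
  } else {
    z <- (q_ - u) / sigmau
    G <- if (xi == 0) 1 - exp(-z) else 1 - max(1 + xi * z, 0)^(-1 / xi)
    (1 - phiu) + phiu * G
  }
}, numeric(1))

# Mixture quantile for probabilities strictly inside (0, 1)
qmix_sketch <- function(p) vapply(p, function(p_) {
  if (p_ > 1 - phiu) {
    # Tail: closed-form inverse of the conditional GPD cdf
    pg <- (p_ - (1 - phiu)) / phiu
    if (xi == 0) u - sigmau * log(1 - pg)
    else u + sigmau * ((1 - pg)^(-xi) - 1) / xi
  } else {
    # Bulk: solve H(x) = H(u) * p / (1 - phiu) numerically
    target <- H(u) * (p_ / (1 - phiu))
    lower <- min(kerncentres) - 10 * lambda
    uniroot(function(x) H(x) - target, lower = lower, upper = u,
            tol = 1e-10)$root
  }
}, numeric(1))

# Round trip: applying the cdf to the quantiles should recover p
p <- c(0.05, 0.5, 0.85, 0.95, 0.99)
q <- qmix_sketch(p)
print(max(abs(pmix_sketch(q) - p)))

# Inverse-transform sampling; the share above u should be close to phiu
x <- qmix_sketch(runif(500))
print(mean(x > u))
```

The bulk has no closed-form inverse because H is an average of normal CDFs, which is why the sketch falls back on uniroot there, while the GPD tail inverts analytically.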
