
Density, cumulative distribution function, quantile function and
random number generation for the extreme value mixture model with
boundary corrected kernel density estimate for the bulk
distribution up to the threshold and conditional GPD above the threshold, with continuity at
the threshold. The parameters are the bandwidth lambda, threshold u, GPD shape xi
and tail fraction phiu.
dbckdengpdcon(x, kerncentres, lambda = NULL,
  u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", bcmethod = "simple", proper = TRUE,
  nn = "jf96", offset = NULL, xmax = NULL, log = FALSE)

pbckdengpdcon(q, kerncentres, lambda = NULL,
  u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", bcmethod = "simple", proper = TRUE,
  nn = "jf96", offset = NULL, xmax = NULL, lower.tail = TRUE)

qbckdengpdcon(p, kerncentres, lambda = NULL,
  u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", bcmethod = "simple", proper = TRUE,
  nn = "jf96", offset = NULL, xmax = NULL, lower.tail = TRUE)

rbckdengpdcon(n = 1, kerncentres, lambda = NULL,
  u = as.vector(quantile(kerncentres, 0.9)), xi = 0, phiu = TRUE,
  bw = NULL, kernel = "gaussian", bcmethod = "simple", proper = TRUE,
  nn = "jf96", offset = NULL, xmax = NULL)
x            quantiles
kerncentres  kernel centres (typically sample data vector or scalar)
lambda       bandwidth for kernel (as half-width of kernel) or NULL
u            threshold
xi           shape parameter
phiu         probability of being above threshold [0, 1] or TRUE
bw           bandwidth for kernel (as standard deviations of kernel) or NULL
kernel       kernel name (default = "gaussian")
bcmethod     boundary correction method
proper       logical, whether density is renormalised to integrate to unity (where needed)
nn           non-negativity correction method (simple boundary correction only)
offset       offset added to kernel centres (logtrans only) or NULL
xmax         upper bound on support (copula and beta kernels only) or NULL
log          logical, if TRUE then log density
q            quantiles
lower.tail   logical, if FALSE then upper tail probabilities
p            cumulative probabilities
n            sample size (positive integer)
dbckdengpdcon gives the density, pbckdengpdcon gives the cumulative distribution function,
qbckdengpdcon gives the quantile function and rbckdengpdcon gives a random sample.
See dbckden for details of BCKDE methods.
The "simple", "renorm", "beta1", "beta2", "gamma1" and "gamma2"
boundary correction methods may require renormalisation using
numerical integration, which can be very slow. In particular, the numerical integration
is extremely slow for kernel="uniform", because the adaptive quadrature in
the integrate function is particularly slow for functions with step-like behaviour.
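The cost of adaptive quadrature on step-like integrands can be seen with base R alone. The following is a minimal sketch, not code from this package: count_evals is a hypothetical helper that counts how many integrand evaluations integrate makes, and the step function is only a stand-in for the kind of discontinuity a uniform kernel produces.

```r
# Hypothetical helper: count integrand evaluations made by integrate().
count_evals <- function(f, lower, upper) {
  n_evals <- 0
  counted <- function(x) {
    n_evals <<- n_evals + length(x)  # integrate() passes a vector of nodes
    f(x)
  }
  # stop.on.error = FALSE so a convergence message does not abort the count
  res <- integrate(counted, lower, upper, stop.on.error = FALSE)
  list(value = res$value, evals = n_evals)
}

# Smooth integrand versus an integrand with a jump at x = 1/3
smooth  <- count_evals(function(x) exp(-x), 0, 1)
stepped <- count_evals(function(x) as.numeric(x < 1/3), 0, 1)

smooth$evals
stepped$evals
```

The stepped integrand forces repeated subdivision around the jump, so it needs more evaluations for the same tolerance. Inside a renormalised density this cost is paid every time the normalising constant is computed.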
Based on code by Anna MacDonald produced for MATLAB.
Extreme value mixture model combining boundary corrected kernel density (BCKDE) estimate for the bulk below the threshold and GPD for upper tail with continuity at threshold. The user chooses from a wide range of boundary correction methods designed to cope with a lower bound at zero and potentially also both upper and lower bounds.
Some boundary correction methods require a secondary correction for negative density estimates, for which two methods are implemented. Further, some methods don't necessarily give a density which integrates to one, so an option is provided to renormalise to be proper.
It assumes there is a lower bound at zero, so prior transformation of data is required for an alternative lower bound (possibly including negation to allow for an upper bound).
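The prior transformation can be sketched in base R as follows. The bounds a = 5 and b = 10 and the simulated data are assumptions chosen purely for illustration.

```r
set.seed(1)

# Data with a known lower bound at a = 5 (assumed value): shift to zero
a <- 5
x_lower <- a + rgamma(100, shape = 2, scale = 1)
y_lower <- x_lower - a

# Data with a known upper bound at b = 10 (assumed value): negate and shift,
# so the upper bound becomes a lower bound at zero
b <- 10
x_upper <- b - rgamma(100, shape = 2, scale = 1)
y_upper <- b - x_upper

range(y_lower)
range(y_upper)

# Results on the transformed scale map back through the inverse transform
back_lower <- y_lower + a
back_upper <- b - y_upper
```

Note that after negation the upper tail of the transformed data corresponds to the lower tail of the original data, so the GPD component then describes the original lower tail.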
The user can pre-specify phiu, permitting a parameterised value for the
tail fraction. Alternatively, when phiu=TRUE the tail fraction
is estimated as the tail fraction from the BCKDE bulk model.
The alternate bandwidth definitions are discussed in the kernels help page,
with lambda as the default. The bw specification is the same as used in the
density function.
The possible kernels are also defined in kernels, with "gaussian"
as the default choice.
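The cumulative distribution function with tail fraction phiu defined by the upper tail fraction of the BCKDE (phiu=TRUE), up to the threshold 0 <= x <= u, is given by:
  F(x) = H(x)
and above the threshold x > u:
  F(x) = H(u) + [1 - H(u)] G(x)
where H(x) and G(x) are the BCKDE and conditional GPD cumulative distribution functions respectively.
The cumulative distribution function for pre-specified phiu, up to the threshold 0 <= x <= u, is given by:
  F(x) = (1 - phiu) H(x) / H(u)
and above the threshold x > u:
  F(x) = (1 - phiu) + phiu G(x)
The continuity constraint means that (1 - phiu) h(u) / H(u) = phiu g(u), where h(x) and g(x) are the BCKDE and conditional GPD density functions respectively. The resulting GPD scale parameter is then:
  sigmau = phiu H(u) / [(1 - phiu) h(u)]
In the special case where the tail fraction is defined by the bulk model, this reduces to:
  sigmau = [1 - H(u)] / h(u)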
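How the continuity constraint at the threshold determines the GPD scale can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes a reflection-corrected Gaussian kernel with lambda treated as the kernel standard deviation, a non-zero shape xi, and a hypothetical helper make_mixture that accepts either phiu = TRUE or a numeric tail fraction.

```r
set.seed(1)
centres <- rgamma(500, shape = 1, scale = 2)  # kernel centres
lambda  <- 0.5                                 # kernel sd (assumed meaning of lambda)
u       <- as.vector(quantile(centres, 0.9))   # threshold
xi      <- 0.3                                 # GPD shape; sketch assumes xi != 0

# Reflection-corrected Gaussian KDE on [0, Inf): density h(x) and cdf H(x)
h <- function(x) sapply(x, function(z)
  mean(dnorm(z, centres, lambda) + dnorm(z, -centres, lambda)))
H <- function(x) sapply(x, function(z)
  mean(pnorm(z, centres, lambda) + pnorm(z, -centres, lambda)) - 1)

# Hypothetical helper: build the mixture for phiu = TRUE or a numeric phiu
make_mixture <- function(phiu = TRUE) {
  Hu <- H(u)
  hu <- h(u)
  phi <- if (isTRUE(phiu)) 1 - Hu else phiu
  bulk_scale <- (1 - phi) / Hu       # equals 1 when phi = 1 - H(u)
  sigmau <- phi / (bulk_scale * hu)  # from bulk_scale * h(u) = phi / sigmau
  G <- function(x) 1 - pmax(1 + xi * (x - u) / sigmau, 0)^(-1 / xi)
  g <- function(x) pmax(1 + xi * (x - u) / sigmau, 0)^(-1 / xi - 1) / sigmau
  list(
    phiu = phi, sigmau = sigmau,
    pdf = function(x) ifelse(x <= u, bulk_scale * h(x), phi * g(x)),
    cdf = function(x) ifelse(x <= u, bulk_scale * H(x), (1 - phi) + phi * G(x))
  )
}

bulk_based <- make_mixture(TRUE)  # tail fraction taken from the bulk model
fixed_phiu <- make_mixture(0.1)   # pre-specified tail fraction

c(bulk_based$sigmau, fixed_phiu$sigmau)
c(bulk_based$pdf(u), bulk_based$pdf(u + 1e-8))  # density just below and above u
```

In both cases the scale is not a free parameter: it is implied by the bulk density and tail fraction at the threshold, which is why the functions take xi but no sigmau argument.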
Unlike the standard KDE, there is no general rule-of-thumb bandwidth for all the
BCKDE methods, with only certain methods having a guideline in the literature, so none
have been implemented. Hence, a bandwidth must always be specified and you should
consider using the fbckdengpdcon or fbckden function for cross-validation
MLE for bandwidth.
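The idea behind cross-validation likelihood bandwidth selection can be sketched in base R for a standard Gaussian KDE. This is an illustration under stated assumptions, not the fitting code of fbckden or fbckdengpdcon: there is no boundary correction, cv_loglik is a hypothetical helper, and the search interval is an arbitrary choice.

```r
set.seed(1)
x <- rgamma(200, shape = 2, scale = 1)

# Hypothetical helper: leave-one-out cross-validation log-likelihood
# for a Gaussian KDE with bandwidth lambda (kernel standard deviation)
cv_loglik <- function(lambda, x) {
  n <- length(x)
  k <- dnorm(outer(x, x, "-"), sd = lambda)  # kernel at all pairwise differences
  diag(k) <- 0                               # leave each point out of its own estimate
  sum(log(rowSums(k) / (n - 1)))
}

# Maximise over an assumed search interval
fit <- optimize(cv_loglik, interval = c(0.05, 3), x = x, maximum = TRUE)
fit$maximum    # selected bandwidth
fit$objective  # cross-validation log-likelihood at that bandwidth
```

Leaving each point out of its own density estimate is what prevents the likelihood from being maximised by a bandwidth shrinking to zero.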
See gpd for details of GPD upper tail component and
dbckden for details of BCKDE bulk component.
http://en.wikipedia.org/wiki/Kernel_density_estimation
http://en.wikipedia.org/wiki/Generalized_Pareto_distribution
Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf
Bowman, A.W. (1984). An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2), 353-360.
Duin, R.P.W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers C25(11), 1175-1179.
MacDonald, A., Scarrott, C.J., Lee, D., Darlow, B., Reale, M. and Russell, G. (2011). A flexible extreme value mixture model. Computational Statistics and Data Analysis 55(6), 2137-2157.
MacDonald, A., Scarrott, C.J. and Lee, D.S. (2011). Boundary correction, consistency and robustness of kernel densities using extreme value theory. Submitted. Available from: http://www.math.canterbury.ac.nz/~c.scarrott.
Wand, M. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
gpd, kernels, kfun, density, bw.nrd0 and dkde in ks package.
Other kden kdengpd kdengpdcon bckden bckdengpd bckdengpdcon
fkden fkdengpd fkdengpdcon fbckden fbckdengpd fbckdengpdcon: bckdengpd, bckden, fbckden,
fkden, kdengpdcon, kdengpd, kden
# NOT RUN {
set.seed(1)
par(mfrow = c(2, 2))

# Density with bulk-based tail fraction and reflection boundary correction
kerncentres = rgamma(500, shape = 1, scale = 2)
xx = seq(-0.1, 10, 0.01)
hist(kerncentres, breaks = 100, freq = FALSE)
lines(xx, dbckdengpdcon(xx, kerncentres, lambda = 0.5, bcmethod = "reflect"))
abline(v = quantile(kerncentres, 0.9))  # default threshold

# Effect of the GPD shape xi on the distribution function
plot(xx, pbckdengpdcon(xx, kerncentres, lambda = 0.5, bcmethod = "reflect"),
  xlab = "x", ylab = "F(x)", type = "l")
lines(xx, pbckdengpdcon(xx, kerncentres, lambda = 0.5, xi = 0.3, bcmethod = "reflect"),
  col = "red")
lines(xx, pbckdengpdcon(xx, kerncentres, lambda = 0.5, xi = -0.3, bcmethod = "reflect"),
  col = "blue")
legend("topleft", paste("xi =", c(0, 0.3, -0.3)),
  col = c("black", "red", "blue"), lty = 1, cex = 0.5)

# Simulate from the mixture and compare with the density
kerncentres = rweibull(1000, 2, 1)
x = rbckdengpdcon(1000, kerncentres, lambda = 0.1, phiu = TRUE, bcmethod = "reflect")
xx = seq(0.01, 3.5, 0.01)
hist(x, breaks = 100, freq = FALSE)
lines(xx, dbckdengpdcon(xx, kerncentres, lambda = 0.1, phiu = TRUE, bcmethod = "reflect"))

# Pre-specified tail fraction phiu = 0.1 with different shapes
lines(xx, dbckdengpdcon(xx, kerncentres, lambda = 0.1, xi = -0.2, phiu = 0.1, bcmethod = "reflect"),
  col = "red")
lines(xx, dbckdengpdcon(xx, kerncentres, lambda = 0.1, xi = 0.2, phiu = 0.1, bcmethod = "reflect"),
  col = "blue")
# Legend order matches the line colours: black xi = 0, red xi = -0.2, blue xi = 0.2
legend("topleft", c("xi = 0", "xi = -0.2", "xi = 0.2"),
  col = c("black", "red", "blue"), lty = 1)
# }