scedastic.test: Test on the effect of the concomitant covariate on the extremes of the response variable

Description

Given observed data, performs a Kolmogorov-Smirnov type test comparing the cumulative distribution function of the concomitant covariate, defined as \(X \mid Y > t\) with \(t\) being the threshold, against the cumulative distribution function of the complete covariate \(X\).
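
As a rough illustration of the distance described above, the following base R sketch computes the maximum gap between the empirical distribution functions of the concomitant and complete covariate on a grid. It is not the package's internal code: the toy response `y`, the grid size, and the names `x_conc`, `F_conc` and `F_full` are assumptions made for the example.

```r
# Illustrative sketch only; simulated data and variable names are assumptions.
set.seed(1)
n <- 500
k <- 50
x <- seq(0, 1, length.out = n)           # complete covariate in [0, 1]
y <- rexp(n) * (1 + 2 * x)               # toy response whose scale grows with x
threshold <- sort(y, decreasing = TRUE)[k + 1]
x_conc <- x[y > threshold]               # concomitant covariate: X given Y > t
xg <- seq(0, 1, length.out = 150)        # evaluation grid for the covariate
F_conc <- ecdf(x_conc)(xg)               # empirical CDF of the concomitant covariate
F_full <- ecdf(x)(xg)                    # empirical CDF of the complete covariate
Delta <- max(abs(F_conc - F_full))       # Kolmogorov-Smirnov type distance
Delta
```

If the covariate had no effect on the extremes, the two empirical distribution functions would be expected to stay close and `Delta` would be small.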

Usage

scedastic.test(data, k, M = 1000L, xg, ng, bayes = TRUE, C = 5L, alpha = 0.05)

Value

A list with components:

Delta: maximum observed distance between the empirical distribution functions of the concomitant and complete covariate
DeltaM: vector of length M containing the sample of maximum distances between the empirical distribution functions of the concomitant and complete covariate
critical: double, critical value for the test statistic, computed as the \((1-\alpha)\) level empirical quantile of DeltaM
pval: double, p-value

Arguments

data: matrix of dimension n by 2 containing the complete data: the dependent variable (first column) and the covariate (second column), with the covariate taking values in [0,1]
k: integer, number of exceedances used for the generalized Pareto model
M: integer, number of samples to draw from the posterior distribution of the law of the concomitant covariate. Default: 1000
xg: covariate grid, an array of dimension ng by 1 containing a sequence between zero and the last value of the corresponding covariate
ng: length of covariate grid
bayes: logical indicating the bootstrap method. If FALSE, a frequentist bootstrap on the empirical cumulative distribution function of the concomitant covariate is performed. Default: TRUE
C: integer, hyperparameter entering the posterior distribution of the law of the concomitant covariate. Default: 5
alpha: double, significance level for the critical value of the test, computed as the \((1-\alpha)\) level empirical quantile of the sample of distances between the empirical cumulative distribution functions of the concomitant and complete covariate. Default: 0.05
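
The role of `alpha` can be sketched in base R as follows. The values of `DeltaM` and `Delta` below are placeholders standing in for the components returned by the function, and the final comparison is an assumed decision rule for illustration, not a statement of how the package reports its result.

```r
# Illustrative sketch only; DeltaM and Delta are placeholder values,
# not output produced by scedastic.test.
set.seed(2)
M <- 1000
alpha <- 0.05
DeltaM <- runif(M, 0, 0.3)   # stand-in for the sample of maximum distances
Delta <- 0.25                # stand-in for the observed maximum distance
# critical value: the (1 - alpha) level empirical quantile of DeltaM
critical <- quantile(DeltaM, probs = 1 - alpha, names = FALSE)
# assumed decision rule: flag a covariate effect when Delta exceeds the critical value
reject <- Delta > critical
```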

References

Dombry, C., S. Padoan and S. Rizzelli (2025). Asymptotic theory for Bayesian inference and prediction: from the ordinary to a conditional Peaks-Over-Threshold method, arXiv:2310.06720v2.

Examples

if (FALSE) {
# generate data
set.seed(1234)
n <- 500
samp <- evd::rfrechet(n,0,1:n,4)
# set effective sample size and threshold
k <- 50
threshold <- sort(samp,decreasing = TRUE)[k+1]
# preliminary mle estimates of scale and shape parameters
mlest <- evd::fpot(samp,
 threshold,
 control=list(maxit = 500))
# empirical bayes procedure
proc <- estPOT(
  samp,
  k = k,
  pn = c(0.01, 0.005),
  type = "continuous",
  method = "bayesian",
  prior = "empirical",
  start = as.list(mlest$estimate),
  sig0 = 0.1)
# conditional predictive density estimation
yg <- seq(0, 50, by = 2)
nyg <- length(yg)
# estimation of scedasis function
# setting
M <- 1e3
C <- 5
alpha <- 0.05
bw <- .5
nsim <- 5000
burn <- 1000
# create covariate
# in sample obs
n_in <- n
# number of years ahead
nY <- 1
n_out <- 365 * nY
# total obs
n_tot <- n_in + n_out
# total covariate (in+out sample period)
x <- seq(0, 1, length.out = n_tot)
# in sample grid dimension for covariate
ng_in <- 150
xg <- seq(0, x[n_in], length = ng_in)
# in+out of sample grid
xg <- c(xg,
 seq(x[n_in + 1],
     x[(n_tot)],
     length = ng_in))
# in+out sample grid dimension
nxg <- length(xg)
xg <- array(xg, c(nxg, 1))
# in sample observations
samp_in <- samp[1:n_in]
ssamp_in <- sort(samp_in, decreasing = TRUE, index.return = TRUE)
x_in <- x[1:n_in] # in sample covariate
xs <- x_in[ssamp_in$ix[1:k]] # in sample concomitant covariate
# test on covariate effect
test <- scedastic.test(
  cbind(samp, x[1:n]),
  k,
  M,
  array(xg[1:ng_in], c(ng_in, 1)),
  ng_in,
  TRUE,
  C,
  0.05
)
}