vkgmss.test.bootstrap
Local test for the regression function
A local test for the regression function.
Usage
vkgmss.test.bootstrap(data.X, data.Y, linkfunction.H0, risk,
bandwidth = "optimal", kernel.function = kernel.function.epan,
bootstrap = c(50, "Mammen"), verbose = TRUE)
Arguments
- data.X
a numeric data vector used to obtain the nonparametric estimator of the error distribution.
- data.Y
a numeric data vector used to obtain the nonparametric estimator of the error distribution.
- linkfunction.H0
the regression function under the null hypothesis.
- risk
a numeric value specifying the risk of rejecting the null hypothesis when it is true. The value (1-risk) corresponds to the confidence level of the statistical test.
- bandwidth
the bandwidth used to obtain the nonparametric estimator of the error distribution. If bandwidth="optimal", the optimal bandwidth of the regression function under the null hypothesis is computed. Default option is "optimal".
- kernel.function
the kernel function used to obtain the nonparametric estimator of the error distribution. Default option is "kernel.function.epan".
- bootstrap
a vector of length 2. The first value specifies the number of bootstrap datasets (default is 50). The second value specifies the distribution used for the wild bootstrap resampling (default is "Mammen").
- verbose
If TRUE, the R function displays the optimal bandwidth value obtained under the null hypothesis. Default option is TRUE.
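The "Mammen" option of the bootstrap argument presumably refers to Mammen's two-point distribution, the usual multiplier distribution in the wild bootstrap literature. The sketch below draws such multipliers and uses them to perturb residuals around the null regression function. The helper rmammen is hypothetical and not part of the package, and the construction of the bootstrap responses is one common form that may differ from what the package does internally.

```r
# Sketch: wild bootstrap multipliers from Mammen's two-point distribution.
# rmammen is a hypothetical helper, not a function exported by the package.
rmammen <- function(n) {
  a <- -(sqrt(5) - 1) / 2              # lower support point
  b <-  (sqrt(5) + 1) / 2              # upper support point
  p <-  (sqrt(5) + 1) / (2 * sqrt(5))  # probability of the lower point
  ifelse(runif(n) < p, a, b)
}

set.seed(1)
v <- rmammen(1e5)
# First three empirical moments; in theory they equal 0, 1 and 1
c(mean = mean(v), second = mean(v^2), third = mean(v^3))

# One common way to build a wild bootstrap response (assumption):
# null regression function plus residuals multiplied by Mammen weights
x <- runif(25, min = 0, max = 5)
y <- 0.2 * x^2 - x + 2 + rnorm(25, mean = 0, sd = 0.3)
m0 <- function(x) 0.2 * x^2 - x + 2
res <- y - m0(x)
y.star <- m0(x) + res * rmammen(length(x))
```

The multipliers have mean 0 and variance 1, so each bootstrap response keeps the null regression function as its conditional mean while preserving the scale of the residuals.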
Details
From the data.X and data.Y datasets, wild bootstrap datasets (50 by default) are built. From each bootstrap dataset, a bootstrap test statistic is computed. The test statistic under the null hypothesis is compared to the distribution of the bootstrap statistics. The null hypothesis is rejected if the test statistic under the null hypothesis is greater than the (1-risk)-quantile of the empirical distribution of the bootstrap statistics.
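The rejection rule described above can be sketched with base R as follows. The numbers are placeholders chosen for illustration, not output of the package, and the p-value formula is one usual bootstrap estimate that is assumed here rather than taken from the package source.

```r
# Sketch of the rejection rule with placeholder values (not package output)
set.seed(1)
risk <- 0.05
stat.obs  <- 0.9        # hypothetical test statistic from the original data
stat.boot <- rexp(50)   # hypothetical bootstrap statistics (50 replicates)

# (1-risk)-quantile of the empirical distribution of the bootstrap statistics
crit <- quantile(stat.boot, probs = 1 - risk, names = FALSE)

# The null hypothesis is rejected when the statistic exceeds that quantile
reject <- stat.obs > crit

# A usual bootstrap p-value estimate (assumption): share of bootstrap
# statistics at least as large as the observed one
p.value <- mean(stat.boot >= stat.obs)
```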
An inappropriate bandwidth choice can produce "NaN" values in test statistics.
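One way a bandwidth can lead to "NaN" values is illustrated below with a kernel regression estimate: with a compactly supported kernel such as the Epanechnikov kernel, a bandwidth that is too small can leave no observation inside the kernel window, and the estimate becomes 0/0. The helpers epan and nw are hypothetical and written for this illustration only; the package's internal computations may differ.

```r
# Sketch: Nadaraya-Watson estimate with an Epanechnikov kernel.
# epan and nw are hypothetical helpers, not functions from the package.
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

nw <- function(x0, x, y, h) {
  w <- epan((x0 - x) / h)
  sum(w * y) / sum(w)    # 0/0 when no observation lies within h of x0
}

x <- c(0.5, 1.0, 4.0, 4.5)
y <- c(1.5, 1.2, 1.3, 1.6)

small <- nw(2.5, x, y, h = 0.5)   # no data within 0.5 of x0 = 2.5
large <- nw(2.5, x, y, h = 2.5)   # all four points receive positive weight
is.nan(small)
is.finite(large)
```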
Value
vkgmss.test.bootstrap returns a list containing the following components:
the statistical decision made on whether to reject the null hypothesis or not.
the bandwidth used to build the test statistic.
the p-value of the test.
the value of the test statistic.
References
I. Van Keilegom, W. Gonzalez Manteiga, and C. Sanchez Sellero. Goodness-of-fit tests in parametric regression based on the estimation of the error distribution. Test, 17, 401-415, 2008.
R. Azais, S. Ferrigno and M-J Martinez. cvmgof: An R package for Cramér-von Mises goodness-of-fit tests in regression models. 2018. Preprint in progress.
Examples
# NOT RUN {
set.seed(1)
# Data simulation
n = 25 # Dataset size
data.X = runif(n,min=0,max=5) # X
data.Y = 0.2*data.X^2-data.X+2+rnorm(n,mean=0,sd=0.3) # Y
########################################################################
# Test (bootstrap) under H0
# We want to test if the link function is f(x)=0.2*x^2-x+2
# The answer is yes (see the definition of data.Y above)
# We generate a dataset under H0 to estimate the optimal bandwidth under H0
linkfunction.H0 = function(x){0.2*x^2-x+2}
test_vkgmss.H0 = vkgmss.test.bootstrap(data.X,data.Y,linkfunction.H0,
0.05,bandwidth='optimal',bootstrap=c(50,'Mammen'))
########################################################################
# Test (bootstrap) under H1
# We want to test if the link function is f(x)=0.8*cos(x)+1
# The answer is no (see the definition of data.Y above)
linkfunction.H1=function(x){0.8*cos(x)+1}
test_vkgmss.H1 = vkgmss.test.bootstrap(data.X,data.Y,linkfunction.H1,
0.05,bandwidth='optimal',bootstrap=c(50,'Mammen'))
# }