
npregfast (version 1.3.0)

globaltest: Testing the equality of the M curves specific to each level

Description

This function can be used to test the equality of the $M$ curves specific to each level.

Usage

globaltest(formula, data = data, der, smooth = "kernel", weights = NULL,
  nboot = 500, h0 = -1, h = -1, nh = 30, kernel = "epanech", p = 3,
  kbin = 100, seed = NULL, cluster = TRUE, ncores = NULL, ...)

Arguments

formula
An object of class formula: a symbolic description of the model to be fitted. The details of model specification are given under 'Details'.
data
A data frame or matrix containing the model response variable and covariates required by the formula.
der
Number which determines any inference process. By default der is NULL. If this term is 0, the testing procedure is applied to the estimate. If it is 1 or 2, it is designed for the first or second derivative, respectively.
smooth
Type of smoother used: smooth = "kernel" for local polynomial kernel smoothers and smooth = "splines" for splines using the mgcv package.
weights
Prior weights on the data.
nboot
Number of bootstrap repeats.
h0
The kernel bandwidth smoothing parameter for the global effect (see references for more details on the estimation). Large values of the bandwidth lead to smoothed estimates; smaller values of the bandwidth lead to undersmoothed estimates. By default, cro
h
The kernel bandwidth smoothing parameter for the partial effects.
nh
Integer number of equally spaced bandwidths over which h is discretised, to speed up computation.
kernel
A character string specifying the desired kernel. Defaults to kernel = "epanech", where the Epanechnikov density function kernel will be used. Also, several types of kernel functions can be used: triangular and Gaussian density function,
p
Degree of polynomial to be used. Its value must be the value of derivative + 1. The default value is 3 because the function returns the estimate and its first and second derivatives.
kbin
Number of binning nodes over which the function is to be estimated.
seed
Seed to be used in the bootstrap procedure.
cluster
A logical value. If TRUE (default), the bootstrap procedure is parallelized (only for smooth = "splines"). Note that there are cases (e.g., a low number of bootstrap repetitions) in which R will gain in performance through serial
ncores
An integer value specifying the number of cores to be used in the parallelized procedure. If NULL (default), the number of cores to be used is equal to the number of cores of the machine - 1.
...
Other options.
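
The effect of the bandwidth described under h0 and h can be illustrated with a minimal base R sketch. It is not the package's implementation: it swaps in a local constant (Nadaraya-Watson) smoother for the local polynomial smoother, and the names epan, nw_smooth and roughness are hypothetical helpers introduced only for this illustration.

```r
# Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere.
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Local constant (Nadaraya-Watson) estimate of E[y | x] on a grid.
# Illustration only; npregfast uses local polynomial smoothers.
nw_smooth <- function(x, y, grid, h) {
  sapply(grid, function(x0) {
    w <- epan((x - x0) / h)   # kernel weights around the grid point
    sum(w * y) / sum(w)       # weighted average of the responses
  })
}

# Sum of squared second differences: larger values mean a wigglier curve.
roughness <- function(fit) sum(diff(fit, differences = 2)^2)

# Simulated data (made up for this sketch).
set.seed(1)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)
grid <- seq(0, 1, length.out = 50)

fit_small <- nw_smooth(x, y, grid, h = 0.05)  # small bandwidth: undersmoothed
fit_large <- nw_smooth(x, y, grid, h = 0.50)  # large bandwidth: smoother

# Compare the wiggliness of the two fits.
roughness(fit_small) > roughness(fit_large)
```

Plotting fit_small and fit_large against grid shows the same contrast visually: the small bandwidth follows the noise, while the large one flattens the curve.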

Value

  • The $T$ value and the $p$-value are returned. The decision of the global test (accepted or rejected) is also shown. The null hypothesis is rejected if the $p$-value $< 0.05$.

Details

globaltest can be used to test the equality of the $M$ curves specific to each level. This bootstrap based test assumes the following null hypothesis:

$$H_0^r: m_1^r(\cdot) = \ldots = m_M^r(\cdot)$$

versus the general alternative

$$H_1^r: m_i^r(\cdot) \ne m_j^r(\cdot) \quad \text{for some} \quad i, j \in \{1, \ldots, M\}.$$

Note that, if $H_0$ is not rejected, then the equality of critical points is also accepted.

To test the null hypothesis, a test statistic, $T$, based on direct nonparametric estimates of the curves is used.

If the null hypothesis is true, the $T$ value should be close to zero but is generally greater. The test rule based on $T$ consists of rejecting the null hypothesis if $T > T^{1- \alpha}$, where $T^p$ is the empirical $p$-percentile of $T$ under the null hypothesis. To obtain this percentile, we have used bootstrap techniques. See details in references.
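
The decision rule above can be sketched in base R. This is a minimal illustration, not the code used inside globaltest: bootstrap_decision is a hypothetical helper, the bootstrap replicates are made up, and the p-value is taken, as an assumption, to be the share of bootstrap replicates at least as large as the observed $T$.

```r
# Decide a bootstrap test from an observed statistic and its bootstrap
# replicates (hypothetical helper, not part of npregfast).
bootstrap_decision <- function(t_obs, t_boot, alpha = 0.05) {
  # Empirical (1 - alpha) percentile of T under the null hypothesis.
  critical <- unname(quantile(t_boot, probs = 1 - alpha))
  # Share of bootstrap replicates at least as large as the observed T.
  p_value <- mean(t_boot >= t_obs)
  list(
    critical = critical,
    p_value  = p_value,
    decision = if (t_obs > critical) "Rejected" else "Accepted"
  )
}

# Made-up bootstrap replicates, for illustration only.
t_boot <- (1:100) / 100

res_far  <- bootstrap_decision(t_obs = 0.995, t_boot = t_boot)
res_near <- bootstrap_decision(t_obs = 0.505, t_boot = t_boot)

res_far$decision   # observed T above the critical value
res_near$decision  # observed T below the critical value
```

In practice the replicates would come from recomputing $T$ on nboot bootstrap samples generated under the null hypothesis, which is the step the package performs internally.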

Note that the models fitted by the globaltest function are specified in a compact symbolic form. The ~ operator is basic in the formation of such models. An expression of the form y ~ model is interpreted as a specification that the response y is modelled by a predictor specified symbolically by model. The possible terms consist of a variable name, or a variable name and a factor name separated by the : operator. Such a term is interpreted as the interaction of the continuous variable and the factor. However, if smooth = "splines", the formula is based on the function formula.gam of the mgcv package.
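
As a small sketch of how such a formula is structured, base R can list the variables and terms of the formula used in the example on this page. Only base R functions are used here, and how globaltest itself processes the formula is not shown.

```r
# Formula from the package example: response DW, continuous covariate RC
# and factor F, combined with the ":" interaction operator.
f <- DW ~ RC:F

vars   <- all.vars(f)                    # variables named in the formula
labels <- attr(terms(f), "term.labels")  # the single interaction term

vars
labels
```

The formula is only inspected, never evaluated, so the barnacle data does not need to be loaded to run this.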

References

Sestelo, M. (2013). Development and computational implementation of estimation and inference methods in flexible regression models. Applications in Biology, Engineering and Environment. PhD Thesis, Department of Statistics and O.R. University of Vigo.

Examples

library(npregfast)
data(barnacle)
globaltest(DW ~ RC : F, data = barnacle, der = 1, seed = 130853, nboot = 100)
