
FWDselect (version 1.1)

test: Bootstrap based test for covariate selection

Description

Function that applies a bootstrap based test for covariate selection. It helps to determine the number of variables to be included in the model.

Usage

test(x, y, method = "lm", family = "gaussian", nboot = 50, speedup=TRUE,
 unique=FALSE, num.h0=1)

Arguments

x
A data frame containing all the covariates.
y
A vector with the response values.
method
A character string specifying which regression method is used, i.e., linear models ("lm"), generalized linear models ("glm") or generalized additive models ("gam").
family
A character string naming the family that specifies the distribution and link to use in fitting: "gaussian", "binomial" or "poisson".
nboot
Number of bootstrap repeats.
speedup
A logical value. If TRUE (default), the testing procedure is accelerated by a minor change in the statistic.
unique
A logical value. If TRUE, the test is performed only for one null hypothesis, given by the argument num.h0.
num.h0
If unique is TRUE, num.h0 is the integer number $q$ of $H_0(q)$ to be tested.

Value

  • Hypothesis: Number of the null hypothesis tested
  • Statistic: Value of the T statistic
  • pvalue: p-value obtained in the testing procedure
  • Decision: Result of the test for a significance level of 0.05

Details

In a regression framework, let $X_1, X_2, \ldots, X_p$ be a set of $p$ initial variables and $Y$ the response variable. We propose a procedure to test the null hypothesis that at most $q$ variables are significant in the model against the alternative that the model contains more than $q$ variables. Based on the general model

$$Y=m(\textbf{X})+\varepsilon \quad {\rm{where}} \quad m(\textbf{X})= m_{1}(X_{1})+m_{2}(X_{2})+\ldots+m_{p}(X_{p})$$

the following strategy is considered: for a subset of size $q$, we test the null hypothesis

$$H_{0}(q): \sum_{j=1}^p I_{\{m_j \ne 0\}} \le q$$

versus the general alternative hypothesis

$$H_{1}: \sum_{j=1}^p I_{\{m_j \ne 0\}} > q$$
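
As a small worked illustration of how the indicator sum is read (the values below are hypothetical and not taken from the package), suppose $p = 3$ with $m_1 \ne 0$, $m_2 \ne 0$ and $m_3 = 0$. Then

$$\sum_{j=1}^{3} I_{\{m_j \ne 0\}} = 1 + 1 + 0 = 2$$

so $H_0(1)$ is false, because $2 > 1$, while $H_0(2)$ is true, because $2 \le 2$. In this setting one would expect the test to reject for $q = 1$ and not to reject for $q = 2$, which points to a model with two covariates.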

References

Sestelo, M., Villanueva, N. M. and Roca-Pardinas, J. (2013). FWDselect: an R package for selecting variables in regression models. Discussion Papers in Statistics and Operation Research, 13/02.

See Also

selection

Examples

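library(FWDselect)
data(pollution)
# Column 19 is used as the response; the remaining columns are the covariates
x <- pollution[, -19]
y <- pollution[, 19]
# Only 5 bootstrap repeats are requested here (the default is 50)
test(x, y, method = "lm", nboot = 5)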
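
The following is a minimal sketch of testing a single null hypothesis through the `unique` and `num.h0` arguments described above. The simulated data, the object names (`x1` to `x5`, `res`) and the choice of `num.h0 = 2` are assumptions made for illustration only; they are not part of the package, and the output will depend on the installed version.

```r
# Simulated data: only x1 and x2 influence the response (an assumption
# made for illustration; this is not the pollution data).
set.seed(1)
n <- 100
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n), x5 = rnorm(n))
y <- 2 * x$x1 - 1.5 * x$x2 + rnorm(n)

# Test only H_0(2), i.e. "at most two covariates are needed".
# The call is skipped if FWDselect is not installed.
if (requireNamespace("FWDselect", quietly = TRUE)) {
  res <- FWDselect::test(x, y, method = "lm", nboot = 5,
                         unique = TRUE, num.h0 = 2)
  print(res)
}
```

With so few bootstrap repeats the p-value is very coarse; a larger `nboot` would normally be used outside of a quick demonstration.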
