estimPVal: Estimate p-values for a model fitted by GAMBoost or GLMBoost

Description

Performs permutation-based p-value estimation for the optional covariates in a fit from GAMBoost or GLMBoost. Currently, only binary response models with linear effects are supported, and the components have to be selected with criterion="score".

Usage

estimPVal(object,x,y,permute.n=10,per.covariate=FALSE,parallel=FALSE, multicore=FALSE,trace=FALSE,...)

Arguments

object

fit object obtained from GAMBoost or GLMBoost.

x

n * p matrix of covariates with linear effect. This has to be the same matrix that was used as x.linear in the call to GAMBoost or as x in the call to GLMBoost.

y

response vector. This has to be the same vector that was used in the call to GAMBoost or GLMBoost.

permute.n

number of permutations employed for obtaining a null distribution.

per.covariate

logical value indicating whether a separate null distribution should be considered for each covariate. A larger number of permutations will be needed if this is wanted.

parallel

logical value indicating whether computations for obtaining a null distribution via permutation should be performed in parallel on a compute cluster. Parallelization is performed via the package snowfall, and the initialization function of this package, sfInit, should be called before calling estimPVal.

multicore

indicates whether computations in the permuted data sets should be performed in parallel, using package multicore. If TRUE, package multicore is employed using the default number of cores. A value larger than 1 is taken to be the number of cores that should be employed.

trace

logical value indicating whether progress in estimation should be indicated by printing the number of the permutation that is currently being evaluated.

...

miscellaneous parameters for the calls to GAMBoost

Value

Vector with p-value estimates, one value for each optional covariate with linear effect specified in the original call to GAMBoost or GLMBoost.

Details

As p-value estimates are based on permutations, random numbers are drawn for determining permutation indices. Therefore, the results depend on the state of the random number generator. Repeating the call with different generator states can be used to explore the variability due to random variation and helps to determine an adequate value for permute.n. A value of 100 should be sufficient, but this can be quite slow. If there is a considerable number of covariates, e.g., more than 100, a much smaller number of permutations, e.g., 10, might already work well. The estimates might also be negatively affected if only a small number of boosting steps (say
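The dependence on the random number generator state can be illustrated with a generic permutation test in base R. This is only a minimal sketch under stated assumptions: perm_pvalue() is a hypothetical helper, the absolute correlation is a stand-in test statistic, and the code does not reproduce the rank-based procedure of Binder et al. (2009) that estimPVal implements. It only shows why repeated runs differ and why a fixed seed makes a run repeatable.

```r
## Hypothetical helper (not part of GAMBoost): permutation p-value for one
## covariate, using the absolute correlation as a stand-in statistic.
perm_pvalue <- function(x, y, permute.n = 10) {
  observed <- abs(cor(x, y))
  # statistic under the null: response permuted, covariate left in place
  null_stats <- replicate(permute.n, abs(cor(x, sample(y))))
  # add-one correction keeps the estimate strictly above zero
  (1 + sum(null_stats >= observed)) / (permute.n + 1)
}

set.seed(1)
x_null <- runif(100, min = -1, max = 1)   # covariate unrelated to y
y <- rbinom(100, 1, 0.5)

## Same data, different random number generator states
set.seed(11); p_a <- perm_pvalue(x_null, y, permute.n = 10)
set.seed(12); p_b <- perm_pvalue(x_null, y, permute.n = 10)

## Resetting the seed reproduces an estimate exactly
set.seed(11); p_a_again <- perm_pvalue(x_null, y, permute.n = 10)

## Spread of repeated estimates for two choices of permute.n
spread <- sapply(c(10, 100), function(n)
  sd(replicate(20, perm_pvalue(x_null, y, permute.n = n))))
```

Comparing p_a with p_b, or the two entries of spread, gives a rough impression of how much of the estimate is due to the permutations alone. The same kind of comparison can be made with repeated calls to estimPVal.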

References

Binder, H., Porzelius, C. and Schumacher, M. (2009). Rank-based p-values for sparse high-dimensional risk prediction models fitted by componentwise boosting. FDM-Preprint Nr. 101, University of Freiburg, Germany.

Examples


## Not run: 
##  Generate some data
x <- matrix(runif(100*8,min=-1,max=1),100,8)
eta <- -0.5 + 2*x[,1] + 4*x[,3]
y <- rbinom(100,1,binomial()$linkinv(eta))

##  Fit a model with only linear components
gb1 <- GLMBoost(x,y,penalty=100,stepno=100,trace=TRUE,family=binomial(),criterion="score")

#   estimate p-values

p1 <- estimPVal(gb1,x,y,permute.n=10)

#   get a second vector of estimates for checking how large
#   random variation is

p2 <- estimPVal(gb1,x,y,permute.n=10)

plot(p1,p2,xlim=c(0,1),ylim=c(0,1),xlab="permute 1",ylab="permute 2")
## End(Not run)
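The parallel argument expects a snowfall cluster to be initialized before estimPVal is called. The following is a sketch of how such a call might look, assuming the GAMBoost and snowfall packages are installed. The choice of two CPUs is arbitrary, and the guard at the top simply skips the example when either package is missing.

```r
## Sketch only: runs only if both GAMBoost and snowfall are available.
ran_parallel <- FALSE
if (requireNamespace("GAMBoost", quietly = TRUE) &&
    requireNamespace("snowfall", quietly = TRUE)) {
  library(GAMBoost)
  library(snowfall)

  ## Same simulated data and fit as in the example above
  x <- matrix(runif(100 * 8, min = -1, max = 1), 100, 8)
  eta <- -0.5 + 2 * x[, 1] + 4 * x[, 3]
  y <- rbinom(100, 1, binomial()$linkinv(eta))
  gb1 <- GLMBoost(x, y, penalty = 100, stepno = 100,
                  family = binomial(), criterion = "score")

  sfInit(parallel = TRUE, cpus = 2)   # start the cluster first (2 CPUs assumed)
  p_par <- estimPVal(gb1, x, y, permute.n = 10, parallel = TRUE)
  sfStop()                            # shut the cluster down again
  ran_parallel <- TRUE
}
```

Calling sfStop() afterwards releases the worker processes. Whether parallel execution pays off depends on permute.n and on the cost of a single fit.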
