MESS (version 0.4-3)

feature.test: Inference for features identified by the Lasso

Description

Performs randomization tests of features identified by the Lasso

Usage

feature.test(x, y, B = 100, type.measure = "deviance", s = "lambda.min", keeplambda = FALSE, olsestimates = TRUE, penalty.factor = rep(1, nvars), alpha = 1, control = list(trace = FALSE, maxcores = 24), ...)

Arguments

x
input matrix, of dimension nobs x nvars; each row is an observation vector.
y
quantitative response variable of length nobs
B
The number of randomizations used in the computations
type.measure
loss to use for cross-validation. See cv.glmnet for more information
s
Value of the penalty parameter 'lambda' at which predictions are required. Defaults to "lambda.min" (see Usage). See coef.glmnet for more information
keeplambda
If set to TRUE, the lambda estimated by cross-validation on the original dataset is kept and reused when evaluating the subsequent randomization datasets. This reduces computation time substantially because cross-validation does not have to be repeated for each randomization. If set to a numeric value, that value is used as lambda. Defaults to FALSE
olsestimates
Logical. Should the test statistic be based on OLS estimates from the model fitted to the variables selected by the lasso? Defaults to TRUE. If set to FALSE, the coefficients from the lasso are used as test statistics.
penalty.factor
a vector of weights used for adaptive lasso. See glmnet for more information.
alpha
The elasticnet mixing parameter. See glmnet for more information.
control
A list of options that control the algorithm. Currently, trace is a logical; if set to TRUE, the function produces more output. maxcores sets the maximum number of cores to use with the parallel package
...
Other arguments passed to glmnet
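
As a sketch of how these arguments combine, the call below reuses the cross-validated lambda across randomizations (keeplambda = TRUE) and restricts the computation to one core. It assumes that a version of MESS exporting feature.test (such as 0.4-3) is installed together with glmnet. The seed, the value B = 50 and the simulated data are illustrative choices, not recommendations from the package.

```r
# Hedged sketch: only runs if an installed MESS exports feature.test
if (requireNamespace("MESS", quietly = TRUE) &&
    "feature.test" %in% getNamespaceExports("MESS")) {
  set.seed(1)                                  # illustrative seed
  x <- matrix(rnorm(30 * 100), nrow = 30)      # 30 observations, 100 features
  y <- rnorm(30, mean = x[, 1])                # only feature 1 matters

  res <- MESS::feature.test(
    x, y,
    B = 50,                                    # fewer randomizations than the default 100
    keeplambda = TRUE,                         # cross-validate once, reuse lambda
    control = list(trace = FALSE, maxcores = 1)
  )
  str(res)                                     # inspect the returned list
}
```

Both control entries are given explicitly here because the page does not say whether a partial list is merged with the defaults.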

Value

Returns a list with 7 elements:
p.full
The p-value for the test of the full set of variables selected by the lasso (based on the OLS estimates)
ols.selected
A vector of the indices of the non-zero variables selected by glmnet sorted from (numerically) highest to lowest based on their OLS test statistic.
p.maxols
The p-value for the maximum of the OLS test statistics
lasso.selected
A vector of the indices of the non-zero variables selected by glmnet sorted from (numerically) highest to lowest based on their absolute lasso coefficients.
p.maxlasso
The p-value for the maximum of the lasso test statistics
lambda.orig
The value of lambda used in the computations
B
The number of randomizations used
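
To make the ordering described for lasso.selected concrete, the following sketch computes a comparable index vector directly with glmnet. It illustrates the description above and is not the internal code of feature.test. The variable names (cvfit, beta, nonzero, selected) and the seed are assumptions made for this example.

```r
# Sketch only: mimics the ordering described for lasso.selected
library(glmnet)

set.seed(42)                                   # illustrative seed
x <- matrix(rnorm(30 * 100), nrow = 30)
y <- rnorm(30, mean = x[, 1])

# Cross-validated lasso (alpha = 1), as in the default arguments
cvfit <- cv.glmnet(x, y, type.measure = "deviance", alpha = 1)

# Coefficients at lambda.min, with the intercept (first row) dropped
beta <- unname(as.matrix(coef(cvfit, s = "lambda.min"))[-1, 1])

# Indices of non-zero coefficients, largest absolute value first
nonzero <- which(beta != 0)
selected <- nonzero[order(abs(beta[nonzero]), decreasing = TRUE)]
selected
```

If the lasso selects no variables at lambda.min, selected is an empty integer vector.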

References

Brink-Jensen, K and Ekstrom, CT 2014. Inference for feature selection using the Lasso with high-dimensional data. http://arxiv.org/abs/1403.4296

See Also

glmnet

Examples

# Simulate some data: 30 observations of 100 features,
# where only the first feature is associated with the response
x <- matrix(rnorm(30*100), nrow=30)
y <- rnorm(30, mean=1*x[,1])

# Make inference for features
## Not run: feature.test(x, y)
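
The idea behind a randomization test of the largest lasso coefficient can be sketched with glmnet alone. This is a simplified illustration of the principle, not the implementation used by feature.test. The helper max_abs_coef, the fixed lambda (comparable in spirit to keeplambda = TRUE) and the +1 correction in the p-value are assumptions made for this example.

```r
# Sketch only: permutation p-value for the maximum absolute lasso coefficient
library(glmnet)

# Hypothetical helper: largest absolute lasso coefficient at a given lambda
max_abs_coef <- function(x, y, lambda) {
  fit <- glmnet(x, y, alpha = 1)
  beta <- as.matrix(coef(fit, s = lambda))[-1, 1]  # drop the intercept
  max(abs(beta))
}

set.seed(1)                                    # illustrative seed
x <- matrix(rnorm(30 * 100), nrow = 30)
y <- rnorm(30, mean = x[, 1])

# Choose lambda once by cross-validation on the original data
lambda <- cv.glmnet(x, y, alpha = 1)$lambda.min
observed <- max_abs_coef(x, y, lambda)

# Permuting y breaks any association between x and y
B <- 100
null_stats <- replicate(B, max_abs_coef(x, sample(y), lambda))

# Share of randomized statistics at least as large as the observed one
p_value <- (1 + sum(null_stats >= observed)) / (B + 1)
p_value
```

Re-running cross-validation inside each randomization would correspond more closely to keeplambda = FALSE, at a higher computational cost.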

