cv.relaxnet

cv.relaxnet cross-validates on the value of lambda for both the main model and for the relaxed models, a two-dimensional cross-validation. cv.alpha.relaxnet will in addition cross-validate on the value of alpha. For each value of alpha, relaxnet is run once on the whole data set, then it is run again v times on subsets of the rows of the data.
Usage

cv.relaxnet(x, y, family = c("gaussian", "binomial"), nlambda = 100,
            alpha = 1, relax = TRUE, relax.nlambda = 100,
            relax.max.vars = min(nrow(x), ncol(x)) * 0.8,
            lambda = NULL, relax.lambda.index = NULL,
            relax.lambda.list = NULL, nfolds = 10, foldid,
            multicore = FALSE, mc.cores, mc.seed = 123, ...)

cv.alpha.relaxnet(x, y, family = c("gaussian", "binomial"), nlambda = 100,
                  alpha = c(.1, .3, .5, .7, .9), relax = TRUE,
                  relax.nlambda = 100,
                  relax.max.vars = min(nrow(x), ncol(x)) * 0.8,
                  lambda = NULL, relax.lambda.index = NULL,
                  relax.lambda.list = NULL, nfolds = 10, foldid,
                  multicore = FALSE, mc.cores, mc.seed = 123, ...)
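As a quick orientation, a minimal sketch of calling each function (assuming x and y constructed as in the Examples section below; the alpha grid shown for cv.alpha.relaxnet is illustrative only):

## cross-validate over lambda only, at the default alpha = 1
cv1 <- cv.relaxnet(x, y)

## cross-validate over lambda and over a small grid of alpha values
cv2 <- cv.alpha.relaxnet(x, y, alpha = c(0.2, 0.5, 0.8))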
"sparseMatrix"
as in package Matrix
). Must have unique colnames.
family="gaussian"
. For family="binomial"
should be either a factor with two levels, or a two-column matrix of counts or proportions.
nlambda: The number of lambda values; default is 100. Determines how fine the grid of lambda values should be.

alpha: The elastic net mixing parameter (see glmnet). For cv.relaxnet, this should be a single value. For cv.alpha.relaxnet it should be a vector of values.

relax: Should relaxed model fits be included? Default is TRUE.

relax.nlambda: Like nlambda, but for the relaxed fits; default is 100.
relax.max.vars: The maximum number of variables allowed in the relaxed fits; the default is min(nrow(x), ncol(x)) * 0.8. If ncol(x) > nrow(x) and alpha < 1, it may make sense to use a value > nrow(x), but this may lead to increased computation time.
lambda: An optional user-supplied lambda sequence (see glmnet). The default is to let glmnet choose its own sequence.
relax.lambda.index: For use with a user-supplied lambda sequence. The default is to let relaxnet determine these values based on the beta matrix from the main glmnet fit. Ignored if the lambda argument is NULL.
relax.lambda.list: For use with a user-supplied lambda sequence. The default is to let relaxnet determine these values. Ignored if the lambda argument is NULL.
nfolds: The number of cross-validation folds; default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds = 3.
foldid: An optional vector of values between 1 and nfolds identifying the fold to which each observation belongs. If supplied, nfolds can be missing (illustrated in the sketch following this argument list).
multicore: Should execution be parallelized over cross-validation folds (for cv.relaxnet) or over alpha values (for cv.alpha.relaxnet) using multicore functionality from R's parallel package? (Illustrated in the sketch following this argument list.)
mc.cores: The number of processes to run in parallel. Works best if either the number of folds (for cv.relaxnet) or the length of alpha (for cv.alpha.relaxnet) is a multiple of mc.cores. Ignored if multicore is FALSE.
mc.seed: Integer seed to use for parallel random number generation (RNGkind will be called to set the RNG to "L'Ecuyer-CMRG"). Will be ignored if multicore is FALSE. If multicore is FALSE, one should be able to get reproducible results by setting the seed normally (with set.seed) prior to running.
...: Further arguments passed on to glmnet (via relaxnet). Setting standardize = FALSE will probably work correctly, but setting an offset probably won't.
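A minimal sketch illustrating several of the arguments above, assuming x and y constructed as in the Examples section below (the response cutoff, labels, fold count, core count, and seed are arbitrary, and parallel execution requires a platform supported by R's parallel package):

## binomial response: a two-level factor built from the first 5 predictors
y.bin <- factor(ifelse(rowSums(x[, 1:5]) + rnorm(nrow(x)) > 0, "yes", "no"))
cv.bin <- cv.relaxnet(x, y.bin, family = "binomial")

## user-specified fold assignments (nfolds can then be omitted)
my.foldid <- sample(rep(1:5, length.out = nrow(x)))
cv.folds <- cv.relaxnet(x, y, foldid = my.foldid)

## parallel execution over folds, with a seed for the parallel RNG
cv.par <- cv.relaxnet(x, y, multicore = TRUE, mc.cores = 2, mc.seed = 42)

## passing standardize = FALSE through to glmnet via ...
cv.nostd <- cv.relaxnet(x, y, standardize = FALSE)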
Value

For cv.relaxnet -- an object of class "cv.relaxnet" containing the following slots. If relax was FALSE, then several of the other elements of this result will be set to NA.
relaxnet.fit: The relaxnet fit to the full data set (the Examples below access its main.glmnet.fit component, and which.model.min below refers to its relax.glmnet.fits component).

cvm: The mean cross-validated error, a vector of length length(lambda). For main model.

cvsd: Estimate of the standard error of cvm, for main model.

Also reported are the upper curve (cvm + cvsd), the lower curve (cvm - cvsd), and the number of non-zero coefficients at each lambda, all for the main model.
"main"
if the main model "won" the cross-validation, and if not, it will be an integer specifying which relaxed model won (i.e. which element of relaxnet.fit$relax.glmnet.fits).
Also included are the value of lambda at which the minimum cross-validated error occurs (for the model given by which.model.min), and main.lambda.min and main.lambda.1se, the lambda values selected for the main model by the minimum and one-standard-error rules (as in cv.glmnet); the latter two are used in the Examples below.
For cv.alpha.relaxnet -- an object of class "cv.alpha.relaxnet". As above, if relax was FALSE, then several of the other elements of this result will be set to NA.
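For illustration, a brief sketch of inspecting a fitted object (cv.result is the cv.relaxnet object fitted in the Examples below; which.model.min, relaxnet.fit, main.lambda.min, and main.lambda.1se are the elements referred to above and in the Examples):

## which model won the cross-validation: "main" or the index of a relaxed fit
cv.result$which.model.min

## the full-data relaxnet fit stored in the result
class(cv.result$relaxnet.fit)

## lambda values chosen for the main model by the min and 1se rules
cv.result$main.lambda.min
cv.result$main.lambda.1se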
Note

cv.glmnet's type.measure argument has not yet been implemented. For family = "gaussian" models, mean squared error is used, and for family = "binomial" models, binomial deviance is used.
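As a reminder of what those error measures are, here is a small sketch using standard definitions (mse and binomial.deviance are hypothetical helper names, not functions from the package; yhat denotes predicted values and phat predicted probabilities):

## mean squared error, used for family = "gaussian"
mse <- function(y, yhat) mean((y - yhat)^2)

## mean binomial deviance, used for family = "binomial" (y coded 0/1)
binomial.deviance <- function(y, phat) {
  -2 * mean(y * log(phat) + (1 - y) * log(1 - phat))
}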
References

Jerome Friedman, Trevor Hastie and Rob Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software 33(1).

Nicolai Meinshausen (2007). Relaxed Lasso. Computational Statistics and Data Analysis 52(1), 374-393.
See Also

relaxnet, predict.cv.relaxnet
Examples

library(relaxnet)

## generate predictor matrix
nobs <- 100
nvars <- 200
set.seed(23)
x <- matrix(rnorm(nobs * nvars), nobs, nvars)
## make sure it has unique colnames
colnames(x) <- paste("x", 1:ncol(x), sep = "")
## let y depend on first 5 columns plus noise
y <- rowSums(x[, 1:5]) + rnorm(nrow(x))
## run cv.relaxnet
cv.result <- cv.relaxnet(x, y)
predict(cv.result, type = "nonzero")
## very few false positives compared to glmnet alone
## glmnet min rule
predict(cv.result$relaxnet.fit$main.glmnet.fit,
type = "nonzero",
s = cv.result$main.lambda.min)
## glmnet 1se rule
predict(cv.result$relaxnet.fit$main.glmnet.fit,
type = "nonzero",
s = cv.result$main.lambda.1se)
## get values of the coefs for cv.relaxnet's chosen fit
coefs <- drop(predict(cv.result, type = "coef"))
coefs[coefs != 0]
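One might also want predictions for new observations. The following sketch assumes that predict.cv.relaxnet accepts a newx matrix and type = "response", analogously to glmnet's predict methods; see the predict.cv.relaxnet help page for the authoritative interface. The newx matrix here is simulated purely for illustration.

## predictions for new observations from the selected model
newx <- matrix(rnorm(10 * nvars), 10, nvars)
colnames(newx) <- colnames(x)
pred <- predict(cv.result, newx = newx, type = "response")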