
cxreg (version 1.0.0)

cv.classo: Cross-validation for classo

Description

Performs k-fold cross-validation for classo, produces a plot, and returns a value for lambda.

Usage

cv.classo(
  x,
  y,
  weights = NULL,
  lambda = NULL,
  nfolds = 10,
  foldid = NULL,
  alignment = c("lambda", "fraction"),
  keep = FALSE,
  parallel = FALSE,
  trace.it = 0,
  ...
)

Value

An object of class "cv.classo" is returned, which is a list with the ingredients of the cross-validation fit.

lambda

the values of lambda used in the fits.

cvm

The mean cross-validated error - a vector of length length(lambda).

cvsd

estimate of standard error of cvm.

cvup

upper curve = cvm+cvsd.

cvlo

lower curve = cvm-cvsd.

nzero

number of non-zero coefficients at each lambda.

name

a text string indicating the type of measure (for plotting purposes).

classo.fit

a fitted classo object for the full data.

lambda.min

value of lambda that gives minimum cvm.

lambda.1se

largest value of lambda such that error is within 1 standard error of the minimum.

fit.preval

if keep=TRUE, this is the array of pre-validated fits. Some entries can be NA if that value of lambda and the subsequent ones are not reached for that fold.

foldid

if keep=TRUE, the fold assignments used.

index

a one-column matrix with the indices of lambda.min and lambda.1se in the sequence of coefficients, fits, etc.
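The relationship between cvm, cvsd, lambda.min, lambda.1se and index can be illustrated with a small base R sketch. The lambda, cvm and cvsd vectors below are synthetic stand-ins rather than output from cv.classo, and the row names of the index matrix are an assumption made for readability.

```r
# Synthetic stand-ins for the documented components lambda, cvm and cvsd.
lambda <- exp(seq(log(1), log(0.01), length.out = 21))  # decreasing sequence
cvm    <- (log(lambda) - log(0.1))^2 + 1                 # error curve, minimum at lambda = 0.1
cvsd   <- rep(0.3, length(lambda))

# lambda.min: the lambda with the smallest mean cross-validated error.
i_min      <- which.min(cvm)
lambda_min <- lambda[i_min]

# lambda.1se: the largest lambda whose error is within one standard error
# of the minimum.
threshold  <- cvm[i_min] + cvsd[i_min]
lambda_1se <- max(lambda[cvm <= threshold])
i_1se      <- match(lambda_1se, lambda)

# One-column matrix of positions, mirroring the documented index component
# (row names are illustrative).
index <- matrix(c(i_min, i_1se), ncol = 1,
                dimnames = list(c("min", "1se"), NULL))
```

Because lambda.1se is the largest qualifying value, it is never smaller than lambda.min and therefore corresponds to a model that is at least as heavily regularized.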

Arguments

x

x matrix as in classo.

y

response y as in classo.

weights

Observation weights; defaults to 1 per observation.

lambda

Optional user-supplied lambda sequence; default is NULL, and classo chooses its own sequence. Note that this is done for the full model (master sequence), and separately for each fold. The fits are then aligned using the master sequence (see the alignment argument for additional details). Adapting lambda for each fold leads to better convergence. When lambda is supplied, the same sequence is used everywhere.

nfolds

number of folds; default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds=3.

foldid

an optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing.

alignment

This is an experimental argument, designed to fix the problems users were having with CV, with possible values "lambda" (the default) or "fraction". With "lambda", the lambda values from the master fit (on all the data) are used to line up the predictions from each of the folds. In some cases this can give strange values, since the effective lambda values in each fold could be quite different. With "fraction", the predictions in each fold are lined up according to the fraction of progress along the regularization path. If a lambda argument is also provided in the call, alignment="fraction" is ignored (with a warning).

keep

If keep=TRUE, a prevalidated array is returned containing fitted values for each observation and each value of lambda. Each of these fits is computed with that observation and the rest of its fold omitted. The foldid vector is also returned. Default is keep=FALSE.

parallel

If TRUE, use parallel foreach to fit each fold. A parallel backend, such as doMC, must be registered beforehand. This option is currently unavailable.

trace.it

If trace.it=1, then progress bars are displayed; useful for big models that take a long time to fit. Tracing is limited if parallel=TRUE.

...

Other arguments that can be passed to classo.
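As an illustration of the foldid argument, the following base R sketch builds a balanced random fold assignment. The sizes n and nfolds are chosen for the illustration only, and the commented call at the end assumes x and y are defined as in the Examples section.

```r
set.seed(1010)
n      <- 1000
nfolds <- 10

# Balanced random fold labels: each label in 1..nfolds is repeated to
# length n, then shuffled so that fold membership is random.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)  # fold sizes

# Reusing the same foldid across calls keeps the folds fixed, e.g.
# cv.fit <- cv.classo(x, y, foldid = foldid)
```

Fixing foldid in this way is one option for making results comparable across calls, since the random fold selection is then taken out of the comparison.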

Author

Navonil Deb, Younghoon Kim, Sumanta Basu
Maintainer: Younghoon Kim yk748@cornell.edu

Details

The function runs classo nfolds+1 times: the first run gets the lambda sequence, and the remaining runs compute the fit with each of the folds omitted. The error is accumulated, and the average error and standard deviation over the folds are computed.
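The nfolds+1 fitting scheme described above can be sketched in base R. This is a rough illustration under stated assumptions, not the package's implementation: a complex ridge regression (the hypothetical helper ridge_fit) stands in for classo, the lambda sequence is fixed by hand, observation weights are ignored, and the standard error is taken as the plain standard deviation across folds divided by sqrt(nfolds).

```r
set.seed(1010)
n <- 200; p <- 10; nfolds <- 5

# Complex-valued design and response, in the spirit of the Examples section.
x <- matrix(complex(real = rnorm(n * p), imaginary = rnorm(n * p)), n, p)
b <- c(1 - 0.5i, -1 + 2i, rep(0, p - 2))
y <- drop(x %*% b) + complex(real = rnorm(n), imaginary = rnorm(n))

# Stand-in estimator (an assumption, NOT classo): complex ridge regression.
ridge_fit <- function(x, y, lam) {
  xh <- Conj(t(x))  # conjugate transpose
  solve(xh %*% x + lam * nrow(x) * diag(ncol(x)), xh %*% y)
}

lambda <- 10^seq(1, -3, length.out = 15)  # hand-picked sequence for the sketch
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Fit 1 of nfolds + 1: the full-data fit, one coefficient column per lambda.
full_fit <- sapply(lambda, function(l) ridge_fit(x, y, l))

# Fits 2 to nfolds + 1: omit each fold in turn, score on the held-out rows.
fold_err <- matrix(NA_real_, nfolds, length(lambda))
for (k in seq_len(nfolds)) {
  held_out <- foldid == k
  for (j in seq_along(lambda)) {
    beta <- ridge_fit(x[!held_out, , drop = FALSE], y[!held_out], lambda[j])
    resid <- y[held_out] - x[held_out, , drop = FALSE] %*% beta
    fold_err[k, j] <- mean(Mod(resid)^2)
  }
}

# Average error and a simple standard error over the folds.
cvm  <- colMeans(fold_err)
cvsd <- apply(fold_err, 2, sd) / sqrt(nfolds)
```

The squared modulus of the residual is used as the error measure here because the response is complex-valued; the measure actually used by cv.classo is not specified on this page.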

Note that the results of cv.classo are random, since the folds are selected at random. Users can reduce this randomness by running cv.classo many times, and averaging the error curves.
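Averaging error curves over repeated runs might look like the sketch below. The helper average_cv_curves is hypothetical, and the runs are mock lists with lambda and cvm components standing in for real cv.classo results; the commented lines show how real runs could be produced, assuming a shared lambda sequence is passed so that the curves line up.

```r
# Average the cvm curves of several CV runs that share one lambda sequence.
# (average_cv_curves is a hypothetical helper, not part of cxreg.)
average_cv_curves <- function(cv_runs) {
  lambda <- cv_runs[[1]]$lambda
  same <- vapply(cv_runs,
                 function(r) isTRUE(all.equal(r$lambda, lambda)),
                 logical(1))
  stopifnot(all(same))                           # curves must be aligned
  cvm_mat <- sapply(cv_runs, function(r) r$cvm)  # one column per run
  cvm_avg <- rowMeans(cvm_mat)
  list(lambda = lambda, cvm = cvm_avg,
       lambda.min = lambda[which.min(cvm_avg)])
}

# Mock runs: a common error curve plus run-to-run noise.
set.seed(1)
lambda <- exp(seq(0, -4, length.out = 9))
mock_run <- function() {
  list(lambda = lambda,
       cvm = (log(lambda) + 2)^2 + 1 + rnorm(length(lambda), sd = 0.05))
}
runs <- replicate(20, mock_run(), simplify = FALSE)
avg  <- average_cv_curves(runs)

# With cxreg installed, the runs could instead come from, e.g.:
# lam  <- cv.classo(x, y)$lambda
# runs <- replicate(20, cv.classo(x, y, lambda = lam), simplify = FALSE)
```

Supplying the same lambda to every run matters because, as noted under the lambda argument, a user-supplied sequence is used everywhere, so the averaged values refer to the same penalty levels.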

See Also

classo, and the plot and coef methods for "cv.classo".

Examples

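library(cxreg)

set.seed(1010)
n = 1000
p = 200
# Complex-valued design matrix; each column is scaled to unit mean squared modulus
x = array(rnorm(n*p), c(n,p)) + (1+1i) * array(rnorm(n*p), c(n,p))
for (j in 1:p) x[,j] = x[,j] / sqrt(mean(Mod(x[,j])^2))
# Complex noise and a sparse coefficient vector (only the first two entries are non-zero)
e = rnorm(n) + (1+1i) * rnorm(n)
b = c(1, -1, rep(0, p-2)) + (1+1i) * c(-0.5, 2, rep(0, p-2))
y = x %*% b + e
# Cross-validation with the default nfolds = 10
cv.test = cv.classo(x,y)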
