crossval: Cross-validation function for quadrupen fitting methods.

Description

Function that computes K-fold (double) cross-validated error of a quadrupen fit. If no lambda2 is provided, simple cross validation on the lambda1 parameter is performed. If a vector lambda2 is passed as an argument, double cross-validation is performed.

Usage

crossval(
  x,
  y,
  penalty = c("elastic.net", "bounded.reg"),
  K = 10,
  folds = split(sample(1:nrow(x)), rep(1:K, length = nrow(x))),
  lambda2 = 0.01,
  verbose = TRUE,
  mc.cores = 1,
  ...
)

Value

An object of class "cvpen" for which a plot method is available.

Arguments

x: matrix of features, possibly sparsely encoded (experimental). Do NOT include intercept.
y: response vector.
penalty: a string for the fitting procedure used for cross-validation. Either "elastic.net" or "bounded.reg", at the moment. Default is elastic.net.
K: integer indicating the number of folds. Default is 10.
folds: list of K vectors that describes the folds to use for the cross-validation. By default, the folds are randomly sampled with the specified K. The same folds are used for each values of lambda2.
lambda2: tunes the \(\ell_2\)-penalty (ridge-like) of the fit. If none is provided, the default scalar value of the corresponding fitting method is used and a simple CV is performed. If a vector of values is given, double cross-validation is performed (both on lambda1 and lambda2, using the same folds for each lambda2).
verbose: logical; indicates if the progression (the current lambda2) should be displayed. Default is TRUE.
mc.cores: the number of cores to use. The default uses 1 core.
...: additional parameters to overwrite the defaults of the fitting procedure identified by the 'penalty' argument. See the corresponding documentation (elastic.net or bounded.reg).

Examples

Run this code

## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor  <- 0.75
Soo  <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variable
Sww  <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)

## Use fewer lambda1 values by overwritting the default parameters
## and cross-validate over the sequences lambda1 and lambda2
# \donttest{
cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50)
# }
## Rerun simple cross-validation with the appropriate lambda2
cv.10K <- crossval(x,y, lambda2=0.2)
## Try leave one out also
cv.loo <- crossval(x,y, K=n, lambda2=0.2)

# \donttest{
plot(cv.double)
# }
plot(cv.10K)
plot(cv.loo)

## Performance for selection purpose
beta.min.10K <- slot(cv.10K, "beta.min")
beta.min.loo <- slot(cv.loo, "beta.min")

cat("\nFalse positives with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K)))
cat("\nFalse positives with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))

Run the code above in your browser using DataLab

Description

Usage

Value

Arguments

See Also

Examples