The main cross-validation function to select the best sprinter fit for a path of tuning parameters.
cv.sprinter(x, y, num_keep = NULL, square = FALSE, lambda = NULL,
  nlam = 100, lam_min_ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04),
  nfold = 5, foldid = NULL)
x: An n by p design matrix of main effects. Each row is an observation of p main effects.
y: A response vector of size n.
num_keep: Number of candidate interactions to keep in Step 2. If num_keep is not specified (the default), it is set to [n / log n].
square: Indicator of whether squared effects should be fitted in Step 1. Defaults to FALSE.
lambda: A user-specified list of tuning parameter values. Defaults to NULL, in which case the program computes its own lambda path based on nlam and lam_min_ratio.
nlam: The number of lambda values. Defaults to 100.
lam_min_ratio: The ratio of the smallest to the largest value in lambda. The largest value in lambda is usually the smallest value for which all coefficients are set to zero. Defaults to 0.01 when n < p and to 1e-04 otherwise.
nfold: Number of folds in cross-validation. Defaults to 5. If each fold gets too few observations, a warning is thrown and the minimal nfold = 3 is used.
foldid: A vector of length n indicating which fold each observation belongs to. Defaults to NULL, in which case the program generates a random assignment.
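The defaults described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the rounding implied by [n / log n] is not specified here, so only the unrounded value is computed; the random fold assignment and the log-spaced lambda grid are assumptions based on common practice; and `lam_max` is a hypothetical placeholder rather than a value computed from data.

```r
set.seed(1)
n <- 100   # sample size
p <- 200   # number of main effects

# num_keep default: [n / log n]. The rounding rule behind the brackets is
# not stated, so only the unrounded value is shown.
num_keep_raw <- n / log(n)

# lam_min_ratio default, copied from the usage line.
lam_min_ratio <- ifelse(n < p, 0.01, 1e-04)

# One common way to draw a balanced random fold assignment (assumption).
nfold <- 5
foldid <- sample(rep(seq_len(nfold), length.out = n))

# A log-spaced path from lam_max down to lam_max * lam_min_ratio
# (assumption; lam_max = 1 is a hypothetical placeholder).
nlam <- 100
lam_max <- 1
lambda <- exp(seq(log(lam_max), log(lam_max * lam_min_ratio),
                  length.out = nlam))
```

Supplying `foldid` explicitly, as in the sketch, is a simple way to make the fold assignment reproducible across runs.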
An object of S3 class "sprinter".
n: The sample size.
p: The number of main effects.
a0: Estimate of intercept corresponding to the CV-selected model.
compact: A compact representation of the selected variables. compact has three columns, with the first two columns representing the indices of a selected variable (main effects with first index = 0), and the last column representing the estimate of coefficients.
fit: The whole glmnet fit object in Step 3.
fitted: Fitted value of response corresponding to the CV-selected model.
lambda: The sequence of lambda values used.
cvm: The averaged estimated prediction error on the test sets over K folds.
cvsd: The standard error of the estimated prediction error on the test sets over K folds.
foldid: Fold assignment. A vector of length n.
ibest: The index in lambda that is chosen by CV.
call: Function call.
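The three-column layout of compact can be read as in the sketch below. The matrix is a hypothetical stand-in built by hand, not output from a real fit, and the column names and the `x1:x3` label format are illustrative choices rather than anything the package defines.

```r
# Hypothetical compact matrix in the documented layout:
# (first index, second index, coefficient); a main effect has first index 0.
compact <- rbind(
  c(0, 1,  1.0),   # main effect of variable 1
  c(0, 2, -2.0),   # main effect of variable 2
  c(1, 3,  3.0),   # interaction between variables 1 and 3
  c(4, 5, -4.0)    # interaction between variables 4 and 5
)
colnames(compact) <- c("index_1", "index_2", "coefficient")  # illustrative names

# Split rows by the documented rule: first index = 0 marks a main effect.
is_main <- compact[, 1] == 0
main_effects <- compact[is_main, , drop = FALSE]
interactions <- compact[!is_main, , drop = FALSE]

# Build readable labels such as "x2" or "x1:x3".
labels <- ifelse(is_main,
                 paste0("x", compact[, 2]),
                 paste0("x", compact[, 1], ":x", compact[, 2]))
selected <- data.frame(term = labels, coefficient = compact[, 3])
print(selected)
```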
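# NOT RUN {
library(sprintr)  # package providing cv.sprinter()
set.seed(1)       # make the simulated data and random folds reproducible
n <- 100
p <- 200
x <- matrix(rnorm(n * p), n, p)
# Two main effects (x1, x2) and two interactions (x1:x3, x4:x5), plus noise
y <- x[, 1] - 2 * x[, 2] + 3 * x[, 1] * x[, 3] - 4 * x[, 4] * x[, 5] + rnorm(n)
# Cross-validate over the default lambda path with 5 folds
mod <- cv.sprinter(x = x, y = y)
# }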
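Once a fit such as mod is available, the documented fields can be combined into a short summary. The helper below is a hypothetical convenience function, not part of the package; it assumes that cvm and cvsd are vectors aligned with lambda, so that ibest indexes all three, which the field descriptions suggest but do not state outright.

```r
# Hypothetical helper: summarises any object exposing the documented fields
# (lambda, ibest, cvm, cvsd, a0, compact, fitted).
summarize_cv_fit <- function(mod, y) {
  list(
    lambda_best = mod$lambda[mod$ibest],     # CV-selected lambda
    cv_error    = mod$cvm[mod$ibest],        # assumed aligned with lambda
    cv_se       = mod$cvsd[mod$ibest],       # assumed aligned with lambda
    intercept   = mod$a0,
    n_selected  = nrow(mod$compact),         # number of selected terms
    train_mse   = mean((y - mod$fitted)^2)   # in-sample mean squared error
  )
}

# Usage, with mod and y taken from the example:
# summarize_cv_fit(mod, y)
```

The in-sample error is only a sanity check; the cross-validated error in cvm is the better guide to predictive performance.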