cvLM (version 2.0.0)

grid.search: Efficient Grid Search for Optimal Ridge Regularization

Description

Performs an optimized grid search to find the regularization parameter \(\lambda\) that minimizes the cross-validation metric for ridge regression.

Usage

grid.search(formula, data, subset, na.action, K = 10L, 
            generalized = FALSE, seed = 1L, n.threads = 1L, 
            tol = 1e-7, max.lambda = 10000, precision = 0.1, 
            center = TRUE)

Value

A list with the following components:

CV

the minimum cross-validation metric.

lambda

the value of \(\lambda\) associated with the minimum metric.

Arguments

formula

a formula specifying the model.

data

a data frame, list or environment containing the variables in the model. See model.frame.

subset

a specification of the rows/observations to be used. See model.frame.

na.action

a function indicating how NA values should be handled. See model.frame.

K

an integer specifying the number of folds. For Leave-One-Out CV, set K equal to the number of observations.

generalized

if TRUE, the Generalized Cross-Validation (GCV) statistic is computed instead of K-fold CV, and K is ignored.

seed

an integer used to initialize the random number generator for reproducible K-fold splits.

n.threads

an integer specifying the number of threads. For K-fold CV, parallelization occurs across folds; for GCV/LOOCV, it occurs across the lambda grid. Set to -1 to use the RcppParallel default (defaultNumThreads).

tol

numeric scalar specifying the tolerance for rank estimation in the SVD. See cvLM.

max.lambda

numeric scalar for the maximum \(\lambda\) value in the search grid.

precision

numeric scalar specifying the step size (increment) between \(\lambda\) values in the grid.

center

if TRUE (the default), the predictors and response are mean-centered, effectively excluding the intercept from the ridge penalty. See cvLM.

Details

grid.search is designed for high-performance parameter tuning. Unlike naive implementations that refit the model for every grid point, this function uses a Singular Value Decomposition (SVD) of the design matrix to evaluate the entire grid in closed form.

For Generalized Cross-Validation (GCV) and Leave-One-Out (LOOCV), the SVD is computed once. Each \(\lambda\) in the grid is then evaluated by updating only the diagonal "shrinkage" matrix, which reduces the cost per grid point from \(O(np^2)\) for a full refit to \(O(\min(n,p))\).
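To see why a single SVD suffices, the standard ridge algebra can be written out as below. This is a textbook sketch, not a statement of the exact formulas used inside cvLM: the symbols \(U\), \(d_j\), \(z_j\), \(r\) and \(\mathrm{df}(\lambda)\) are introduced here for illustration, and the package may normalize GCV differently or count one extra degree of freedom for an unpenalized intercept.

```latex
% Thin SVD of the (centered) design matrix, with rank r:
%   X = U D V^T,  D = diag(d_1, ..., d_r),  z = U^T y
\[
\hat{y}(\lambda) = U \,\mathrm{diag}\!\left(\frac{d_j^2}{d_j^2 + \lambda}\right) U^\top y,
\qquad
\mathrm{df}(\lambda) = \sum_{j=1}^{r} \frac{d_j^2}{d_j^2 + \lambda}
\]
\[
\mathrm{RSS}(\lambda) = \lVert y \rVert^2 - \sum_{j=1}^{r} z_j^2
  + \sum_{j=1}^{r} \left(\frac{\lambda}{d_j^2 + \lambda}\right)^{2} z_j^2,
\qquad
\mathrm{GCV}(\lambda) = \frac{\mathrm{RSS}(\lambda)/n}{\left(1 - \mathrm{df}(\lambda)/n\right)^{2}}
\]
```

Once \(d_j\) and \(z_j\) are stored, both sums have \(r \le \min(n,p)\) terms, so each new \(\lambda\) costs \(O(\min(n,p))\) operations.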

The search begins at \(\lambda = 0\) and increments by precision until max.lambda is reached (inclusive), so the defaults (max.lambda = 10000, precision = 0.1) evaluate 100,001 grid points. The function returns the \(\lambda\) that achieves the minimum cross-validation metric across the grid.
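The search described above can be sketched in base R for the GCV case. This is a minimal illustration, not the cvLM implementation: ridge_gcv_grid() is a hypothetical helper, and the rank tolerance rule and the extra degree of freedom for the intercept are assumptions.

```r
# Illustrative sketch only: GCV grid search for ridge regression using one SVD.
# ridge_gcv_grid() is a hypothetical helper and is not part of cvLM.
ridge_gcv_grid <- function(X, y, max.lambda = 100, precision = 0.1, tol = 1e-7) {
  # Mean-center predictors and response so the intercept is not penalized
  Xc <- scale(X, center = TRUE, scale = FALSE)
  yc <- y - mean(y)
  n <- nrow(Xc)

  s <- svd(Xc)                          # thin SVD, computed once
  keep <- s$d > tol * s$d[1]            # assumed rule for dropping tiny singular values
  d2 <- s$d[keep]^2
  z2 <- drop(crossprod(s$u[, keep, drop = FALSE], yc))^2
  rss0 <- sum(yc^2) - sum(z2)           # residual outside the column space of Xc

  # Grid from 0 to max.lambda (inclusive) in steps of precision
  lambdas <- seq(0, max.lambda, by = precision)
  gcv <- vapply(lambdas, function(l) {
    shrink <- d2 / (d2 + l)             # diagonal shrinkage factors
    rss <- rss0 + sum((1 - shrink)^2 * z2)
    df <- sum(shrink) + 1               # +1 for the intercept (an assumption)
    (rss / n) / (1 - df / n)^2
  }, numeric(1))

  best <- which.min(gcv)
  list(CV = gcv[best], lambda = lambdas[best])
}

# Usage on mtcars: drop the intercept column because the data are centered
X <- model.matrix(mpg ~ ., data = mtcars)[, -1]
res <- ridge_gcv_grid(X, mtcars$mpg, max.lambda = 100, precision = 0.1)
```

Only the vectors d2 and z2 are reused inside the loop, so each grid point costs a few vector operations rather than a refit. Values returned by this sketch need not match grid.search exactly, since the package may define the metric differently.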

See Also

cvLM

Examples

if (FALSE) {
data(mtcars)
grid.search(
  formula = mpg ~ ., 
  data = mtcars,
  K = 5L,           # Use 5-fold CV
  max.lambda = 100, # Search values between 0 and 100
  precision = 0.01  # Increment in steps of 0.01
)

grid.search(
  formula = mpg ~ ., 
  data = mtcars,
  K = nrow(mtcars),    # Use LOOCV
  max.lambda = 10000,  # Search values between 0 and 10000
  precision = 0.5      # Increment in steps of 0.5
)
}
