Cross-Validation for Linear & Ridge Regression Models (RcppArmadillo & RcppParallel)
This package provides efficient implementations of cross-validation techniques for linear and ridge regression models, leveraging C++17 with RcppArmadillo and RcppParallel. It supports leave-one-out, generalized, and K-fold cross-validation methods, utilizing Singular Value Decomposition (SVD) and Complete Orthogonal Decomposition (COD) for high performance and numerical stability in high-dimensional settings.
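To make the role of the SVD concrete, here is a minimal base-R sketch of ridge regression computed through the decomposition and checked against the normal equations. It is an illustration only, not the package's C++ implementation. The centering of the predictors and response, and the names `shrink`, `beta.svd`, `beta.ne`, and `h`, are assumptions of this sketch.

```r
# Ridge regression through the SVD, checked against the normal equations.
# Base R only. The intercept is handled by centering, which is an assumption
# of this sketch and not necessarily how the package treats it.
data(mtcars)
X <- scale(as.matrix(mtcars[, -1]), center = TRUE, scale = FALSE)
y <- mtcars$mpg - mean(mtcars$mpg)
lambda <- 0.5

s <- svd(X)                                 # X = U diag(d) V'
shrink <- s$d / (s$d^2 + lambda)            # per-component shrinkage factor
beta.svd <- s$v %*% (shrink * crossprod(s$u, y))

# Reference: solve (X'X + lambda I) beta = X'y directly
beta.ne <- solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))

# Leverages of the ridge smoother on the centered data: diag(U W U'),
# with W = diag(d^2 / (d^2 + lambda))
h <- rowSums(sweep(s$u^2, 2, s$d^2 / (s$d^2 + lambda), `*`))

all.equal(c(beta.svd), c(beta.ne))
```

Once the decomposition is available, changing `lambda` only changes the shrinkage factors, so many penalty values can be evaluated without refactoring `X`.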
Dependencies
- Rcpp: Integration between R and C++.
- RcppParallel: Parallel computing support for Rcpp.
- RcppArmadillo: Integration between R and the Armadillo C++ library.
- RhpcBLASctl: Control of BLAS/LAPACK thread counts.
Requirements
- R
- Rcpp
- RcppParallel
- RcppArmadillo
- C++17 compatible compiler
Acknowledgments
This code is adapted and extended from various sources, leveraging the capabilities of the following:
- Rcpp by Dirk Eddelbuettel, Romain Francois, et al., for R and C++ integration.
- RcppParallel by JJ Allaire, Romain Francois, et al., for parallel computing support.
- RcppArmadillo by Dirk Eddelbuettel, Conrad Sanderson, et al., for high-performance linear algebra.
Please refer to the source files for detailed information and licenses.
Contributors
- Philip Nye (GitHub profile)
License
This code is released under the MIT License.
Example Usage
library(cvLM)
data(mtcars)
# 10-fold CV for a linear regression model
cvLM(mpg ~ ., data = mtcars, K.vals = 10)
# Comparing 5-fold, 10-fold, and Leave-One-Out CV configurations using 2 threads
cvLM(mpg ~ ., data = mtcars, K.vals = c(5, 10, nrow(mtcars)), n.threads = 2)
# Ridge regression with analytic GCV (using lm interface)
fitted.lm <- lm(mpg ~ ., data = mtcars)
cvLM(fitted.lm, data = mtcars, lambda = 0.5, generalized = TRUE)
# Grid search over lambda for ridge regression
grid.search(
  formula = mpg ~ .,
  data = mtcars,
  K = 5L,            # Use 5-fold CV
  max.lambda = 100,  # Search values between 0 and 100
  precision = 0.01   # Increment in steps of 0.01
)
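
The leave-one-out and generalized CV errors mentioned above can be obtained from a single fit rather than from n refits. The following base-R sketch checks the leave-one-out shortcut for ordinary least squares against a brute-force loop. It uses only `lm` and `hatvalues` from base R and is not the package's own code. The variable names are hypothetical, and the GCV line uses the textbook definition, which may differ in detail from what `cvLM` reports.

```r
# Closed-form leave-one-out CV for ordinary least squares, checked against
# a brute-force refit. Base R only; this is not the package's code.
data(mtcars)
fit <- lm(mpg ~ ., data = mtcars)

# Shortcut: LOOCV error from one fit, using residuals and hat-matrix diagonals
h <- hatvalues(fit)
loocv.shortcut <- mean((residuals(fit) / (1 - h))^2)

# Brute force: refit n times, each time predicting the held-out row
n <- nrow(mtcars)
press <- numeric(n)
for (i in seq_len(n)) {
  fit.i <- lm(mpg ~ ., data = mtcars[-i, ])
  press[i] <- mtcars$mpg[i] - predict(fit.i, newdata = mtcars[i, , drop = FALSE])
}
loocv.brute <- mean(press^2)

# Generalized CV (textbook form): replace each leverage by the average p / n
p <- length(coef(fit))
gcv <- mean((residuals(fit) / (1 - p / n))^2)

all.equal(loocv.shortcut, loocv.brute)
```

The shortcut needs one fit, while the loop needs one fit per observation, which is why analytic forms matter for large n.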