k-fold cross-validation for hierarchical regularized regression (xrnet)
tune_xrnet(
x,
y,
external = NULL,
unpen = NULL,
family = c("gaussian", "binomial"),
penalty_main = define_penalty(),
penalty_external = define_penalty(),
weights = NULL,
standardize = c(TRUE, TRUE),
intercept = c(TRUE, FALSE),
loss = c("deviance", "mse", "mae", "auc"),
nfolds = 5,
foldid = NULL,
parallel = FALSE,
control = list()
)
A list of class tune_xrnet with components:
cv_mean: mean cross-validated error for each penalty combination. Object returned is a vector if there is no external data (external = NULL) and a matrix if there is external data.
cv_sd: estimated standard deviation for cross-validated errors. Object returned is a vector if there is no external data (external = NULL) and a matrix if there is external data.
loss: loss function used to compute cross-validation error
opt_loss: the value of the loss function for the optimal cross-validated error
opt_penalty: first-level penalty value that achieves the optimal loss
opt_penalty_ext: second-level penalty value that achieves the optimal loss (if external data is present)
fitted_model: fitted xrnet object using all data, see xrnet for details of object
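As a rough illustration of how an optimal penalty pair relates to a matrix of mean cross-validated errors, the base R sketch below locates the minimum of a small made-up error matrix. The error values, the penalty grids (`penalty_grid`, `penalty_ext_grid`), and all `*_demo` names are hypothetical and are not produced by tune_xrnet. For a loss such as "auc", the maximum rather than the minimum would be the relevant optimum.

```r
# Hypothetical penalty grids (largest to smallest) for x and external
penalty_grid <- c(1, 0.1, 0.01)
penalty_ext_grid <- c(0.5, 0.05)

# Made-up matrix of mean cross-validated errors (filled column by column):
# rows index the first-level penalty, columns the second-level penalty
cv_mean_demo <- matrix(
  c(2.0, 1.2, 1.5,
    1.8, 0.9, 1.1),
  nrow = 3, ncol = 2
)

# Locate the smallest mean error and map it back to the penalty values
opt_idx <- which(cv_mean_demo == min(cv_mean_demo), arr.ind = TRUE)[1, ]
opt_loss_demo <- cv_mean_demo[opt_idx[["row"]], opt_idx[["col"]]]
opt_penalty_demo <- penalty_grid[opt_idx[["row"]]]
opt_penalty_ext_demo <- penalty_ext_grid[opt_idx[["col"]]]
```

With no external data the errors form a vector, and `which.min()` on that vector would play the same role.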
x: predictor design matrix of dimension \(n \times p\), matrix options include:
matrix
big.matrix
filebacked.big.matrix
sparse matrix (dgCMatrix)
y: outcome vector of length \(n\)
external: (optional) external data design matrix of dimension \(p \times q\), matrix options include:
matrix
sparse matrix (dgCMatrix)
unpen: (optional) unpenalized predictor design matrix, matrix options include:
matrix
family: error distribution for outcome variable, options include:
"gaussian"
"binomial"
penalty_main: specifies regularization object for x. See define_penalty for more details.
penalty_external: specifies regularization object for external. See define_penalty for more details.
weights: optional vector of observation-specific weights. Default is 1 for all observations.
standardize: indicates whether x and/or external should be standardized. Default is c(TRUE, TRUE).
intercept: indicates whether an intercept term is included for x and/or external. Default is c(TRUE, FALSE).
loss: loss function for cross-validation. Options include:
"deviance"
"mse" (Mean Squared Error)
"mae" (Mean Absolute Error)
"auc" (Area under the curve)
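To make the loss options concrete, here is a minimal base R sketch of the textbook definitions of these four measures. It is not taken from the xrnet source: the helper names (`mse`, `mae`, `binomial_deviance`, `auc`) and the toy data are hypothetical, and the exact scaling or averaging used inside the package is an assumption that may differ.

```r
# Mean squared error and mean absolute error for numeric predictions
mse <- function(y, pred) mean((y - pred)^2)
mae <- function(y, pred) mean(abs(y - pred))

# Average binomial deviance; y is coded 0/1, prob holds predicted probabilities
binomial_deviance <- function(y, prob) {
  -2 * mean(y * log(prob) + (1 - y) * log(1 - prob))
}

# AUC via the rank-sum (Mann-Whitney) identity; y is coded 0/1
auc <- function(y, prob) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(prob)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy continuous outcome
y_num <- c(1, 2, 3)
pred_num <- c(1, 2, 5)
mse_demo <- mse(y_num, pred_num)   # 4/3
mae_demo <- mae(y_num, pred_num)   # 2/3

# Toy binary outcome
y_bin <- c(0, 0, 1, 1)
prob_bin <- c(0.1, 0.4, 0.35, 0.8)
auc_demo <- auc(y_bin, prob_bin)   # 0.75
dev_demo <- binomial_deviance(y_bin, prob_bin)
```

Note that "mse", "mae", and "deviance" are errors to be minimized, whereas "auc" is larger for better models and is only meaningful for a binary outcome.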
nfolds: number of folds for cross-validation. Default is 5.
foldid: (optional) vector that identifies user-specified fold for each observation. If NULL, folds are automatically generated.
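A user-specified fold vector could be built along the following lines. This is a minimal base R sketch that assumes folds are labeled with the integers 1 to nfolds; the sample size and seed are arbitrary, and this is not necessarily how tune_xrnet generates folds when foldid = NULL.

```r
set.seed(123)   # arbitrary seed so the assignment is reproducible
n <- 23         # hypothetical number of observations
nfolds <- 5

# Repeat the labels 1..nfolds to length n, then shuffle them so that
# fold sizes differ by at most one observation
foldid <- sample(rep(seq_len(nfolds), length.out = n))

fold_sizes <- table(foldid)
```

Fixing foldid in this way keeps the fold assignment identical across repeated calls, which is useful when comparing different penalty or loss settings on the same splits.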
parallel: use foreach function to fit folds in parallel if TRUE;
a cluster must be registered (doParallel) before using.
control: specifies xrnet control object. See xrnet_control for more details.
k-fold cross-validation is used to determine the 'optimal'
combination of hyperparameter values, where optimal means the best value of
the user-selected loss function, averaged across the k folds. To
efficiently traverse all combinations of the hyperparameter values,
'warm-starts' are used to move along the penalty path from the largest to the
smallest penalty value(s). Note that the penalty grid for the folds is generated
by fitting the model on the entire training data. Parallelization is enabled
through the foreach and doParallel R packages. To use
parallelization (parallel = TRUE), you must first create a cluster with
makeCluster and then register it with registerDoParallel.
See the parallel, foreach, and/or doParallel R packages
for more details on how to set up parallelization.
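The parallel setup described above might look roughly like the following sketch. It assumes the xrnet and doParallel packages are installed (the code is skipped otherwise) and reuses the GaussianExample data from the package; the worker count `n_workers` and the object names `cl` and `cv_par` are arbitrary, hypothetical choices.

```r
n_workers <- 2  # hypothetical number of worker processes

if (requireNamespace("xrnet", quietly = TRUE) &&
    requireNamespace("doParallel", quietly = TRUE)) {
  library(xrnet)
  data(GaussianExample)

  # Create the cluster, then register it as the foreach backend
  cl <- parallel::makeCluster(n_workers)
  doParallel::registerDoParallel(cl)

  # Fit the folds in parallel
  cv_par <- tune_xrnet(
    x = x_linear,
    y = y_linear,
    external = ext_linear,
    family = "gaussian",
    parallel = TRUE
  )

  # Release the worker processes when done
  parallel::stopCluster(cl)
}
```

Stopping the cluster afterwards avoids leaving idle R worker processes running in the background.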
## cross validation of hierarchical linear regression model
data(GaussianExample)
## 5-fold cross validation
cv_xrnet <- tune_xrnet(
x = x_linear,
y = y_linear,
external = ext_linear,
family = "gaussian",
control = xrnet_control(tolerance = 1e-6)
)
## contour plot of cross-validated error
plot(cv_xrnet)