Runs k-fold cross-validation on a grid of \(\lambda_0\) values. Records class accuracy and deviance for each \(\lambda_0\). Returns an object of class "cv_risk_mod".
cv_risk_mod(
X,
y,
weights = NULL,
beta = NULL,
a = -10,
b = 10,
max_iters = 10000,
tol = 1e-05,
nlambda = 25,
lambda_min_ratio = ifelse(nrow(X) < ncol(X), 0.01, 1e-04),
lambda0 = NULL,
nfolds = 10,
foldids = NULL,
parallel = FALSE,
shuffle = TRUE,
seed = NULL,
method = "annealscore"
)
An object of class "cv_risk_mod" with the following attributes:
Data frame containing a summary of deviance, accuracy, and AUC for each value of lambda0 (mean and SD). Also includes the number of nonzero coefficients that are produced by each lambda0 when fit on the full data.
Numeric value indicating the lambda0 that resulted in the highest mean AUC.
Numeric value indicating the largest lambda0 that had a mean AUC within one standard error of lambda_min.
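The selection rule described above can be sketched in base R. This is only an illustration on made-up numbers: the data frame `res`, its column names, and the choice of standard error as SD divided by the square root of the number of folds are assumptions, not the package's internals.

```r
# Hypothetical cross-validation summary (values and column names are invented).
res <- data.frame(
  lambda0  = c(1e-4, 1e-3, 1e-2, 1e-1),
  mean_auc = c(0.80, 0.83, 0.82, 0.70),
  sd_auc   = c(0.04, 0.05, 0.04, 0.06)
)
nfolds <- 10

# Assumption: standard error of the mean AUC is SD / sqrt(number of folds).
res$se_auc <- res$sd_auc / sqrt(nfolds)

# lambda0 with the highest mean AUC.
best <- which.max(res$mean_auc)
lambda_min <- res$lambda0[best]

# Largest lambda0 whose mean AUC is within one standard error of the best.
threshold <- res$mean_auc[best] - res$se_auc[best]
lambda_1se <- max(res$lambda0[res$mean_auc >= threshold])
```

Choosing the larger penalty within one standard error trades a small loss in mean AUC for a sparser model.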
X: Input covariate matrix with dimension \(n \times p\); every row is an observation.
y: Numeric vector for the (binomial) response variable.
weights: Numeric vector of length \(n\) with weights for each observation. By default, every observation receives equal weight.
beta: Starting numeric vector with \(p\) coefficients. Default starting coefficients are rounded coefficients from a logistic regression model.
a: Integer lower bound for coefficients (default: -10).
b: Integer upper bound for coefficients (default: 10).
max_iters: Maximum number of iterations (default: 10000).
tol: Tolerance for convergence (default: 1e-5).
nlambda: Number of lambda values to try (default: 25).
lambda_min_ratio: Smallest value for lambda, as a fraction of lambda_max (the smallest value for which all coefficients are zero). The default depends on the sample size (\(n\)) relative to the number of variables (\(p\)). If \(n \geq p\), the default is 0.0001, close to zero. If \(n < p\), the default is 0.01.
lambda0: Optional sequence of lambda values. By default, the function derives the lambda0 sequence from the data (see lambda_min_ratio).
nfolds: Number of folds (default: 10). Implied by foldids when foldids is provided.
foldids: Optional vector of values between 1 and nfolds.
parallel: If TRUE, cross-validation runs in parallel using foreach to increase efficiency (default: FALSE). The user must first register a parallel backend with a function such as registerDoParallel.
shuffle: Whether the order of coefficients is shuffled during coordinate descent (default: TRUE).
seed: An integer passed to set.seed() to seed the random number generator. By default, no seed is set.
method: A string that specifies which method ("riskcd" or "annealscore") to run (default: "annealscore").
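A minimal usage sketch with user-supplied folds and a user-supplied lambda0 grid. Several things here are assumptions rather than documented behaviour: that cv_risk_mod is available from an installed riskscores package, that foldids holds one fold label per observation, that a log-spaced grid is a reasonable way to build lambda0, and that the returned object can be indexed with `$lambda_min`. The simulated data and the value of `lambda_max` are invented for illustration.

```r
set.seed(1)

# Simulated data (illustrative only): n = 200 observations, p = 5 covariates.
n <- 200
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("x", seq_len(p))
y <- rbinom(n, size = 1, prob = plogis(X[, 1] - 0.5 * X[, 2]))

# One fold label per observation, balanced across 5 folds (assumed layout).
nfolds <- 5
foldids <- sample(rep(seq_len(nfolds), length.out = n))

# Log-spaced lambda0 grid. lambda_max is a made-up value here, not derived
# from the data; the ratio mirrors the documented lambda_min_ratio default.
lambda_max <- 0.1
ratio <- ifelse(nrow(X) < ncol(X), 0.01, 1e-04)
lambda0 <- exp(seq(log(lambda_max), log(lambda_max * ratio), length.out = 5))

# Only runs when the package is installed; nfolds is implied by foldids.
if (requireNamespace("riskscores", quietly = TRUE)) {
  cv <- riskscores::cv_risk_mod(X, y, lambda0 = lambda0, foldids = foldids)
  print(cv$lambda_min)  # assumed accessor for the selected lambda0
}
```

Fixing foldids keeps the fold assignment identical across runs, which makes results from different lambda0 grids or methods directly comparable.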