cvpre

cvpre performs k-fold cross-validation on the data set used to create the
ensemble, providing an estimate of predictive accuracy on future observations.

Usage

cvpre(object, k = 10, verbose = FALSE, pclass = 0.5,
      penalty.par.val = "lambda.1se", parallel = FALSE)
Arguments

object
    An object of class pre.

k
    integer. The number of cross-validation folds to be used.

verbose
    logical. Should progress of the cross-validation be printed to the command
    line?

pclass
    numeric. Only used for classification. Cut-off value for the predicted
    probabilities, used to assign observations to the second class (see the
    sketch following this argument list).

penalty.par.val
    character. Calculate cross-validated error for ensembles with the penalty
    parameter criterion yielding the minimum cv error ("lambda.min") or the cv
    error within 1 standard error of the minimum ("lambda.1se")? Alternatively,
    a numeric value may be specified, corresponding to one of the values of
    lambda in the sequence used by glmnet, for which the estimated cv error can
    be inspected by running object$glmnet.fit and plot(object$glmnet.fit).

parallel
    logical. Should parallel foreach be used? If TRUE, a parallel backend
    (e.g., doMC) must be registered beforehand; see the sketch following this
    list.
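The following sketch is not part of the original example and only illustrates
these arguments. It assumes a binary classification ensemble fitted on a
hypothetical two-class subset of iris, and a doParallel backend for
parallel = TRUE; adapt the data and object names to your own ensemble.

library(pre)

## Hypothetical binary outcome: two-class subset of iris
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)

set.seed(42)
iris.ens <- pre(Species ~ ., data = iris2)

## pclass: predicted probabilities >= 0.5 are assigned to the second class
iris.cv <- cvpre(iris.ens, k = 5, pclass = 0.5)

## penalty.par.val: inspect the lambda sequence before choosing a value
## other than the default "lambda.1se"
iris.ens$glmnet.fit
plot(iris.ens$glmnet.fit)
iris.cv.min <- cvpre(iris.ens, penalty.par.val = "lambda.min")

## parallel = TRUE requires a registered foreach backend (doParallel assumed here)
library(doParallel)
registerDoParallel(cores = 2)
iris.cv.par <- cvpre(iris.ens, parallel = TRUE)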
Value

A list with three components: $cvpreds (a vector of cross-validated predicted
y values), $ss (a vector indicating the cross-validation subsample each
training observation was assigned to) and $accuracy. For continuous outcomes,
$accuracy is a list with elements $MSE (mean squared error on test
observations) and $MAE (mean absolute error on test observations). For
classification, $accuracy is a list with elements $SEL (mean squared error on
the predicted probabilities), $AEL (mean absolute error on the predicted
probabilities), $MCR (average misclassification error rate) and $table (a
table with the proportions of (in)correctly classified observations per
class).
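A minimal sketch of inspecting this return value for a continuous outcome,
assuming airq.cv has been created as in the Examples section below:

airq.cv$accuracy$MSE    # cross-validated mean squared error
airq.cv$accuracy$MAE    # cross-validated mean absolute error
head(airq.cv$cvpreds)   # cross-validated predictions for the training observations
table(airq.cv$ss)       # number of observations assigned to each fold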
See Also

pre, plot.pre, coef.pre, importance, predict.pre, interact, print.pre
Examples

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])
airq.cv <- cvpre(airq.ens)