comp_pred provides a wrapper for running alternative classification algorithms
on data, i.e., for fitting them to data.train or for using them to predict
data.test.
comp_pred(
formula,
data.train,
data.test = NULL,
algorithm = NULL,
model = NULL,
sens.w = NULL,
new.factors = "exclude",
quiet_mis = FALSE
)
formula: A formula (usually x$formula, for an FFTrees object x).
data.train: A training dataset (as a data frame).
data.test: A testing dataset (as a data frame).
algorithm: A character string specifying an algorithm in the set:
"lr": Logistic regression (using glm from stats with family = "binomial");
"rlr": Regularized logistic regression (currently not supported);
"cart": Decision trees (using rpart from rpart);
"svm": Support vector machines (using svm from e1071);
"rf": Random forests (using randomForest from randomForest).
model: An optional existing (already fitted) model, to be applied to the test data.
sens.w: Sensitivity weight parameter (numeric, from 0 to 1), required to compute weighted accuracy (wacc).
new.factors: What should be done if new factor values are discovered in the test set (as a character string)? Available options:
"exclude": exclude case (i.e., remove these cases, used by default);
"base": predict the base rate of the criterion.
quiet_mis: A logical value passed to hide/show NA user feedback
(usually x$params$quiet$mis of the calling function).
Default: quiet_mis = FALSE (i.e., show user feedback).
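
Two of the arguments above, sens.w and new.factors, can be illustrated with a short base R sketch. It is not the code used inside comp_pred: wacc_sketch is a hypothetical helper, the toy data are made up, and taking the base rate from the training data is an assumption. Weighted accuracy is computed here as the sens.w-weighted average of sensitivity and specificity.

```r
# Hypothetical helper (not part of FFTrees): weighted accuracy as a
# sens.w-weighted average of sensitivity and specificity.
wacc_sketch <- function(sens, spec, sens.w = 0.5) {
  stopifnot(sens.w >= 0, sens.w <= 1)
  sens.w * sens + (1 - sens.w) * spec
}

wacc_equal <- wacc_sketch(sens = 0.80, spec = 0.60, sens.w = 0.50)
wacc_sens  <- wacc_sketch(sens = 0.80, spec = 0.60, sens.w = 0.75)

# Toy data (made up): the test set contains a factor value ("green")
# that never occurs in the training set.
data_train <- data.frame(color = factor(c("red", "blue", "red", "blue")),
                         crit  = c(TRUE, FALSE, TRUE, FALSE))
data_test  <- data.frame(color = factor(c("red", "green", "blue")),
                         crit  = c(TRUE, TRUE, FALSE))

# Flag test cases whose factor value was seen during training.
is_known <- as.character(data_test$color) %in% levels(data_train$color)

# "exclude": drop the cases with new factor values.
data_test_excl <- data_test[is_known, , drop = FALSE]

# "base" (assumption: base rate taken from the training data), which
# could serve as the prediction for the cases with new factor values.
base_rate <- mean(data_train$crit)
```

With sens.w = 0.5, sensitivity and specificity count equally; larger values of sens.w shift the weight towards sensitivity.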
The range of competing algorithms currently available includes
logistic regression (stats::glm),
CART (rpart::rpart),
support vector machines (e1071::svm), and
random forests (randomForest::randomForest).
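
As a rough illustration of the fit-then-predict pattern that such a wrapper covers, the following sketch runs the logistic regression case directly with stats::glm. It does not call comp_pred itself. The mtcars data, the arbitrary row split, and the 0.5 classification threshold are assumptions made for this example only.

```r
# Split a built-in dataset into training and test sets (rows chosen arbitrarily).
data_train <- mtcars[1:22, ]
data_test  <- mtcars[23:32, ]

# Fit a logistic regression to the training data; am (0/1) is the criterion.
fit <- stats::glm(am ~ wt, data = data_train, family = "binomial")

# Predict probabilities for the test data and classify with an
# assumed threshold of 0.5.
prob_test <- stats::predict(fit, newdata = data_test, type = "response")
pred_test <- as.integer(prob_test > 0.5)

# Cross-tabulate predictions against the true criterion values.
conf_table <- table(predicted = pred_test, actual = data_test$am)
```

The other algorithms follow the same two steps (fit on data.train, predict on data.test), but each package has its own fitting function and predict method.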
Support for handling missing data (NA values) is currently only rudimentary.
When enabled (via the global options allow_NA_pred or allow_NA_crit),
any rows of data.train or data.test that contain missing values are removed
before fitting or predicting a model (using na.omit from stats).
See the documentation of each algorithm for more sophisticated ways of handling missing data.
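
The effect of removing incomplete cases can be sketched with stats::na.omit on a small made-up data frame. The column names and values below are assumptions for illustration and do not come from FFTrees.

```r
# Toy data (made up) with missing values in the criterion and in two cues.
data_train <- data.frame(
  crit = c(TRUE, FALSE, TRUE, NA, FALSE),
  age  = c(34, NA, 51, 46, 29),
  sex  = c("f", "m", "m", "f", NA)
)

# Drop every row that contains at least one NA value.
data_clean <- stats::na.omit(data_train)

# Number of cases lost by listwise deletion.
n_removed <- nrow(data_train) - nrow(data_clean)
```

Because every row with at least one NA is dropped, a few missing values spread over different columns can remove a large share of the cases.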