Meta-package for statistical and machine learning with a common interface for model fitting, prediction, performance assessment, and presentation of results. Supports predictive modeling of numerical, categorical, and censored time-to-event outcomes and resample (bootstrap and cross-validation) estimation of model performance.
MachineShop provides a unified interface to machine learning and
statistical models implemented in other packages. Supported models are
summarized in the table below according to the types of response variables
with which each can be used. Additional model information can be obtained
with the modelinfo function.
Model Objects     | Categorical | Continuous | Survival |
AdaBagModel       | f           |            |          |
AdaBoostModel     | f           |            |          |
BARTModel         | f           | n          | S        |
BARTMachineModel  | b           | n          |          |
BlackBoostModel   | b           | n          | S        |
C50Model          | f           |            |          |
CForestModel      | f           | n          | S        |
CoxModel          |             |            | S        |
EarthModel        | f           | n          |          |
FDAModel          | f           |            |          |
GAMBoostModel     | b           | n          | S        |
GBMModel          | f           | n          | S        |
GLMBoostModel     | b           | n          | S        |
GLMModel          | b           | n          |          |
GLMNetModel       | f           | m,n        | S        |
KNNModel          | f,o         | n          |          |
LARSModel         |             | n          |          |
LDAModel          | f           |            |          |
LMModel           | f           | m,n        |          |
MDAModel          | f           |            |          |
NaiveBayesModel   | f           |            |          |
NNetModel         | f           | n          |          |
PDAModel          | f           |            |          |
PLSModel          | f           | n          |          |
POLRModel         | o           |            |          |
QDAModel          | f           |            |          |
RandomForestModel | f           | n          |          |
RangerModel       | f           | n          | S        |
RPartModel        | f           | n          | S        |
StackedModel      | f,o         | m,n        | S        |
SuperModel        | f,o         | m,n        | S        |
SurvRegModel      |             |            | S        |
SVMModel          | f           | n          |          |
TreeModel         | f           | n          |          |
XGBModel          | f           | n          |          |
Categorical: b = binary, f = factor, o = ordered; Continuous: m = matrix, n = numeric; Survival: S = Surv
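As a brief illustration of querying model information from R, the sketch below lists the available models and those compatible with a factor response. It assumes the MachineShop package is installed, that modelinfo returns a list named by model constructor, and that an observed response such as factor(0) may be passed to filter models by supported type; consult ?modelinfo for the behavior of the installed version.

```r
library(MachineShop)

# Information on all available models
# (assumed: a list whose names are the model constructor names)
all_info <- modelinfo()
names(all_info)

# Models that can be fit to a factor response
# (assumption: a response object may be passed to filter by type)
factor_info <- modelinfo(factor(0))
names(factor_info)

# Details for a single model constructor
modelinfo(LMModel)
```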
The following standard functions for model training, prediction, performance assessment, and tuning are available for the model objects.
Training:
fit | Model Fitting |
resample | Resample Estimation of Model Performance |
tune | Model Tuning and Selection |
Prediction:
predict | Model Prediction |
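A minimal sketch of the fit and predict workflow follows. It assumes MachineShop is installed and uses LMModel with the built-in mtcars data purely for illustration, since that model relies only on base R; the formula and variables are arbitrary choices, not a recommendation.

```r
library(MachineShop)

# Fit a linear model to a numeric response using the built-in mtcars data
model_fit <- fit(mpg ~ wt + hp, data = mtcars, model = LMModel)

# Predict the response for new observations (here, the first five rows)
pred <- predict(model_fit, newdata = head(mtcars, 5))
pred
```

Any other model object from the table above could be substituted for LMModel, provided it supports the response type and its underlying package is installed.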
Performance Assessment:
calibration | Model Calibration |
confusion | Confusion Matrix |
dependence | Partial Dependence |
diff | Model Performance Differences |
lift | Lift Curves |
performance | Model Performance Metrics |
varimp | Variable Importance |
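As one possible use of the performance function, the sketch below computes metrics from observed and predicted values. It assumes MachineShop is installed and that performance accepts an observed numeric vector followed by predictions and applies the package's default metrics for numeric responses; the exact metrics returned may differ between versions.

```r
library(MachineShop)

# Fit a linear model and predict on the training data
model_fit <- fit(mpg ~ wt + hp, data = mtcars, model = LMModel)
pred <- predict(model_fit, newdata = mtcars)

# Default metrics for a numeric response, computed from observed and
# predicted values (resubstitution estimates, so likely optimistic)
perf <- performance(mtcars$mpg, pred)
perf
```

Because the predictions here are made on the same data used for fitting, the values serve only to show the call pattern rather than to estimate out-of-sample performance.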
Methods for resample estimation include:
BootControl | Simple Bootstrap |
CVControl | Repeated K-Fold Cross-Validation |
OOBControl | Out-of-Bootstrap |
SplitControl | Split Training-Testing |
TrainControl | Training Resubstitution |
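The control constructors are passed to resample to select the estimation method. The sketch below assumes MachineShop is installed and that CVControl takes folds and repeats arguments and BootControl takes a samples argument; the model, data, and argument values are illustrative only.

```r
library(MachineShop)

set.seed(123)  # resampling is random; fix the seed for repeatability

# 5-fold cross-validation repeated twice
cv_res <- resample(mpg ~ wt + hp, data = mtcars, model = LMModel,
                   control = CVControl(folds = 5, repeats = 2))
summary(cv_res)

# Simple bootstrap with 25 resamples
boot_res <- resample(mpg ~ wt + hp, data = mtcars, model = LMModel,
                     control = BootControl(samples = 25))
summary(boot_res)
```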
Tabular and graphical summaries of modeling results can be obtained with
Custom metrics and models can be created with the MLMetric and MLModel constructors.
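A hedged sketch of defining a custom metric is shown below. It assumes MachineShop is installed, that MLMetric wraps a function of observed and predicted values and accepts name, label, and maximize arguments, and that the resulting object can be called like an ordinary function. The metric itself (mean absolute percentage error) and the names mape_fun and mape are hypothetical, chosen only for illustration.

```r
library(MachineShop)

# Hypothetical custom metric: mean absolute percentage error (MAPE)
mape_fun <- function(observed, predicted, ...) {
  mean(abs((observed - predicted) / observed))
}

# Wrap the function as a metric; smaller values are better,
# so maximize is set to FALSE
mape <- MLMetric(mape_fun, name = "mape",
                 label = "Mean Absolute Percentage Error",
                 maximize = FALSE)

# Apply the metric to predictions from a fitted model
model_fit <- fit(mpg ~ wt + hp, data = mtcars, model = LMModel)
pred <- predict(model_fit, newdata = mtcars)
mape(mtcars$mpg, pred)
```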