MachineShop-package: MachineShop: Machine Learning Models and Tools

Description

Meta-package for statistical and machine learning with a common interface for model fitting, prediction, performance assessment, and presentation of results. Supports predictive modeling of numerical, categorical, and censored time-to-event outcomes and resample (bootstrap and cross-validation) estimation of model performance.

Details

MachineShop provides a unified interface to machine learning and statistical models implemented in other packages. The table below lists the supported models and the types of response variables each one accepts. Additional model information can be obtained with the `modelinfo` function.

Model Objects	Categorical	Continuous	Survival
`AdaBagModel`	f
`AdaBoostModel`	f
`BARTModel`	f	n	S
`BARTMachineModel`	b	n
`BlackBoostModel`	b	n	S
`C50Model`	f
`CForestModel`	f	n	S
`CoxModel`			S
`EarthModel`	f	n
`FDAModel`	f
`GAMBoostModel`	b	n	S
`GBMModel`	f	n	S
`GLMBoostModel`	b	n	S
`GLMModel`	b	n
`GLMNetModel`	f	m,n	S
`KNNModel`	f,o	n
`LARSModel`		n
`LDAModel`	f
`LMModel`	f	m,n
`MDAModel`	f
`NaiveBayesModel`	f
`NNetModel`	f	n
`PDAModel`	f
`PLSModel`	f	n
`POLRModel`	o
`QDAModel`	f
`RandomForestModel`	f	n
`RangerModel`	f	n	S
`RPartModel`	f	n	S
`StackedModel`	f,o	m,n	S
`SuperModel`	f,o	m,n	S
`SurvRegModel`			S
`SVMModel`	f	n
`TreeModel`	f	n
`XGBModel`	f	n

Categorical: b = binary, f = factor, o = ordered; Continuous: m = matrix, n = numeric; Survival: S = Surv
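
As a rough illustration of how the table can be queried programmatically, the sketch below asks `modelinfo` for the models that accept a given response type. Passing a zero-length prototype such as `factor(0)` or `numeric(0)` is assumed to be supported by the installed version of the package; the variable names are illustrative only.

```r
library(MachineShop)

# Names of models that accept a factor response
# (assumes modelinfo() takes a response prototype such as factor(0))
factor_models <- names(modelinfo(factor(0)))

# Names of models that accept a numeric response
numeric_models <- names(modelinfo(numeric(0)))

# Information on a single model, requested by its constructor
info <- modelinfo(GLMModel)

print(numeric_models)
```

The returned names should correspond to the rows of the table that have an entry in the matching column, although the exact set depends on the package version.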

The following standard model training, prediction, performance assessment, and tuning functions are available for the model objects.

Training:

`fit`	Model Fitting
`resample`	Resample Estimation of Model Performance
`tune`	Model Tuning and Selection

Prediction:

`predict`	Model Prediction
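
A minimal sketch of the training and prediction steps, assuming the formula interface to `fit` and the built-in `mtcars` data. The formula, the choice of `GLMModel`, and the 24/8 row split are illustrative assumptions rather than recommendations.

```r
library(MachineShop)

# Illustrative split of a built-in dataset into training and test rows
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

# Fit a model object to the training data;
# GLMModel is listed in the table above as supporting numeric responses
model_fit <- fit(mpg ~ wt + hp, data = train, model = GLMModel)

# Predict the response for the held-out rows
pred <- predict(model_fit, newdata = test)
print(pred)
```

Any other model object from the table that supports a numeric response could be substituted for `GLMModel`, provided the package that implements it is installed.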

Performance Assessment:

`calibration`	Model Calibration
`confusion`	Confusion Matrix
`dependence`	Partial Dependence
`diff`	Model Performance Differences
`lift`	Lift Curves
`performance`	Model Performance Metrics
`varimp`	Variable Importance
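
The assessment functions operate on fitted models or on observed and predicted responses. The sketch below assumes that `performance` accepts observed values followed by predicted values and returns a default set of metrics for the response type; the data split and model are the same illustrative choices as in any other example here, not part of the package.

```r
library(MachineShop)

# Illustrative training and test rows from a built-in dataset
train <- mtcars[1:24, ]
test  <- mtcars[25:32, ]

model_fit <- fit(mpg ~ wt + hp, data = train, model = GLMModel)

# Observed and predicted responses on the test rows
obs  <- test$mpg
pred <- predict(model_fit, newdata = test)

# Default performance metrics for a numeric response
# (assumes the observed-then-predicted argument order)
perf <- performance(obs, pred)
print(perf)
```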

Methods for resample estimation include

`BootControl`	Simple Bootstrap
`CVControl`	Repeated K-Fold Cross-Validation
`OOBControl`	Out-of-Bootstrap
`SplitControl`	Split Training-Testing
`TrainControl`	Training Resubstitution

Tabular and graphical summaries of modeling results can be obtained with

`summary`
`plot`
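
To show how a resampling control and a summary fit together, the following sketch estimates performance with repeated cross-validation. It assumes `resample` accepts a formula, data, model, and `control` argument, and that `CVControl` takes `folds` and `repeats`; the values 5 and 2 and the seed are arbitrary choices for illustration.

```r
library(MachineShop)

# Fix the random seed so the fold assignments are reproducible
set.seed(123)

# 5-fold cross-validation repeated twice (illustrative settings)
res <- resample(
  mpg ~ wt + hp,
  data    = mtcars,
  model   = GLMModel,
  control = CVControl(folds = 5, repeats = 2)
)

# Tabular summary of the resampled performance metrics
res_summary <- summary(res)
print(res_summary)
```

Calling `plot(res)` on the same object should give the graphical counterpart of this summary. Swapping `CVControl` for another control from the list above changes the resampling scheme without changing the rest of the call.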

Custom metrics and models can be created with the `MLMetric` and `MLModel` constructors.
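
A hedged sketch of a custom metric follows. It assumes that `MLMetric` wraps a function of `observed` and `predicted` together with `name`, `label`, and `maximize` arguments, and that the resulting object can be called like an ordinary function. The metric itself (`median_ae`, a median absolute error) is a hypothetical example and not part of the package.

```r
library(MachineShop)

# Hypothetical custom metric: median absolute error (lower is better)
median_ae <- MLMetric(
  function(observed, predicted, ...) {
    median(abs(observed - predicted))
  },
  name = "median_ae",
  label = "Median Absolute Error",
  maximize = FALSE
)

# Small made-up vectors for illustration only
obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 2, 2, 6)

# Apply the metric directly to observed and predicted values
value <- median_ae(obs, pred)
print(value)
```

Setting `maximize = FALSE` is intended to tell model selection routines that smaller values are better; whether and how that flag is used depends on the package version.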

See Also