Note: a newer version (3.9.0) of this package is available.

MachineShop: Machine Learning Models and Tools for R

Description

MachineShop is a meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Support is provided for predictive modeling of numerical, categorical, and censored time-to-event outcomes and for resample (bootstrap, cross-validation, and split training-test sets) estimation of model performance. This vignette introduces the package interface with a survival data analysis example, followed by supported methods of variable specification; applications to other response variable types; available performance metrics, resampling techniques, and graphical and tabular summaries; and modeling strategies.

Features

  • Unified and concise interface for model fitting, prediction, and performance assessment.
  • Current support for 49 established models from 26 R packages.
  • Dynamic model parameters.
  • Ensemble modeling with stacked regression and super learners.
  • Modeling of response variable types: binary factors, multi-class nominal and ordinal factors, numeric vectors and matrices, and censored time-to-event survival.
  • Model specification with traditional formulas, design matrices, and flexible pre-processing recipes.
  • Resample estimation of predictive performance, including cross-validation, bootstrap resampling, and split training-test set validation.
  • Parallel execution of resampling algorithms.
  • Choices of performance metrics: accuracy, areas under the ROC and precision-recall curves, Brier score, coefficient of determination (R²), concordance index, cross entropy, F score, Gini coefficient, unweighted and weighted Cohen’s kappa, mean absolute error, mean squared error, mean squared log error, positive and negative predictive values, precision and recall, and sensitivity and specificity.
  • Graphical and tabular performance summaries: calibration curves, confusion matrices, partial dependence plots, performance curves, lift curves, and variable importance.
  • Model tuning over automatically generated grids of parameter values and randomly sampled grid points.
  • Model selection and comparisons for any combination of models and model parameter values.
  • User-definable models and performance metrics.
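
As a rough illustration of the unified interface described in the list above, the following sketch fits a model, predicts on held-out rows, and computes performance metrics. The mtcars data, the train/test split, the seed, and the variable names are assumptions chosen for illustration only; the calls to fit, predict, and performance use argument patterns that may differ between package versions.

```r
library(MachineShop)

# Illustrative data and split (assumptions, not taken from the package docs)
set.seed(123)
train_rows <- sample(nrow(mtcars), 22)
train <- mtcars[train_rows, ]
test <- mtcars[-train_rows, ]

# Fit a linear model through the common fit() interface
model_fit <- fit(mpg ~ wt + hp, data = train, model = LMModel)

# Predict the numeric response for the held-out rows
pred <- predict(model_fit, newdata = test)

# Compare observed and predicted values with the default metrics
performance(test$mpg, pred)
```

Swapping LMModel for another model constructor that supports numeric responses (for example, GLMModel) should leave the remaining lines unchanged, which is the main point of the shared interface.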

Getting Started

Installation

# Current release from CRAN
install.packages("MachineShop")

# Development version from GitHub
# install.packages("devtools")
devtools::install_github("brian-j-smith/MachineShop")

# Development version with vignettes
devtools::install_github("brian-j-smith/MachineShop", build_vignettes = TRUE)

Documentation

Once installed, the following R commands will load the package and display its help system documentation. Online documentation and examples are available at the MachineShop website.

library(MachineShop)

# Package help summary
?MachineShop

# Vignette
RShowDoc("Introduction", package = "MachineShop")
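
In addition to the help pages, the installed package can be queried for the models and metrics it provides. The sketch below assumes that modelinfo() and metricinfo(), called without arguments, return lists named by model and metric respectively; the variable names are illustrative.

```r
library(MachineShop)

# Names of the models known to the installed version
model_names <- names(modelinfo())
head(model_names)

# Names of the available performance metrics
metric_names <- names(metricinfo())
head(metric_names)
```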

Monthly Downloads: 646

Version: 1.3.0

License: GPL-3

Maintainer: Brian Smith

Last Published: April 23rd, 2019

Functions in MachineShop (1.3.0)

  • GAMBoostModel: Gradient Boosting with Additive Models
  • MLControl: Resampling Controls
  • GLMModel: Generalized Linear Model
  • GLMBoostModel: Gradient Boosting with Linear Models
  • NaiveBayesModel: Naive Bayes Classifier Model
  • PLSModel: Partial Least Squares Model
  • lift: Model Lift
  • metricinfo: Display Performance Metric Information
  • expand.model: Model Expansion Over a Grid of Tuning Parameters
  • .: Quote Operator
  • MLMetric: MLMetric Class Constructor
  • RPartModel: Recursive Partitioning and Regression Tree Models
  • response: Extract Response Variable
  • resample: Resample Estimation of Model Performance
  • RandomForestModel: Random Forest Model
  • Grid: Tuning Grid Control
  • ModelFrame: ModelFrame Class
  • GLMNetModel: GLM Lasso or Elasticnet Model
  • NNetModel: Neural Network Model
  • SurvMatrix: SurvMatrix Class Constructor
  • SurvRegModel: Parametric Survival Model
  • calibration: Model Calibration
  • ICHomes: Iowa City Home Sales Dataset
  • LMModel: Linear Models
  • GBMModel: Generalized Boosted Regression Model
  • MDAModel: Mixture Discriminant Analysis Model
  • POLRModel: Ordered Logistic or Probit Regression Model
  • FDAModel: Flexible and Penalized Discriminant Analysis Models
  • LDAModel: Linear Discriminant Analysis Model
  • LARSModel: Least Angle Regression, Lasso and Infinitesimal Forward Stagewise Models
  • confusion: Confusion Matrix
  • StackedModel: Stacked Regression Model
  • KNNModel: Weighted k-Nearest Neighbor Model
  • summary: Model Performance Summary
  • MLModel: MLModel Class Constructor
  • diff: Model Performance Differences
  • SuperModel: Super Learner Model
  • dependence: Partial Dependence
  • MachineShop-package: MachineShop: Machine Learning Models and Tools
  • QDAModel: Quadratic Discriminant Analysis Model
  • metrics: Performance Metrics
  • modelinfo: Display Model Information
  • t.test: Paired t-Tests for Model Comparisons
  • TreeModel: Classification and Regression Tree Models
  • SVMModel: Support Vector Machine Models
  • RangerModel: Fast Random Forest Model
  • extract: Extract Parts of an Object
  • fit: Model Fitting
  • XGBModel: Extreme Gradient Boosting Models
  • performance: Model Performance Metrics
  • plot: Model Performance Plots
  • performance_curve: Performance Curves
  • predict: Model Prediction
  • tune: Model Tuning and Selection
  • varimp: Variable Importance
  • BlackBoostModel: Gradient Boosting with Regression Trees
  • BARTMachineModel: Bayesian Additive Regression Trees Model
  • AdaBagModel: Bagging with Classification Trees
  • AdaBoostModel: Boosting with Classification Trees
  • BARTModel: Bayesian Additive Regression Trees Model
  • C50Model: C5.0 Decision Trees and Rule-Based Model
  • EarthModel: Multivariate Adaptive Regression Splines Model
  • CoxModel: Proportional Hazards Regression Model
  • CForestModel: Conditional Random Forest Model
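
Several of the functions indexed above are meant to be combined. As a minimal sketch, resample can be paired with a resampling control and summary to estimate out-of-sample performance. The mtcars data, the formula, the seed, and the choice of five folds are illustrative assumptions, and CVControl is assumed to be one of the control constructors documented under MLControl.

```r
library(MachineShop)

# Illustrative 5-fold cross-validation of a linear model on mtcars
set.seed(123)
res <- resample(mpg ~ wt + hp, data = mtcars, model = LMModel,
                control = CVControl(folds = 5))

# Summary statistics of the resampled performance metrics
summary(res)
```

The same res object is the kind of input that the plot, diff, and t.test entries in the index are described as operating on when comparing models.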