tuneTrain is an automatic function that splits the data, tunes and trains a model, and makes predictions. It returns a list containing a model object, a data frame and a plot.
tuneTrain(
data,
y,
p = 0.7,
method = method,
parallelComputing = FALSE,
length = 10,
control = "repeatedcv",
number = 10,
repeats = 10,
process = c("center", "scale"),
summary = multiClassSummary,
positive,
...
)
data: object of class "data.frame" with target variable and predictor variables.
y: character. Name of the target variable.
p: numeric. Proportion of data to be used for training. Default: 0.7.
method: character. Type of model to use for classification or regression.
parallelComputing: logical. Indicates whether to use parallel processing. Default: FALSE.
length: integer. Number of values to output for each tuning parameter. If search = "random" is passed to trainControl through ..., this becomes the maximum number of tuning parameter combinations that are generated by the random search. Default: 10.
control: character. Resampling method to use. Choices include: "boot", "boot632", "optimism_boot", "boot_all", "cv", "repeatedcv", "LOOCV", "LGOCV", "none", "oob", "timeslice", "adaptive_cv", "adaptive_boot", or "adaptive_LGOCV". Default: "repeatedcv". See train for specific details on the resampling methods.
number: integer. Number of cross-validation folds or number of resampling iterations. Default: 10.
repeats: integer. Number of complete sets of folds to compute (that is, the number of repetitions) for repeated k-fold cross-validation if "repeatedcv" is chosen as the resampling method in control. Default: 10.
process: character. Defines the pre-processing transformation of predictor variables to be done. Options are: "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica", or "spatialSign". See preProcess for specific details on each pre-processing transformation. Default: c('center', 'scale').
summary: expression. Computes performance metrics across resamples. For numeric y, the mean squared error and R-squared are calculated. For factor y, the overall accuracy and Kappa are calculated. See trainControl and defaultSummary for details on specification and summary options. Default: multiClassSummary.
positive: character. The positive class for the target variable if y is factor. Usually, it is the first level of the factor.
...: additional arguments to be passed to createDataPartition, trainControl and train functions in the package caret.
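The arguments above map naturally onto three caret calls. The following is a minimal sketch of that workflow, not the actual body of tuneTrain: it assumes that p goes to createDataPartition, that control, number and repeats go to trainControl, and that method, length (as tuneLength) and process (as preProcess) go to train. The built-in iris data and the smaller resampling values are used only to keep the illustration quick.

```r
library(caret)

set.seed(42)  # make the split and the resampling reproducible

# Stratified split on the target; p = 0.7 keeps about 70% of rows for training
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# Resampling scheme (assumed mapping: control -> method, number, repeats)
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

# Tune and train (assumed mapping: length -> tuneLength, process -> preProcess)
fit <- train(
  Species ~ ., data = train_set,
  method = "knn",
  tuneLength = 5,
  preProcess = c("center", "scale"),
  trControl = ctrl
)

# Class probabilities for the held-out rows, one column per class
probs <- predict(fit, newdata = test_set, type = "prob")
head(probs)
```

Passing search = "random" to trainControl in this sketch would make tuneLength act as the maximum number of random parameter combinations, as described for length.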
A list object with results from tuning and training the model selected in method, together with predictions and class probabilities. The training and test data sets obtained from splitting the data are also returned.
If y is factor, class probabilities are calculated for each class. If y is numeric, predicted values are calculated.
If y is factor, a ROC curve is created. If y is numeric, a plot of residuals versus predicted values is created.
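As an illustration of the classification plot, a ROC curve and its AUC can be produced with ggplot2 and plotROC roughly as follows. This is a sketch on simulated scores, not the plotting code used inside tuneTrain; the data frame roc_df and its columns d and m are hypothetical names chosen for the example.

```r
library(ggplot2)
library(plotROC)

set.seed(1)

# Simulated data for illustration only:
# d is the true class (0/1), m is the score assigned to the positive class
roc_df <- data.frame(
  d = rep(c(0, 1), each = 50),
  m = c(rnorm(50, mean = 0.35, sd = 0.2), rnorm(50, mean = 0.65, sd = 0.2))
)

# geom_roc() expects the aesthetics d (disease status) and m (marker)
roc_plot <- ggplot(roc_df, aes(d = d, m = m)) + geom_roc()

# calc_auc() takes the ggplot object and returns a data frame with an AUC column
auc <- calc_auc(roc_plot)$AUC
auc
```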
tuneTrain relies on packages caret, ggplot2 and plotROC to perform the modelling and plotting.
Types of classification and regression models available for use with tuneTrain can be found using names(getModelInfo()). The results given depend on the type of model used.
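For example, the available model names can be listed and checked before choosing a value for method. This short sketch only uses caret's getModelInfo; the variable name model_names is arbitrary.

```r
library(caret)

# Names of all models that caret knows how to tune and train
model_names <- names(getModelInfo())

# Check that the models used in the examples are available
c("knn", "nnet") %in% model_names

# Peek at the first few names
head(model_names)
```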
For classification models, class probabilities and a ROC curve are given in the results. For regression models, predictions and a plot of residuals versus predicted values are given. Before calling tuneTrain, convert y to a factor for classification or to numeric for regression.
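The conversion of y can be done in base R before the call. The data frame and column names below (df, status, yield) are hypothetical and serve only to show the two cases.

```r
# Hypothetical data: a class label stored as character and a numeric
# response that was read in as text
df <- data.frame(
  status = c("R", "S", "R", "S"),
  yield  = c("2.1", "3.4", "2.8", "3.0"),
  stringsAsFactors = FALSE
)

# Classification: make the target a factor; listing "R" first makes it
# the first level, which is usually the positive class
df$status <- factor(df$status, levels = c("R", "S"))

# Regression: make the target numeric
df$yield <- as.numeric(df$yield)

str(df)
```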
createDataPartition,
trainControl,
train,
predict.train,
ggplot,
geom_roc,
calc_auc
# NOT RUN {
if(interactive()){
data(septoriaDurumWC)
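  # k-nearest neighbours, with 'R' as the positive class
  knn.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'knn', positive = 'R')
  # neural network, with 'R' as the positive class
  nnet.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'nnet', positive = 'R')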
}
# }