modelingSummary is an automatic function for modeling data. It returns a data frame containing the performance metrics obtained with five machine learning algorithms: KNN, SVM, RF, NNET, and Bcart. The function is built on the splitData, tuneTrain, predict, and getMetrics functions.
modelingSummary(
data,
y,
p = 0.7,
length = 10,
control = "repeatedcv",
number = 10,
repeats = 10,
process = c("center", "scale"),
summary = multiClassSummary,
positive,
parallelComputing = FALSE,
classtype,
...
)
data: object of class "data.frame" with target variable and predictor variables.
y: character. Target variable.
p: numeric. Proportion of data to be used for training. Default: 0.7.
length: integer. Number of values to output for each tuning parameter. If search = "random" is passed to trainControl through ..., this becomes the maximum number of tuning parameter combinations that are generated by the random search. Default: 10.
control: character. Resampling method to use. Choices include: "boot", "boot632", "optimism_boot", "boot_all", "cv", "repeatedcv", "LOOCV", "LGOCV", "none", "oob", "timeslice", "adaptive_cv", "adaptive_boot", or "adaptive_LGOCV". Default: "repeatedcv". See train for specific details on the resampling methods.
number: integer. Number of cross-validation folds or number of resampling iterations. Default: 10.
repeats: integer. Number of complete sets of folds to compute for repeated k-fold cross-validation, used when "repeatedcv" is chosen as the resampling method in control. Default: 10.
process: character. Defines the pre-processing transformation of predictor variables to be done. Options are: "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica", or "spatialSign". See preProcess for specific details on each pre-processing transformation. Default: c("center", "scale").
summary: expression. Computes performance metrics across resamples. For numeric y, the root mean squared error and R-squared are calculated. For factor y, the overall accuracy and Kappa are calculated. See trainControl and defaultSummary for details on specification and summary options. Default: multiClassSummary.
positive: character. The positive class for the target variable if y is a factor. Usually, it is the first level of the factor.
parallelComputing: logical. Indicates whether to use parallel processing. Default: FALSE.
classtype: integer. Number of classes of the target trait.
...: additional arguments to be passed to the createDataPartition, trainControl, and train functions in the package caret.
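The arguments above appear to correspond to inputs of caret's createDataPartition, trainControl, and train. The following is a minimal sketch of that mapping written directly against caret, not the package's actual implementation. The iris data, the "knn" method, and the reduced number and repeats values are assumptions chosen only to keep the run short.

```r
library(caret)

set.seed(123)
data(iris)

# p: proportion of rows assigned to the training set
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# control, number, repeats, and summary are passed to trainControl()
ctrl <- trainControl(
  method = "repeatedcv",
  number = 5,   # the documented default is 10; reduced here for speed
  repeats = 2,  # the documented default is 10; reduced here for speed
  summaryFunction = multiClassSummary
)

# length corresponds to tuneLength, process to preProcess
fit <- train(
  Species ~ ., data = train_set,
  method = "knn",            # stand-in model (assumption)
  tuneLength = 10,
  preProcess = c("center", "scale"),
  trControl = ctrl
)

# Tuning parameter value selected by resampling
print(fit$bestTune)
```

Lowering number and repeats shortens the run considerably, at the cost of noisier resampling estimates.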
A data frame containing the metrics of the modeling for five machine learning algorithms: KNN, SVM, RF, NNET, and Bcart.
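To illustrate what a per-model metrics table can look like, here is a rough sketch that trains two models with caret, predicts on held-out data, and collects a few statistics from confusionMatrix into one data frame. The column names, the two stand-in methods ("knn" and "rpart"), and the iris subset are all assumptions; the columns actually returned by modelingSummary may differ.

```r
library(caret)

set.seed(123)
# Two-class subset of iris so that a positive class can be specified
dat <- droplevels(iris[iris$Species != "setosa", ])
idx <- createDataPartition(dat$Species, p = 0.7, list = FALSE)
train_set <- dat[idx, ]
test_set  <- dat[-idx, ]

ctrl <- trainControl(method = "cv", number = 5)

# Stand-in methods (assumption), labelled for the output table
methods <- c(KNN = "knn", CART = "rpart")

metrics <- do.call(rbind, lapply(names(methods), function(nm) {
  fit  <- train(Species ~ ., data = train_set, method = methods[[nm]],
                trControl = ctrl, preProcess = c("center", "scale"))
  pred <- predict(fit, newdata = test_set)
  # confusionMatrix may need the e1071 package to be installed
  cm   <- confusionMatrix(pred, test_set$Species, positive = "virginica")
  data.frame(
    model       = nm,
    accuracy    = unname(cm$overall["Accuracy"]),
    kappa       = unname(cm$overall["Kappa"]),
    sensitivity = unname(cm$byClass["Sensitivity"]),
    specificity = unname(cm$byClass["Specificity"])
  )
}))

print(metrics)
```

Each row holds the held-out performance of one model, which makes the candidates easy to compare side by side.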
tuneTrain relies on package caret to perform the modeling.
Types of classification and regression models available for use with tuneTrain can be found using names(getModelInfo()). The results given depend on the type of model used.
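As a short illustration of the lookup mentioned above, the sketch below lists the models known to caret and checks whether some plausible identifiers for the five algorithms are among them. The identifiers "svmRadial" and "treebag" are guesses at what SVM and Bcart refer to, not confirmed choices of this package.

```r
library(caret)

# All model identifiers that caret knows about
available <- names(getModelInfo())
length(available)

# Plausible caret identifiers for KNN, SVM, RF, NNET, and Bcart (assumption)
candidates <- c("knn", "svmRadial", "rf", "nnet", "treebag")
candidates %in% available

# Tuning parameters that caret exposes for one of the models
modelLookup("knn")
```

modelLookup also reports whether a model supports regression, classification, and class probabilities, which helps when choosing a summary function.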
createDataPartition,
trainControl,
train,
predict.train,
confusionMatrix
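# NOT RUN {
if (interactive()) {
  # Load the package and the example dataset shipped with it
  library(icardaFIGSr)
  data(septoriaDurumWC)
  # Two-class target "ST_S" with "R" as the positive class
  models <- modelingSummary(data = septoriaDurumWC, y = "ST_S",
                            positive = "R", classtype = 2)
}
# }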