predictML: Parallel-Voting prediction of multiple machine learning models

Description

The data are sampled, the machine learning algorithm is run on these samples in parallel on your own machine, and the resulting models vote on a prediction. This returns predictions much faster than the regular machine learning algorithm, and possibly more accurate ones.
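
To illustrate the idea, here is a minimal sketch in base R. It is not the package's implementation: it fits a glm on five random 80% samples one after another with lapply (the package runs such fits in parallel) and then takes a majority vote. The two-class subset of iris, the number of models, and all variable names are assumptions chosen for illustration.

```r
set.seed(42)

# Two-class subset of iris so that a binomial glm applies (illustrative choice)
two_class <- droplevels(iris[iris$Species != "setosa", ])

# Fit one model per random 80% sample; lapply stands in for the parallel step
n_models <- 5
fits <- lapply(seq_len(n_models), function(i) {
  idx <- sample(nrow(two_class), size = floor(0.8 * nrow(two_class)))
  glm(Species ~ Sepal.Length + Petal.Width, data = two_class[idx, ],
      family = binomial)
})

# One column of predicted probabilities per model (probability of "virginica")
probs <- sapply(fits, predict, newdata = two_class, type = "response")

# Majority vote: a row is "virginica" when more than half of the models say so
share_virginica <- rowMeans(probs > 0.5)
voted <- factor(ifelse(share_virginica > 0.5, "virginica", "versicolor"),
                levels = levels(two_class$Species))

# Compare the voted classes with the true classes
table(voted, two_class$Species)
```

An odd number of models is used so that a two-class vote cannot end in a tie.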

Usage

predictML(predictCall, MLPackage, combine = "raw")

Arguments

predictCall

Your call to a predict function, given as a character string. The object passed to predict must be of class parallelML.

MLPackage

A character string naming the package that provides your machine learning algorithm. It is needed because every core must load the package.

combine

A character string: "raw" when you want to return a list of predictions, "vote" when you want to return the class upon which most models agree and "avg" when you want to return the average of numeric predictions (probabilities).

Value

Depending on combine: a list of prediction vectors, one per model ("raw"); a vector of the classes on which most models agree ("vote"); or the average of the numeric predictions ("avg").
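
The "vote" and "avg" options can be pictured with a small base R sketch. It only shows what majority voting and averaging mean for a list of per-model predictions; it is not the package's internal code, and the prediction values and variable names below are made up for illustration.

```r
# Hypothetical class predictions of three models for four observations
class_preds <- list(
  model1 = c("setosa", "versicolor", "virginica", "virginica"),
  model2 = c("setosa", "virginica",  "virginica", "versicolor"),
  model3 = c("setosa", "versicolor", "virginica", "virginica")
)

# Rows are observations, columns are models
pred_matrix <- do.call(cbind, class_preds)

# "vote": for each observation, keep the class predicted most often
majority_vote <- apply(pred_matrix, 1, function(row) names(which.max(table(row))))
majority_vote
# "setosa" "versicolor" "virginica" "virginica"

# Hypothetical numeric predictions (probabilities) for two observations
prob_preds <- list(
  model1 = c(0.2, 0.9),
  model2 = c(0.4, 0.7),
  model3 = c(0.3, 0.8)
)

# "avg": element-wise mean across models
avg_pred <- Reduce(`+`, prob_preds) / length(prob_preds)
avg_pred
```

With "raw", the list of per-model predictions (here class_preds or prob_preds) would be returned as is, leaving the combination step to the caller.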

Examples

## Not run: 
# # Load the library which provides svm
# library(e1071)
# 
# # Create your data
# data(iris)
# 
# # Create a model
# parSvmModel <- parallelML("svm(formula = Species ~ ., data = iris)",
#                      "e1071",samplingSize = 0.8)
#                      
# # Get prediction
# parSvmPred   <- predictML("predict(parSvmModel,newdata=iris)",
#                           "e1071","vote")
# 
# # Check the quality
# table(parSvmPred,iris$Species)
# ## End(Not run)
## Not run: 
# # Load the library which provides rpart
# library(rpart)
# 
# # Create your data
# data("magicData")
# 
# # Create a model
# parTreeModel  <- parallelML("rpart(formula = V11 ~ ., data = trainData[,-1])",
#                             "rpart",samplingSize = 0.8)
# 
# # Get prediction
# parTrainTreePred  <- predictML("predict(parTreeModel,newdata=trainData[,-1],type='class')",
#                                "rpart","vote")
# parTestTreePred  <- predictML("predict(parTreeModel,newdata=testData[,-1],type='class')",
#                               "rpart","vote")
# 
# # Check the quality
# table(parTrainTreePred,trainData$V11)
# table(parTestTreePred,testData$V11)
# ## End(Not run)
## Not run: 
# # Load the library which provides svm
# library(e1071)
# 
# # Create your data
# data(iris)
# data("magicData")
# 
# # Get numeric predictions of a Support Vector Machine
# parsvmmodel   <- parallelML("svm(formula = Species ~ ., data = iris,probability=TRUE)",
#                             "e1071",samplingSize = 0.8,
#                             underSample = TRUE, underSampleTarget = "versicolor")
# parsvmpred    <- predictML("predict(parsvmmodel,newdata=iris,probability=TRUE)",
#                            "e1071","avg")
# 
# # Get numeric predictions of a generalized linear model
# parglmmodel   <- parallelML("glm(formula = V11 ~ ., data = trainData[,-1], 
#                                  family = binomial(link='logit'))","stats",samplingSize = 0.8)
# 
# parglmpred   <- predictML("predict(parglmmodel,newdata=trainData[,-1],type='response')",
#                           "stats","avg")
# ## End(Not run)
