
emil (version 1.1-6)

batch.model: Perform modeling

Description

This function is the core of the framework, carrying out most of the work. It fits and evaluates models according to a resampling scheme, and extracts variable importance scores. Note that the typical user does not have to call this function directly, but should use fit, tune or evaluate.modeling instead.
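To make "fits and evaluates models according to a resampling scheme" concrete, the following is a minimal base-R sketch of repeated cross-validation. It is not emil's implementation: the helpers `make_folds`, `fit_centroids` and `predict_centroids` are hypothetical names introduced for illustration, and a simple nearest-centroid classifier stands in for a real modeling procedure.

```r
# Illustrative sketch only: base R, not emil's actual code.
set.seed(1)
x <- iris[-5]
y <- iris$Species

# Hypothetical helper: for each of `nrep` repetitions, randomly assign every
# observation to one of `nfold` test folds.
make_folds <- function(n, nfold, nrep) {
  lapply(seq_len(nrep), function(i) sample(rep(seq_len(nfold), length.out = n)))
}

# Hypothetical stand-in for a modeling procedure: nearest-centroid classifier.
fit_centroids <- function(x, y) {
  # Returns a features-by-classes matrix of class means.
  sapply(levels(y), function(cl) colMeans(x[y == cl, , drop = FALSE]))
}
predict_centroids <- function(centroids, x) {
  # Squared Euclidean distance from every observation to every centroid.
  d <- apply(centroids, 2, function(ctr) rowSums(sweep(as.matrix(x), 2, ctr)^2))
  factor(colnames(centroids)[max.col(-d, ties.method = "first")],
         levels = colnames(centroids))
}

nfold <- 4
folds <- make_folds(nrow(x), nfold = nfold, nrep = 2)

# Fit on the training part of each fold, evaluate on the held-out part.
error_rate <- sapply(folds, function(fold_id) {
  sapply(seq_len(nfold), function(k) {
    test <- fold_id == k
    model <- fit_centroids(x[!test, ], y[!test])
    pred <- predict_centroids(model, x[test, ])
    mean(pred != y[test])
  })
})
# error_rate is a 4 x 2 matrix: one row per fold, one column per repetition.
```

batch.model additionally handles pre-processing, tuning, variable importance and multiple procedures, none of which this sketch attempts.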

Usage

batch.model(proc, x, y, resample = emil::resample("crossval", y, nfold = 2,
  nrep = 2), pre.process = pre.split, .save = list(fit = FALSE, pred =
  FALSE, vimp = FALSE, tuning = FALSE), .parallel.cores = 1,
  .checkpoint.dir = NULL, .return.errors = .parallel.cores > 1,
  .verbose = FALSE)

Arguments

proc
Modeling procedure, or list of modeling procedures, as produced by modeling.procedure.
x
Dataset, observations as rows and descriptors as columns.
y
Response vector.
resample
The test subsets used for parameter tuning. Leave blank to randomly generate a resampling scheme of the same kind as is used by batch.model to assess the performance of the whole modeling procedure.
pre.process
Function that performs pre-processing and splits dataset into fitting and test subsets.
.save
What aspects of the modeling to perform and return to the user.
.parallel.cores
Number of CPU cores to use for parallel computation. The current implementation is based on mcMap, which unfortunately does not work on Windows systems. It can however be re-implemented by the user fairly easi
.checkpoint.dir
Directory to save intermediate results to. If set the computation can be restarted with minimal loss of results.
.return.errors
If FALSE the entire modeling is aborted upon an error. If TRUE the modeling of the particular fold is aborted and the error message is returned instead of its results.
.verbose
Whether to print an activity log. Set to -1 to also suppress output generated from the procedure's functions.
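The difference between the two .return.errors settings can be sketched in base R with tryCatch. This is an assumption about the general pattern, not a copy of emil's internals: `model_fold` and `run_folds` are hypothetical names, and the sketch stores the condition object where the documentation speaks of the error message.

```r
# Hypothetical stand-in for the work done on one fold; it fails on fold 2.
model_fold <- function(k) {
  if (k == 2) stop("modeling failed in fold ", k)
  list(fold = k)
}

# Hypothetical driver mimicking the two behaviours described for .return.errors.
run_folds <- function(folds, return_errors) {
  lapply(folds, function(k) {
    if (return_errors) {
      # Keep going: store the error in place of this fold's result.
      tryCatch(model_fold(k), error = function(e) e)
    } else {
      # Let the error propagate, which aborts the whole run.
      model_fold(k)
    }
  })
}

# With return_errors = TRUE the other folds still complete.
res <- run_folds(1:3, return_errors = TRUE)
failed <- vapply(res, inherits, logical(1), what = "error")
failed_message <- conditionMessage(res[[which(failed)]])

# With return_errors = FALSE the first error stops everything.
aborted <- inherits(try(run_folds(1:3, return_errors = FALSE), silent = TRUE),
                    "try-error")
```

Checking each fold's result with inherits(..., "error") afterwards is one way to find the folds that need to be rerun.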

Value

  • A list tree where the top level corresponds to folds (in case of multiple folds), the next level corresponds to the modeling procedures (in case of multiple procedures), and the final level holds the elements selected by the .save parameter.
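A nested list of this shape (folds, then procedures, then saved elements) can be traversed with nested sapply calls. The sketch below uses a hand-built mock, not real batch.model output: the fold and procedure names, the `pred` element and the `truth` list are all placeholders chosen for illustration, and actual element names may differ.

```r
# Mock result tree: 2 folds x 2 procedures. NOT actual emil output.
mock <- list(
  fold1 = list(procA = list(pred = c("a", "b", "a")),
               procB = list(pred = c("a", "a", "a"))),
  fold2 = list(procA = list(pred = c("b", "b", "a")),
               procB = list(pred = c("b", "a", "a")))
)

# Placeholder true labels for the test observations of each fold.
truth <- list(fold1 = c("a", "b", "a"),
              fold2 = c("b", "b", "b"))

# Outer loop over folds, inner loop over procedures.
err <- sapply(names(mock), function(f) {
  sapply(mock[[f]], function(p) mean(p$pred != truth[[f]]))
})
# err is a matrix with one row per procedure and one column per fold.
```

Averaging over columns with rowMeans(err) would then give one summary value per procedure.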

See Also

emil, modeling.procedure

Examples

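library(emil)

# Predict iris species from the four measurements using LDA
x <- iris[-5]
y <- iris$Species
proc <- modeling.procedure("lda")
# 4-fold cross-validation repeated 4 times
cv <- resample("crossval", y, nfold = 4, nrep = 4)
perf <- batch.model(proc, x, y, cv, .save = list(pred = TRUE))

# Parallelization on Windows
library(parallel)
cl <- makePSOCKcluster(2)
clusterEvalQ(cl, library(emil))
clusterExport(cl, c("proc", "x", "y"))
perf <- parLapply(cl, cv, function(fold)
    batch.model(proc, x, y, resample = fold))
# Shut down the worker processes when done
stopCluster(cl)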
