exprso (version 0.1.8)

split: split ExprsArray objects

Description

A collection of functions to build the training and validation sets.

Usage

splitSample(object, percent.include, ...)

splitStratify(object, percent.include, colBy = NULL, bin = rep(FALSE, length(colBy)), breaks = rep(list(NA), length(colBy)), ...)

# S4 method for ExprsArray
splitSample(object, percent.include, ...)

# S4 method for ExprsArray
splitStratify(object, percent.include, colBy = NULL, bin = rep(FALSE, length(colBy)), breaks = rep(list(NA), length(colBy)), ...)

Arguments

object

Specifies the ExprsArray object to split.

percent.include

Specifies the percentage of subjects to include in the training set, given on a 0 to 100 scale (e.g., percent.include = 100 places every subject in the training set).

...

For splitSample: additional arguments passed along to sample. For splitStratify: additional arguments passed along to cut.

colBy

Specifies a vector of annotation column names by which to stratify, in addition to the class label annotation. If colBy = NULL, stratified sampling occurs across the class label annotation only. For splitStratify only.

bin

A logical vector indicating whether to bin the respective colBy column using cut (e.g., bin = c(FALSE, TRUE)). For splitStratify only.

breaks

A list. Each element of the list should correspond to the breaks argument passed to cut for the respective colBy column. Set an element to NA when not binning that colBy column. For splitStratify only.
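The colBy, bin, and breaks arguments are parallel: position i of bin and breaks describes how to treat column i of colBy. The sketch below shows that alignment with base R only. The column names sex and age and the break points are hypothetical, and the splitStratify call appears only as a comment because it needs a real ExprsArray object.

```r
# Hypothetical annotation columns: "sex" is categorical, "age" is continuous.
colBy  <- c("sex", "age")
bin    <- c(FALSE, TRUE)                # bin only the second column
breaks <- list(NA, c(0, 40, 60, 100))   # NA for the column that is not binned

# All three arguments must have one entry per colBy column.
stopifnot(length(bin) == length(colBy), length(breaks) == length(colBy))

# What cut does with the breaks supplied for "age":
age.bin <- cut(c(25, 45, 70), breaks = breaks[[2]])
as.character(age.bin)
# "(0,40]" "(40,60]" "(60,100]"

# Following the Usage section, the corresponding call would look like:
# splitStratify(object, percent.include = 67, colBy = colBy,
#               bin = bin, breaks = breaks)
```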

Value

Returns a list of two ExprsArray objects: a training set and a validation set.

Details

splitSample builds a training and validation set by randomly sampling the subjects found within the ExprsArray object. Note that this method is not truly random. Instead, splitSample repeats the random sampling until both the training and validation sets contain at least one subject for each class label. If no such split is found after 10 iterations, the function throws an error. Set percent.include = 100 to skip random sampling and return a NULL validation set. Additional arguments (e.g., replace = TRUE) are passed along to sample. This method works for all ExprsArray objects, binary and multi-class alike.
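The retry behaviour described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: split_with_retry is a hypothetical helper that works on a plain vector of class labels and returns row indices rather than ExprsArray objects.

```r
# Hypothetical helper (not part of exprso): resample until every class label
# appears in both the training and the validation indices.
split_with_retry <- function(labels, percent.include, max.iter = 10, ...) {
  n <- length(labels)
  if (percent.include == 100) {
    # Mirror the documented shortcut: no sampling, NULL validation set.
    return(list(train = seq_len(n), valid = NULL))
  }
  size <- round(n * percent.include / 100)
  classes <- unique(labels)
  for (i in seq_len(max.iter)) {
    train <- sample(seq_len(n), size = size, ...)  # extra args go to sample
    valid <- setdiff(seq_len(n), train)
    if (all(classes %in% labels[train]) && all(classes %in% labels[valid])) {
      return(list(train = train, valid = valid))
    }
  }
  stop("no valid split found after ", max.iter, " iterations")
}

set.seed(1)
labels <- rep(c("Control", "Case"), times = c(12, 8))
idx <- split_with_retry(labels, percent.include = 67)
table(labels[idx$train])  # class counts in the training indices
table(labels[idx$valid])  # class counts in the validation indices
```

Because a split that leaves a class out of either set is rejected, the accepted splits are not a uniform draw from all possible splits, which is the sense in which the method is "not truly random".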

splitStratify builds a training and validation set by stratified random sampling. It uses the strata function from the sampling package and the cut function from the base package; cut provides a means to bin continuous data before stratified random sampling. See the descriptions of the bin and breaks arguments for how to apply binning, although the user might find it easier to bin annotations beforehand. When applied to an ExprsMulti object, this function stratifies subjects across all classes found in that dataset.
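As a rough illustration of binning followed by stratified sampling, the sketch below uses base R only. It swaps in split and sample.int for sampling::strata, so it shows the idea rather than the package's code. The data frame annot and its columns label and age are made up for the example.

```r
set.seed(1)

# Made-up annotations: a class label plus one continuous column.
annot <- data.frame(
  label = rep(c("Control", "Case"), each = 20),
  age   = round(runif(40, min = 20, max = 80))
)

# Bin the continuous column with cut before stratifying.
annot$age.bin <- cut(annot$age, breaks = c(20, 50, 80), include.lowest = TRUE)

# One stratum per observed combination of class label and age bin.
strata <- interaction(annot$label, annot$age.bin, drop = TRUE)

# Sample the same percentage of rows from every stratum.
percent.include <- 50
train <- unlist(lapply(split(seq_len(nrow(annot)), strata), function(rows) {
  size <- round(length(rows) * percent.include / 100)
  rows[sample.int(length(rows), size)]
}), use.names = FALSE)
valid <- setdiff(seq_len(nrow(annot)), train)

table(strata[train])  # per-stratum counts in the training rows
table(strata[valid])  # per-stratum counts in the validation rows
```

Binning beforehand, as the text suggests, amounts to computing a column like age.bin once and then stratifying on it directly with bin = FALSE.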

See Also

ExprsArray-class