ExprsArray objects
A collection of functions to build the training and validation sets.
splitSample(object, percent.include, ...)

splitStratify(object, percent.include, colBy = NULL,
bin = rep(FALSE, length(colBy)), breaks = rep(list(NA), length(colBy)),
...)
# S4 method for ExprsArray
splitSample(object, percent.include, ...)
# S4 method for ExprsArray
splitStratify(object, percent.include, colBy = NULL,
bin = rep(FALSE, length(colBy)), breaks = rep(list(NA), length(colBy)),
...)
object: Specifies the ExprsArray object to split.
percent.include: Specifies the percentage of the total number of subjects to include in the training set.
colBy: Specifies a vector of column names by which to stratify in addition to the class label annotation. If colBy = NULL, random sampling will occur across the class label annotation only. For splitStratify only.
bin: A logical vector indicating whether to bin the respective colBy column using cut (e.g., bin = c(FALSE, TRUE)). For splitStratify only.
breaks: A list. Each element of the list should correspond to a breaks argument passed to cut for the respective colBy column. Set an element to NA when not binning that colBy. For splitStratify only.
Returns a list of two ExprsArray objects.
splitSample builds a training and validation set by randomly sampling the subjects found within the ExprsArray object. Note that this method is not truly random. Instead, splitSample repeats the random sampling process until it settles on a solution such that both the training and validation sets contain at least one subject for each class label. If this method finds no solution after 10 iterations, the function throws an error. Set percent.include = 100 to skip random sampling and return a NULL validation set. Additional arguments (e.g., replace = TRUE) are passed along to sample. This method works well for all (i.e., binary and multi-class) ExprsArray objects.
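The retry behavior described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: split_sample_sketch, its max.iter argument, and the example labels are hypothetical names introduced here, and the sketch works on a plain vector of class labels rather than on an ExprsArray object.

```r
# Hypothetical helper (not part of the package): a base-R sketch of the
# "resample until every class appears in both sets" logic.
split_sample_sketch <- function(labels, percent.include, max.iter = 10, ...) {
  n <- length(labels)
  size <- round(n * percent.include / 100)
  if (size >= n) {
    # percent.include = 100: all subjects go to training, no validation set
    return(list(train = seq_len(n), valid = NULL))
  }
  classes <- unique(labels)
  for (i in seq_len(max.iter)) {
    # Extra arguments (e.g., replace = TRUE) are forwarded to sample()
    train <- sample(seq_len(n), size = size, ...)
    valid <- setdiff(seq_len(n), train)
    if (all(classes %in% labels[train]) && all(classes %in% labels[valid])) {
      return(list(train = train, valid = valid))
    }
  }
  stop("no valid split found after ", max.iter, " iterations")
}

set.seed(1)
labels <- rep(c("Control", "Case"), times = c(12, 8))
split <- split_sample_sketch(labels, percent.include = 67)
table(labels[split$train])
table(labels[split$valid])
```

Because a draw is rejected whenever a class is missing from either set, the accepted split is conditioned on class coverage, which is why the sampling is described as not truly random.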
splitStratify builds a training and validation set through a stratified random sampling process. This function uses the strata function from the sampling package as well as the cut function from the base package. The latter function provides a means by which to bin continuous data prior to stratified random sampling. We refer the user to the parameter descriptions to learn the specifics of how to apply binning, although the user might find it easier to instead bin annotations beforehand. When applied to an ExprsMulti object, this function stratifies subjects across all classes found in that dataset.
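As a rough illustration of binning followed by stratified sampling, the sketch below uses only base R (cut, interaction, split, and sample.int) in place of the strata function from the sampling package, so it shows the idea rather than the package's actual code path. The annotation table and its column names (defineCase, sex, age) are hypothetical.

```r
# Hypothetical annotation table; column names are illustrative only.
set.seed(1)
annot <- data.frame(
  defineCase = rep(c("Control", "Case"), each = 20),
  sex = rep(c("M", "F"), times = 20),
  age = rep(c(25, 35, 45, 55, 65), times = 8)
)

# Bin the continuous column with cut(); these breaks stand in for the
# matching element of a breaks list.
annot$ageBin <- cut(annot$age, breaks = c(0, 40, 100))

# One stratum per combination of class label, sex, and age bin.
strata.id <- interaction(annot$defineCase, annot$sex, annot$ageBin, drop = TRUE)

# Draw the same fraction of rows from every stratum.
percent.include <- 50
train <- unlist(lapply(split(seq_len(nrow(annot)), strata.id), function(idx) {
  idx[sample.int(length(idx), size = round(length(idx) * percent.include / 100))]
}), use.names = FALSE)
valid <- setdiff(seq_len(nrow(annot)), train)

# Compare stratum sizes in the full table and in the training set.
table(strata.id)
table(strata.id[train])
```

Binning the column beforehand, as done here with ageBin, mirrors the suggestion above that users may find it easier to bin annotations themselves and then stratify on the binned column without further binning.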