splitStratify builds a training set and a validation set through stratified
random sampling. This function uses the strata function from the
sampling package as well as the cut function from the base package. The latter
bins continuous data so that it can serve as a stratum during stratified random
sampling. See the descriptions of the bin and breaks arguments for the specifics of
how to apply binning, although the user might find it easier to bin
annotations beforehand. When applied to an ExprsMulti object, this function
stratifies subjects across all classes found in that dataset.
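
The idea described above (bin a continuous annotation with cut, then sample within each stratum) can be sketched in base R. This is only an illustration of the technique, not the internals of splitStratify: it does not call the sampling package, and the data frame annot, its column names, and the break points are all hypothetical.

```r
# Hypothetical annotation table; every name here is illustrative only.
set.seed(1)
annot <- data.frame(
  id    = sprintf("s%02d", 1:60),
  class = rep(c("Control", "Case"), each = 30),
  age   = round(runif(60, min = 20, max = 80))
)

# Bin the continuous column with cut() so it can act as a stratum.
annot$age.bin <- cut(annot$age, breaks = c(0, 40, 60, Inf))

# One stratum per observed combination of class label and age bin.
strata.id <- interaction(annot$class, annot$age.bin, drop = TRUE)

# Within each stratum, draw roughly percent.include percent of the rows.
percent.include <- 67
rows.by.stratum <- split(seq_len(nrow(annot)), strata.id)
train.rows <- unlist(lapply(rows.by.stratum, function(rows) {
  n <- round(length(rows) * percent.include / 100)
  rows[sample.int(length(rows), n)]
}), use.names = FALSE)

# Rows not drawn for training form the validation set.
train <- annot[train.rows, ]
valid <- annot[setdiff(seq_len(nrow(annot)), train.rows), ]
```

Because rounding happens per stratum, the overall training fraction is only approximately percent.include.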
splitStratify(object, percent.include = 67, colBy = NULL,
  bin = rep(FALSE, length(colBy)), breaks = rep(list(NA),
  length(colBy)), ...)

object: An ExprsArray object to split.

percent.include: Specifies the percentage of the total number of subjects to
include in the training set.

colBy: Specifies a vector of column names by which to stratify in
addition to the class label annotation. If colBy = NULL, random
sampling will occur across the class label annotation only.
For splitStratify only.

bin: A logical vector indicating whether to bin the respective
colBy column using cut (e.g., bin = c(FALSE, TRUE)).
For splitStratify only.

breaks: A list. Each element of the list should correspond to a
breaks argument passed to cut for the respective
colBy column. Set an element to NA when not binning
that colBy. For splitStratify only.

Returns a list of two ExprsArray objects.
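
To make the bin and breaks arguments more concrete, the sketch below shows the two common forms a breaks value can take when passed to cut, followed by one way the paired bin and breaks arguments might look for two colBy columns. The age vector and the break points are hypothetical, and the final two assignments only build the argument values; they do not call splitStratify.

```r
# Hypothetical continuous annotation.
age <- c(23, 35, 47, 58, 61, 79)

# A single number asks cut() for that many equal-width intervals.
by.count <- cut(age, breaks = 3)

# A numeric vector gives explicit cut points; by default each interval
# is open on the left and closed on the right.
by.points <- cut(age, breaks = c(0, 40, 60, Inf))
table(by.points)

# Argument values for two colBy columns, where the first column is
# already categorical (no binning) and the second is binned.
bin    <- c(FALSE, TRUE)
breaks <- list(NA, c(0, 40, 60, Inf))
```

Each element of breaks lines up by position with the corresponding element of bin and colBy, which is why the unbinned column still needs an NA placeholder.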