gensemble (version 1.0.1)

mksampsize: Generate sample size information for use with gensemble

Description

Translates the sampsize argument of gensemble into the form used internally.

Usage

mksampsize(Y, sampsize = NULL, proportion = FALSE)

Arguments

Y

The response vector.

sampsize

The desired sample size(s). Can be NULL, a single value, a vector or a list. See the details section for more information.

proportion

A logical indicating whether the values in sampsize represent proportions rather than absolute counts.

Value

If Y is a factor, returns a list giving, for each class, the number of data points to sample from that class. Otherwise, returns a single value.

Details

For regression, sampsize indicates how much of the underlying data should be used in the bagged model. It should be either NULL or a single value. If it is NULL, roughly 80% of the data is used for each training iteration.

For classification, the internals of gensemble require a list giving each class and the size of the sample to draw from that class. If sampsize is NULL, this list is built using the levels present in Y, with roughly 80% of each class sampled.
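The two cases above can be illustrated with a minimal sketch in base R. The helper sketch_sampsize and its prop argument are hypothetical names introduced here, not part of gensemble, and the use of round() is an assumption: the documentation says only "roughly 80%", so the values returned by mksampsize itself may differ.

```r
# Hypothetical helper, NOT the gensemble implementation. It only mirrors the
# behaviour described in the Details text, under assumed rounding.
sketch_sampsize <- function(Y, prop = 0.8) {
  if (is.factor(Y)) {
    # Classification: one entry per level actually present in Y.
    counts <- table(droplevels(Y))
    # A single proportion is recycled across classes; a vector gives one
    # proportion per class. round() is an assumption made for this sketch.
    sizes <- round(as.vector(counts) * prop)
    as.list(setNames(sizes, names(counts)))
  } else {
    # Regression: a single sample size for the whole response vector.
    round(length(Y) * prop)
  }
}

# Regression response with 31 observations: 25 under the rounding assumed here.
sketch_sampsize(trees[, 3])

# Factor response with 50 observations per class: a named list, one size per class.
sketch_sampsize(iris[, 5])
```

The sketch returns a plain number for a numeric response and a named list for a factor, which matches the shapes described in the Value section. It does not cover absolute sample sizes or sampsize supplied as a list.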

See Also

gensemble

Examples

# NOT RUN {

#regression
Y <- trees[,3]
#use roughly 80% for each training iteration
mksampsize(Y)
#the same thing using proportion
mksampsize(Y, 0.8, TRUE)

#classification
Y <- iris[,5]
#use roughly 80% of each class
mksampsize(Y)
#specify the size of each class in absolute terms
mksampsize(Y, list(setosa=20, versicolor=30, virginica=40))
#use about 70% of each class
mksampsize(Y, 0.7, proportion=TRUE)
#specify the proportion for each class
mksampsize(Y, c(0.5, 0.6, 0.7), proportion=TRUE)
# }