- formula
Object of class formula or character describing the model to fit. Interaction terms supported only for numerical variables.
- dependent.variable.name
Name of outcome variable, needed if no formula given.
- data
Training data of class data.frame, matrix, dgCMatrix (Matrix) or gwaa.data (GenABEL).
- num.trees
Number of trees. Default is 20000.
- num.cand.trees
Number of random candidate trees to generate for each tree root. Default is 500.
- probability
Grow a probability forest as in Malley et al. (2012). (NOTE: Currently, only probability forests are implemented; this will change in the next version.)
- importance
Variable importance mode, either 'unity' (unity VIM) or 'none'.
- prop.best.splits
Related to the unity VIM. The default value should generally not be modified by the user. When calculating the unity VIM, only the top prop.best.splits \(\times\) 100% of the splits (those with the highest split criterion values weighted by node size) are considered for each variable. The default value is 0.01, meaning that only the top 1% of splits are used. Small values are recommended, but the value should not be so small that a variable is left with too few splits for a reliable unity VIM computation.
- min.node.size.root
Minimal node size in the tree roots. Default is 10 irrespective of the outcome type.
- min.node.size
Minimal node size. Default is 1 for classification and 5 for probability.
- max.depth.root
Maximal depth of the tree roots. Default value is 3 and should generally not be modified by the user. Larger values can be associated with worse predictive performance for some datasets.
- max.depth
Maximal tree depth. A value of NULL or 0 (the default) corresponds to unlimited depth, 1 to tree stumps (1 split per tree). Must be at least as large as max.depth.root.
- prop.var.root
Proportion of variables randomly sampled for constructing each tree root. Default is the square root of the number of variables divided by the number of variables. Consequently, by default, each tree root considers a random subset of variables whose size equals the (rounded up) square root of the total number of variables. An exception is made for datasets with more than 100 variables, where the default for prop.var.root is set to 0.1. See the 'Details' section below for explanation.
- mtry.sprout
Number of randomly sampled variables to possibly split at in each node of the tree sprouts (i.e., the branches of the trees beyond the tree roots). Default is the (rounded down) square root of the number of variables.
- replace
Sample with replacement. Default is FALSE.
- sample.fraction
Fraction of observations to sample for each tree. Default is 1 for sampling with replacement and 0.7 for sampling without replacement.
- case.weights
Weights for sampling of training observations. Observations with larger weights will be selected with higher probability in the bootstrap (or subsampled) samples for the trees.
- class.weights
Weights for the outcome classes (in order of the factor levels) in the splitting rule (cost sensitive learning). Classification and probability prediction only. For classification the weights are also applied in the majority vote in terminal nodes.
- inbag
Manually set observations per tree. List of size num.trees, containing inbag counts for each observation. Can be used for stratified sampling.
- oob.error
Compute OOB prediction error. Set to FALSE to save computation time.
- num.threads
Number of threads. Default is the number of CPUs available.
- write.forest
Save unityfor.forest object, required for prediction. Set to FALSE to reduce memory usage if no prediction is intended.
- verbose
Show computation status and estimated runtime.
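
The defaults described for prop.var.root and mtry.sprout, and the list format described for inbag, can be sketched in base R as follows. This is only an illustration of the documented rules, not code from the package: the helper names (default_prop_var_root, default_mtry_sprout, make_inbag) and the toy outcome y are hypothetical, and using 0/1 counts per stratum is one assumed way to realize stratified sampling without replacement.

```r
# Hypothetical helpers that restate the documented defaults.
# They are NOT functions exported by the package.

# prop.var.root: sqrt(p) / p, except 0.1 when there are more than 100 variables.
default_prop_var_root <- function(p) {
  if (p > 100) 0.1 else sqrt(p) / p
}

# mtry.sprout: square root of the number of variables, rounded down.
default_mtry_sprout <- function(p) floor(sqrt(p))

# inbag: a list of length num.trees; each element holds one inbag count
# per observation. Here each class of y is subsampled separately
# (assumed stratification scheme), so the counts are 0 or 1.
make_inbag <- function(y, frac) {
  counts <- integer(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    n_take <- round(frac * length(idx))
    take <- idx[sample.int(length(idx), size = n_take)]
    counts[take] <- 1L
  }
  counts
}

set.seed(1)
y <- factor(rep(c("a", "b"), times = c(30, 10)))  # toy outcome, 40 observations
num_trees <- 5                                    # small value for illustration only
inbag <- replicate(num_trees, make_inbag(y, frac = 0.7), simplify = FALSE)
```

A list built this way has one integer vector per tree, each as long as the training data, which matches the description of inbag above. Whether the package expects exactly this layout should be checked against its own examples before use.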