Usage
ranger(formula = NULL, data = NULL, num.trees = 500, mtry = NULL,
importance = "none", write.forest = FALSE, probability = FALSE,
min.node.size = NULL, replace = TRUE, splitrule = NULL,
split.select.weights = NULL, always.split.variables = NULL,
respect.unordered.factors = FALSE, scale.permutation.importance = FALSE,
num.threads = NULL, save.memory = FALSE, verbose = TRUE, seed = NULL,
dependent.variable.name = NULL, status.variable.name = NULL,
classification = NULL)
Arguments
formula
Object of class formula or character describing the model to fit.
data
Training data of class data.frame, matrix or gwaa.data (GenABEL).
num.trees
Number of trees.
mtry
Number of variables to possibly split at in each node. Default is the (rounded down) square root of the number of variables.
importance
Variable importance mode, one of 'none', 'impurity', 'permutation'. The 'impurity' measure is the Gini index for classification and the variance of the responses for regression.
write.forest
Save ranger.forest object, needed for prediction.
probability
Grow a probability forest as in Malley et al. (2012).
min.node.size
Minimal node size. Default 1 for classification, 5 for regression, 3 for survival, and 10 for probability.
replace
Sample with replacement.
splitrule
Splitting rule, survival only. Either "logrank" or "C"; the default is "logrank".
split.select.weights
Numeric vector with weights between 0 and 1, representing the probability that each variable is selected for splitting.
always.split.variables
Character vector with the names of variables that are always tried for splitting.
respect.unordered.factors
Regard unordered factor covariates as unordered categorical variables. If FALSE, all factors are treated as ordered.
scale.permutation.importance
Scale permutation importance by standard error as in Breiman (2001). Only applicable if the permutation variable importance mode is selected.
num.threads
Number of threads. Default is number of CPUs available.
save.memory
Use memory saving (but slower) splitting mode. No effect for GWAS data.
verbose
Verbose output on or off.
seed
Random seed. Default is NULL, which generates the seed from R.
dependent.variable.name
Name of dependent variable, needed if no formula given. For survival forests this is the time variable.
status.variable.name
Name of status variable, only applicable to survival data and needed if no formula given. Use 1 for event and 0 for censoring.
classification
Only needed if data is a matrix. Set to TRUE to grow a classification forest.
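The arguments above can be combined as in the following minimal sketch. It assumes the ranger package is installed and uses the iris data set that ships with R; the seed value and the object names (rf, pred, rf_imp, rf_name) are arbitrary choices for illustration only. Defaults may differ between releases, so write.forest is set explicitly wherever prediction is needed.

```r
library(ranger)

# Classification forest via the formula interface.
# write.forest = TRUE keeps the forest object so that predict() can be used.
rf <- ranger(Species ~ ., data = iris, num.trees = 500,
             write.forest = TRUE, seed = 42)

# Predict on the training data and cross-tabulate against the observed classes.
pred <- predict(rf, data = iris)
print(table(observed = iris$Species, predicted = pred$predictions))

# Permutation importance, scaled by its standard error.
# scale.permutation.importance only has an effect with importance = "permutation".
rf_imp <- ranger(Species ~ ., data = iris, importance = "permutation",
                 scale.permutation.importance = TRUE, seed = 42)
print(rf_imp$variable.importance)

# Same model without a formula: name the response column directly.
rf_name <- ranger(data = iris, dependent.variable.name = "Species", seed = 42)
print(rf_name)
```

With four predictors in iris, the default mtry is floor(sqrt(4)) = 2. Fixing seed makes the runs reproducible; leaving it NULL draws the seed from R's random number generator.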