predict.divfor: Diversity Forest prediction

Description

Prediction with new data and a saved forest from divfor.

Usage

# S3 method for divfor
predict(
  object,
  data = NULL,
  predict.all = FALSE,
  num.trees = object$num.trees,
  type = "response",
  se.method = "infjack",
  quantiles = c(0.1, 0.5, 0.9),
  seed = NULL,
  num.threads = NULL,
  verbose = TRUE,
  ...
)

Value

Object of class divfor.prediction with elements

`predictions`	Predicted classes/values (only for classification and regression).
`unique.death.times`	Unique death times (only for survival).
`chf`	Estimated cumulative hazard function for each sample (only for survival).
`survival`	Estimated survival function for each sample (only for survival).
`num.trees`	Number of trees.
`num.independent.variables`	Number of independent variables.
`treetype`	Type of forest/tree. Classification, regression or survival.
`num.samples`	Number of samples.
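The elements listed above can be inspected directly on the returned object. The following is a minimal sketch, assuming the 'diversityForest' package is installed and using the built-in iris data purely for illustration; the object names (train.idx, iris.train, iris.test, model, pred) are not part of the package.

```r
library(diversityForest)

# Split iris into training and test data (illustrative choice of data set)
set.seed(1234)
train.idx <- sample(nrow(iris), round(2/3 * nrow(iris)))
iris.train <- iris[train.idx, ]
iris.test <- iris[-train.idx, ]

# Fit a small classification forest, then predict on the held-out rows
model <- divfor(Species ~ ., data = iris.train, num.trees = 20)
pred <- predict(model, data = iris.test)

# Inspect elements of the returned prediction object
class(pred)
pred$treetype
pred$num.trees
pred$num.samples

# Cross-tabulate observed against predicted classes
table(observed = iris.test$Species, predicted = pred$predictions)
```

For a survival forest, the elements unique.death.times, chf and survival would be inspected instead of predictions.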

Arguments

object: divfor object.
data: New test data of class data.frame or gwaa.data (GenABEL).
predict.all: Return individual predictions for each tree instead of aggregated predictions for all trees. Returns a matrix (sample x tree) for classification and regression, and a 3D array for probability estimation (sample x class x tree) and survival (sample x time x tree).
num.trees: Number of trees used for prediction. The first num.trees in the forest are used.
type: Type of prediction. One of 'response', 'se', 'terminalNodes', 'quantiles' with default 'response'. See the help pages of 'ranger' (version 0.11.0) for details.
se.method: Method to compute standard errors. One of 'jack', 'infjack' with default 'infjack'. Only applicable if type = 'se'. See the help pages of 'ranger' (version 0.11.0) for details.
quantiles: Vector of quantiles for quantile prediction. Set type = 'quantiles' to use.
seed: Random seed. Default is NULL, which generates the seed from R. Set to 0 to ignore the R seed. The seed is used in case of ties in classification mode.
num.threads: Number of threads. Default is number of CPUs available.
verbose: Verbose output on or off.
...: further arguments passed to or from other methods.
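As a hedged illustration of the predict.all and num.trees arguments described above, the sketch below requests per-tree predictions and then restricts prediction to the first 10 trees. It assumes the 'diversityForest' package is installed and that these arguments, which are adopted from 'ranger', behave as documented for diversity forests; the object names (train.idx, test, model, pred.all, pred.first10) are illustrative only.

```r
library(diversityForest)

set.seed(1234)
train.idx <- sample(nrow(iris), 100)
test <- iris[-train.idx, ]

# Fit a small classification forest on the training rows
model <- divfor(Species ~ ., data = iris[train.idx, ], num.trees = 20)

# Per-tree predictions: documented as a sample x tree matrix for classification
pred.all <- predict(model, data = test, predict.all = TRUE)
dim(pred.all$predictions)

# Aggregated predictions using only the first 10 trees of the forest
pred.first10 <- predict(model, data = test, num.trees = 10)
head(pred.first10$predictions)
```

Comparing pred.first10$predictions with predictions from the full forest is one way to gauge how sensitive the results are to the number of trees.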

Author

Marvin N. Wright

Details

This package is a fork of the R package 'ranger', which implements random forests efficiently in C++. More precisely, 'diversityForest' was written by modifying the code of 'ranger', version 0.11.0. Details on functionality not covered in the help pages of 'diversityForest' can therefore be found in the help pages of 'ranger' (version 0.11.0). The code in the example sections of divfor and tunedivfor can be used as a template for all common application scenarios with respect to classification, regression and survival prediction using univariable, binary splitting. Some function arguments adopted from the 'ranger' package may not be usable with diversity forests in the current package version.

References

Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <doi:10.1007/s42979-021-00920-1>.
Wright, M. N., Ziegler, A. (2017). ranger: A fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software 77:1-17, <doi:10.18637/jss.v077.i01>.
Wager, S., Hastie, T., Efron, B. (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife. Journal of Machine Learning Research 15:1625-1651.
Meinshausen, N. (2006). Quantile Regression Forests. Journal of Machine Learning Research 7:983-999.