Learn R Programming

diversityForest (version 0.5.0)

predict.divfor: Diversity Forest prediction

Description

Prediction with new data and a saved forest from divfor.

Usage

# S3 method for divfor
predict(
  object,
  data = NULL,
  predict.all = FALSE,
  num.trees = object$num.trees,
  type = "response",
  se.method = "infjack",
  quantiles = c(0.1, 0.5, 0.9),
  seed = NULL,
  num.threads = NULL,
  verbose = TRUE,
  ...
)

Value

Object of class divfor.prediction with elements

predictionsPredicted classes/values (only for classification and regression)
unique.death.timesUnique death times (only for survival).
chfEstimated cumulative hazard function for each sample (only for survival).
survivalEstimated survival function for each sample (only for survival).
num.treesNumber of trees.
num.independent.variablesNumber of independent variables.
treetypeType of forest/tree. Classification, regression or survival.
num.samplesNumber of samples.

Arguments

object

divfor object.

data

New test data of class data.frame or gwaa.data (GenABEL).

predict.all

Return individual predictions for each tree instead of aggregated predictions for all trees. Return a matrix (sample x tree) for classification and regression, a 3d array for probability estimation (sample x class x tree) and survival (sample x time x tree).

num.trees

Number of trees used for prediction. The first num.trees in the forest are used.

type

Type of prediction. One of 'response', 'se', 'terminalNodes', 'quantiles' with default 'response'. See below for details.

se.method

Method to compute standard errors. One of 'jack', 'infjack' with default 'infjack'. Only applicable if type = 'se'. See below for details.

quantiles

Vector of quantiles for quantile prediction. Set type = 'quantiles' to use.

seed

Random seed. Default is NULL, which generates the seed from R. Set to 0 to ignore the R seed. The seed is used in case of ties in classification mode.

num.threads

Number of threads. Default is number of CPUs available.

verbose

Verbose output on or off.

...

further arguments passed to or from other methods.

Author

Marvin N. Wright

Details

This package is a fork of the R package 'ranger' that implements random forests using an efficient C++ implementation. More precisely, 'diversityForest' was written by modifying the code of 'ranger', version 0.11.0. Therefore, details on further functionalities of the code that are not presented in the help pages of 'diversityForest' are found in the help pages of 'ranger' (version 0.11.0). The code in the example sections of divfor and tunedivfor can be used as a template for all common application scenarios with respect to classification, regression and survival prediction using univariable, binary splitting. Some function arguments adopted from the 'ranger' package may not be useable with diversity forests (for the current package version).

References

  • Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <tools:::Rd_expr_doi("10.1007/s42979-021-00920-1")>.

  • Wright, M. N., Ziegler, A. (2017). ranger: A fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software 77:1-17, <tools:::Rd_expr_doi("10.18637/jss.v077.i01")>.

  • Wager, S., Hastie T., & Efron, B. (2014). Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife. Journal of Machine Learning Research 15:1625-1651.

  • Meinshausen (2006). Quantile Regression Forests. Journal of Machine Learning Research 7:983-999.

See Also

divfor