
rfinterval (version 1.0.0)

rfinterval: Prediction Intervals for Random Forests

Description

rfinterval constructs prediction intervals for random forest predictions, using the 'ranger' package as a fast random forest implementation.

Usage

rfinterval(formula = NULL, train_data = NULL, test_data = NULL,
  method = c("oob", "split-conformal", "quantreg"), alpha = 0.1,
  symmetry = TRUE, seed = NULL, params_ranger = NULL)

Arguments

formula

Object of class formula or character describing the model to fit. Interaction terms supported only for numerical variables.

train_data

Training data of class data.frame, matrix, or dgCMatrix (Matrix).

test_data

Test data of class data.frame, matrix, or dgCMatrix (Matrix).

method

Method for constructing prediction interval. If method = "oob", compute the out-of-bag prediction intervals; if method = "split-conformal", compute the split conformal prediction interval; if method = "quantreg", use quantile regression forest to compute prediction intervals.
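
For intuition about the "split-conformal" option, here is a minimal base-R sketch of the split conformal recipe described by Lei et al. (2018): fit a model on one half of the training data, compute absolute residuals on the other half, and use a quantile of those residuals as the interval half-width. It uses lm() as a stand-in for the random forest so that it runs without extra packages. The helper split_conformal_lm and the simulated data are hypothetical, and this is not rfinterval's internal code.

```r
# Split conformal prediction intervals, with lm() standing in for the forest.
# split_conformal_lm is a hypothetical helper, not part of rfinterval.
split_conformal_lm <- function(formula, train_data, test_data,
                               alpha = 0.1, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(train_data)

  # Randomly split the training data into a fitting half and a calibration half
  fit_idx <- sample(n, floor(n / 2))
  fit_set <- train_data[fit_idx, , drop = FALSE]
  cal_set <- train_data[-fit_idx, , drop = FALSE]

  model <- lm(formula, data = fit_set)

  # Sorted absolute residuals on the calibration half
  response <- all.vars(formula)[1]
  abs_res <- sort(abs(cal_set[[response]] - predict(model, newdata = cal_set)))

  # Conformal quantile: the ceiling((m + 1) * (1 - alpha))-th smallest residual
  # (capped at m here to keep the sketch finite for very small alpha)
  m <- length(abs_res)
  k <- min(ceiling((m + 1) * (1 - alpha)), m)
  half_width <- abs_res[k]

  pred <- predict(model, newdata = test_data)
  data.frame(lo = pred - half_width, up = pred + half_width)
}

# Simulated data (an assumption for illustration only)
set.seed(1)
make_data <- function(n) {
  x <- runif(n)
  data.frame(x = x, y = 2 * x + rnorm(n))
}
train <- make_data(400)
test <- make_data(400)

sc <- split_conformal_lm(y ~ x, train, test, alpha = 0.1, seed = 42)
sc_coverage <- mean(sc$lo < test$y & sc$up > test$y)  # expected to be near 0.9
```

Because the random split changes which observations are used for calibration, fixing the seed makes the resulting intervals reproducible.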

alpha

Miscoverage level; the intervals have nominal coverage 1 - alpha. For example, alpha = 0.05 gives a 95% prediction interval.

symmetry

TRUE to construct symmetric out-of-bag prediction intervals, FALSE otherwise. Used only when method = "oob".
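
The difference between the two settings can be sketched from a vector of out-of-bag prediction errors. The sketch assumes, following the description in Zhang et al. (2019), that the symmetric interval uses one quantile of the absolute errors while the asymmetric interval uses separate lower and upper quantiles of the signed errors. The helper oob_interval_sketch is hypothetical, and the errors are simulated rather than taken from a fitted forest.

```r
# oob_interval_sketch is a hypothetical helper, not rfinterval's internal code.
# pred:      point predictions for new observations
# oob_error: out-of-bag prediction errors (observed minus OOB prediction)
oob_interval_sketch <- function(pred, oob_error, alpha = 0.1, symmetry = TRUE) {
  if (symmetry) {
    # One half-width from the (1 - alpha) quantile of the absolute errors
    d <- unname(quantile(abs(oob_error), probs = 1 - alpha))
    data.frame(lo = pred - d, up = pred + d)
  } else {
    # Separate lower and upper offsets from the signed errors
    q <- unname(quantile(oob_error, probs = c(alpha / 2, 1 - alpha / 2)))
    data.frame(lo = pred + q[1], up = pred + q[2])
  }
}

# Right-skewed simulated errors make the two variants differ visibly
set.seed(1)
err <- rexp(1000) - 1
sym <- oob_interval_sketch(pred = 10, oob_error = err, symmetry = TRUE)
asym <- oob_interval_sketch(pred = 10, oob_error = err, symmetry = FALSE)
```

With skewed errors, the asymmetric interval extends further on the side of the long tail, whereas the symmetric interval is centered on the point prediction.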

seed

Seed (only for method = "split-conformal")

params_ranger

List of further parameters that should be passed to ranger. See ranger for possible parameters.
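
As a small illustration, a parameter list might look like the following. The names num.trees, mtry, and min.node.size are standard ranger arguments; the values are arbitrary, and the commented call shows assumed usage rather than a tested recipe.

```r
# Arguments forwarded to ranger(); the values are arbitrary illustrations.
params_ranger <- list(num.trees = 1000, mtry = 3, min.node.size = 5)

# Assumed usage (requires the rfinterval package, so left commented out):
# output <- rfinterval(y ~ ., train_data = train_data, test_data = test_data,
#                      method = "oob", alpha = 0.1,
#                      params_ranger = params_ranger)
```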

Value

oob_interval

Out-of-bag prediction intervals

sc_interval

Split-conformal prediction intervals

quantreg_interval

Quantile regression forest prediction intervals

alpha

Miscoverage level used for the prediction intervals (nominal coverage is 1 - alpha)

testPred

Random forest prediction for test set

train_data

Training data

test_data

Test data

References

Haozhe Zhang, Joshua Zimmerman, Dan Nettleton, and Dan Nordman. (2019). "Random Forest Prediction Intervals." The American Statistician. doi: 10.1080/00031305.2019.1585288.

Haozhe Zhang. (2019). "Topics in Functional Data Analysis and Machine Learning Predictive Inference." Ph.D. Dissertations. Iowa State University Digital Repository. 17929.

Lei, J., Max G'Sell, Alessandro Rinaldo, Ryan J. Tibshirani, and Larry Wasserman. "Distribution-free predictive inference for regression." Journal of the American Statistical Association 113, no. 523 (2018): 1094-1111.

Meinshausen, Nicolai. "Quantile regression forests." Journal of Machine Learning Research 7 (2006): 983-999.

Leo Breiman. (2001). Random Forests. Machine Learning 45(1), 5-32.

Examples

# NOT RUN {
library(rfinterval)

# Simulate independent training and test sets
train_data <- sim_data(n = 500, p = 8)
test_data <- sim_data(n = 500, p = 8)

# Compute 90% prediction intervals (alpha = 0.1) with all three methods
output <- rfinterval(y ~ ., train_data = train_data, test_data = test_data,
                     method = c("oob", "split-conformal", "quantreg"),
                     symmetry = TRUE, alpha = 0.1)

# Empirical coverage of each interval type on the test responses
y <- test_data$y
mean(output$oob_interval$lo < y & output$oob_interval$up > y)
mean(output$sc_interval$lo < y & output$sc_interval$up > y)
mean(output$quantreg_interval$lo < y & output$quantreg_interval$up > y)
# }
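
Coverage is usually read together with interval width, since wider intervals cover more easily. The following sketch adds a small summary helper for that purpose. The function interval_summary is hypothetical; it assumes an interval is a data frame with columns lo and up, as accessed in the example above, and it is demonstrated on hand-made toy intervals rather than rfinterval output.

```r
# interval_summary is a hypothetical helper, not part of rfinterval.
# interval: data frame with lower bounds in `lo` and upper bounds in `up`
# y:        observed responses, one per row of `interval`
interval_summary <- function(interval, y) {
  c(coverage = mean(interval$lo < y & interval$up > y),
    mean_width = mean(interval$up - interval$lo))
}

# Toy check with hand-made intervals: three of the four responses are covered
toy <- data.frame(lo = c(0, 1, 2, 3), up = c(2, 3, 4, 5))
toy_summary <- interval_summary(toy, y = c(1, 2, 5, 4))
```

Under the same assumption about column names, a call such as interval_summary(output$oob_interval, test_data$y) would summarize one of the interval types from the example.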
