ggRandomForests is a utility package for randomForestSRC
objects using the ggplot2 package.
The package is designed to simplify the graphical analysis and exploration
of random forests.

The randomForestSRC package provides a unified treatment of Breiman's random
forests (Breiman 2001) for a variety of data settings. Regression and
classification forests are grown when the response is numeric or categorical
(factor), while survival and competing risk forests (Ishwaran et al. 2008, 2012)
are grown for right-censored survival data. Support for unsupervised and
multivariate random forests has also recently been added.
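
As a minimal sketch of how the response type selects the kind of forest, the example below grows one forest of each of the three kinds mentioned above. It assumes the randomForestSRC package is installed. The choice of the mtcars and iris data sets, the veteran data shipped with randomForestSRC, and ntree = 100 are illustrative assumptions, not part of the original text.

```r
library(randomForestSRC)

set.seed(42)  # forests are random; fix the seed for repeatability

# Numeric response: a regression forest is grown
rf_reg <- rfsrc(mpg ~ ., data = mtcars, ntree = 100)

# Factor response: a classification forest is grown
rf_cls <- rfsrc(Species ~ ., data = iris, ntree = 100)

# Right-censored response: a survival forest is grown
data(veteran, package = "randomForestSRC")
rf_surv <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# The family element of each fitted object records which kind was grown
print(c(rf_reg$family, rf_cls$family, rf_surv$family))
```

The same rfsrc call is used in all three cases; only the left-hand side of the formula changes.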

Many of the features of the ggRandomForests package are available
within the randomForestSRC package. However, ggRandomForests offers the
following advantages:
- ggRandomForests functions operate on randomForestSRC objects, including the
  output of its post-processing functions (vimp.rfsrc, var.select.rfsrc,
  plot.variable.rfsrc), to generate intermediate data.frame objects. These
  objects are then passed to the corresponding plot functions using the S3
  object model. Alternatively, users can use these data objects for additional
  external, custom plotting or analysis operations.
- [...] gridExtra package.
- ggplot2 figures: We chose to use the ggplot2 package for our figures. The
  plot functions all return either a single ggplot2 object or a list of
  ggplot2 objects. The user can then use additional ggplot2 functions to
  modify and customise the figures to their liking.

The ggRandomForests package contains the following functions:
- gg_rfsrc: randomForest[SRC] prediction
- gg_error: randomForest[SRC] convergence rate based on the OOB error rate
- gg_roc: ROC curves for randomForest classification models
- gg_vimp: Variable Importance ranking for variable selection
- gg_minimal_depth: Minimal Depth ranking for variable selection
- gg_interaction: Minimal Depth interaction detection (under development)
- gg_variable: Marginal variable dependence (including conditional dependence)
- gg_partial: Partial variable dependence (including conditional partial
  dependence)
- gg_survival: Random Forest survival plots including either Kaplan-Meier or
  Nelson-Aalen survival estimates

All functions have an associated plotting function that returns ggplot2 graphics, either
individually or as a list, that can be further customised using standard ggplot2 commands.
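
The data-then-plot workflow described above can be sketched as follows, using gg_vimp as the example. This is a minimal illustration, assuming the randomForestSRC, ggRandomForests and ggplot2 packages are installed and that the installed versions behave as described here. The mtcars data, ntree = 100, the importance = TRUE setting and the plot title are illustrative assumptions.

```r
library(randomForestSRC)
library(ggRandomForests)
library(ggplot2)

set.seed(42)

# Grow a regression forest; importance = TRUE asks rfsrc to compute
# variable importance (VIMP) while the forest is grown
rf <- rfsrc(mpg ~ ., data = mtcars, ntree = 100, importance = TRUE)

# Step 1: extract the intermediate data object (a data.frame with an
# extra S3 class) that holds the VIMP ranking
vimp_dta <- gg_vimp(rf)
print(head(vimp_dta))

# Step 2: the S3 plot method for that class returns a ggplot2 object
p <- plot(vimp_dta)

# Step 3: customise with ordinary ggplot2 functions
p_custom <- p +
  theme_bw() +
  labs(title = "Variable importance, mtcars regression forest")
print(p_custom)
```

Because vimp_dta is an ordinary data.frame underneath, it can also be passed to other plotting or analysis code instead of the supplied plot method.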

Ishwaran, H. and Kogalur, U.B. (2007). Random survival forests for R. R News, 7(2):25-31.
Ishwaran, H. and Kogalur, U.B. (2014). Random Forests for Survival, Regression and Classification (RF-SRC). R package version 1.5.
Ishwaran, H., Kogalur, U.B., Gorodeski, E.Z., Minn, A.J. and Lauer, M.S. (2010). High-dimensional variable selection for survival data. J. Amer. Statist. Assoc., 105:205-217.
Ishwaran, H. (2007). Variable importance in binary regression trees and forests. Electronic J. Statist., 1:519-537.
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer, New York.