randomForestSRC objects using the ggplot2 package.
The package is designed to simplify the graphical analysis and exploration
of random forests.

The randomForestSRC package provides a unified treatment of Breiman's random
forests (Breiman 2001) for a variety of data settings. Regression and
classification forests are grown when the response is numeric or categorical
(factor), while survival and competing risk forests (Ishwaran et al. 2008, 2012)
are grown for right-censored survival data. Support for unsupervised and
multivariate random forests has also recently been added.

Many of the features of the ggRandomForests package are available within the
randomForestSRC package. However, ggRandomForests offers the following
advantages:

ggRandomForests uses randomForestSRC functions (vimp.rfsrc, var.select.rfsrc,
plot.variable.rfsrc) to generate intermediate
data.frame objects. These objects are then passed to the corresponding plot
functions using the S3 object model. Alternatively, a user can use these data
objects for additional external, custom plotting or analysis operations.

gridExtra package.

ggplot2 figures: We chose to use the ggplot2 package for our figures. The plot
functions all return either a single ggplot2 object or a list of ggplot2
objects. The user can then use additional ggplot2 functions to modify and
customise the figures to their liking.

The ggRandomForests package contains the following functions:
gg_rfsrc: randomForest[SRC] prediction
gg_error: randomForest[SRC] convergence rate based on the OOB error rate
gg_roc: ROC curves for randomForest classification models
gg_vimp: Variable Importance ranking for variable selection
gg_minimal_depth: Minimal Depth ranking for variable selection
gg_interaction: Minimal Depth interaction detection (under development)
gg_variable: Marginal variable dependence (including conditional dependence)
gg_partial: Partial variable dependence (including conditional partial dependence)
gg_survival: Random Forest survival plots including either Kaplan-Meier or Nelson-Aalen survival estimates

All functions have an associated plotting function that returns ggplot2
graphics, either individually or as a list, that can be further customised
using standard ggplot2 commands.
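
The extract-then-plot workflow described above can be sketched as follows. This is a minimal illustration, not the package's canonical example: it assumes randomForestSRC, ggRandomForests and ggplot2 are installed, and the iris data set, the object names (rf_iris, vimp_dta, vimp_plot) and the plot title are choices made here for illustration. Argument defaults may differ between package versions.

```r
# Assumption: randomForestSRC, ggRandomForests and ggplot2 are installed.
library(randomForestSRC)
library(ggRandomForests)
library(ggplot2)

set.seed(42)

# Grow a small classification forest on the built-in iris data.
# importance = TRUE asks rfsrc to compute variable importance up front.
rf_iris <- rfsrc(Species ~ ., data = iris, ntree = 100, importance = TRUE)

# Step 1: extract the variable importance into an intermediate data object.
vimp_dta <- gg_vimp(rf_iris)

# The data object can be inspected or reused like any other data.frame.
head(vimp_dta)

# Step 2: the S3 plot method for the data object returns a ggplot2 object.
vimp_plot <- plot(vimp_dta)

# Step 3: customise the figure with standard ggplot2 functions.
vimp_plot +
  theme_bw() +
  labs(title = "Variable importance, iris forest (illustrative)")
```

Keeping the data object and the figure separate means the same vimp_dta can feed a custom analysis, while vimp_plot can be restyled without regrowing the forest.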

Ishwaran, H. and Kogalur, U.B. (2007). Random survival forests for R. R News, 7(2), 25-31.
Ishwaran, H. and Kogalur, U.B. (2014). Random Forests for Survival, Regression and Classification (RF-SRC), R package version 1.5.
Ishwaran, H., Kogalur, U.B., Gorodeski, E.Z., Minn, A.J. and Lauer, M.S. (2010). High-dimensional variable selection for survival data. J. Amer. Statist. Assoc., 105, 205-217.
Ishwaran, H. (2007). Variable importance in binary regression trees and forests. Electronic J. Statist., 1, 519-537.
Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer New York.