rfPermute
Description
rfPermute estimates the significance of importance metrics for a Random Forest model by permuting the response variable. It produces a null distribution of each importance metric for each predictor variable and a p-value for the observed value. The package also includes several summary and visualization functions for randomForest and rfPermute results.
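The permutation idea described above can be sketched directly with the randomForest package. This is a minimal illustration of the concept, not the package's internal code: the helper `get_imp`, the `iris` example data, the small `nrep`, and the `(count + 1) / (nrep + 1)` p-value convention are all assumptions made for the example.

```r
library(randomForest)

set.seed(1)
x <- iris[, 1:4]
y <- iris$Species
nrep <- 20  # kept small for speed; use many more replicates in practice

# Hypothetical helper: fit a forest and return the scaled
# mean-decrease-in-accuracy importance (type = 1) for each predictor
get_imp <- function(x, y) {
  rf <- randomForest(x, y, ntree = 100, importance = TRUE)
  randomForest::importance(rf, type = 1, scale = TRUE)[, 1]
}

# Observed importance from the real response
obs <- get_imp(x, y)

# Null distribution: refit after shuffling the response each time
# (result is a predictors x nrep matrix)
null_dist <- replicate(nrep, get_imp(x, sample(y)))

# p-value: share of null values at least as large as the observed value,
# with +1 in numerator and denominator (an assumed convention)
pval <- (rowSums(null_dist >= obs) + 1) / (nrep + 1)
pval
```

In practice, rfPermute wraps these steps in a single call and stores the null distributions and p-values alongside the fitted model.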
Installation
To install the stable version from CRAN:
install.packages('rfPermute')
To install the latest version from GitHub:
# make sure you have Rtools installed
if (!require('devtools')) install.packages('devtools')
# install from GitHub
devtools::install_github('EricArcher/rfPermute')
Contact
- submit suggestions and bug-reports: https://github.com/ericarcher/rfPermute/issues
- send a pull request: https://github.com/ericarcher/rfPermute/
- e-mail: eric.archer@noaa.gov
Current Functions
- classConfInt: Classification Confidence Intervals
- cleanRFdata: Clean Random Forest Input Data
- confusionMatrix: Confusion Matrix
- exptdErrRate: Expected Error Rate
- impHeatmap: Importance Heatmap
- pctCorrect: Percent Correctly Classified
- plotConfMat: Heatmap representation of Confusion Matrix
- plotImpVarDist: Distribution of Important Variables
- plotInbag: Distribution of sample inbag rates
- plotNull: Plot Random Forest Importance Null Distributions
- plotOOBtimes: Distribution of sample OOB rates
- plotPredictedProbs: Distribution of prediction assignment probabilities
- plotRFtrace: Trace of cumulative error rates in forest
- plotVotes: Vote Distribution
- plot.rp.importance: Plot Random Forest Importance Distributions
- proximityPlot: Plot Random Forest Proximity Scores
- rfPermute: Estimate Permutation p-values for Random Forest Importance Metrics
- rp.combine: Combine rfPermute Objects
- rp.importance: Extract rfPermute Importance Scores and p-values
Development version (current on GitHub)
- Added plotConfMat, plotOOBtimes, plotRFtrace, plotInbag, and plotImpVarDist visualizations.
- Changed confusionMatrix so it will work when a randomForest model doesn't have a $confusion element, such as when the model is the result of combine-ing multiple models.
- Improved efficiency and stability of parallel processing code. Changed default value of num.cores to NULL.
version 2.1.5
- Added type argument to plotVotes to choose between area and bar charts.
- Changed plot.rfPermute to plotNull to avoid clashes and maintain functionality of randomForest::plot.randomForest.
- Changed name of proximity.plot to proximityPlot, exptd.err.rate to exptdErrRate, and clean.rf.data to cleanRFdata to make the camelCase naming scheme more consistent in the package.
- Changed plotNull from base graphics to ggplot2.
- Added symb.metab data set.
version 2.1.1
- Added n argument to impHeatmap.
- Added functions: classConfInt, confusionMatrix, plotVotes, pctCorrect.
version 2.0.1
- Fixed bug in plot.rfPermute that was reporting the p-value incorrectly at the top of the figure.
- Fixed multi-threading in rfPermute so it works on Windows too.
- Added impHeatmap function.
- Switched proximity.plot to use ggplot2 graphics.
version 2.0
- Fixed bug with calculation of p-values not respecting importance measure scaling (division by standard deviations). New format of output of rfPermute has separate $null.dist and $pval elements, each with results for unscaled and scaled importance measures. See ?rfPermute for more information.
- rp.importance and plot.rfPermute now take a scale argument to specify whether or not importance values should be scaled by standard deviations.
- If nrep = 0 for rfPermute, a randomForest object is returned.
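The version 2.0 changes above might look like this in use. This is a hedged sketch that assumes an installed package version matching this changelog (roughly 2.0 through 2.1.x), since function names may differ in other releases. The `iris` data, the small `nrep`, and the `ntree` value are illustrative assumptions; see ?rfPermute for the authoritative description of the returned elements.

```r
library(rfPermute)

set.seed(1)
# nrep is kept small for speed; ntree is passed through to randomForest
rp <- rfPermute(Species ~ ., data = iris, ntree = 100, nrep = 20)

# Separate elements hold the null distributions and the p-values
str(rp$null.dist, max.level = 1)
str(rp$pval)

# Importance scores and p-values, scaled and unscaled by standard deviations
imp_scaled <- rp.importance(rp, scale = TRUE)
imp_unscaled <- rp.importance(rp, scale = FALSE)

# With nrep = 0 no permutations are run and a randomForest object is returned
rf0 <- rfPermute(Species ~ ., data = iris, ntree = 100, nrep = 0)
inherits(rf0, "randomForest")
```

Comparing imp_scaled with imp_unscaled shows whether a predictor's apparent significance depends on dividing by the standard deviation.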
version 1.9.3
- Fixed import declarations to avoid grid name clashes.
- Fixed logic error in clean.rf.data where fixed predictors were not removed.
- Fixed error in use of main argument in plot.rp.importance.
version 1.9.2
- Added this NEWS.md
- Added README.md
- Added num.cores argument to rfPermute to take advantage of multi-threading
version 1.9.1
- Added internal keyword to calc.imp.pval to keep it from indexing
- Updated imports to match new CRAN policies