rfUtilities (version 2.0-0)

rf.classBalance: Random Forest Class Balance (Zero Inflation Correction) Model

Description

Implements the Evans & Cushman (2009) Random Forests class-balance (zero-inflation) modeling approach.

Usage

rf.classBalance(ydata, xdata, p = 0.005, cbf = 3, sf = 2, ...)

Arguments

ydata
Response variable, selected by index (e.g., [,2] or [,"SPP"])
xdata
Independent variables, selected by index (e.g., [,3:14] or [3:ncol(data)])
p
p-value of covariance convergence (changing this is not recommended)
cbf
Scaling factor used to test whether the problem is imbalanced; default is size of majority class * 3
sf
Majority subsampling factor. If sf=1, the random sample is perfectly balanced with the smallest class [s|0=n|1], whereas sf=2 provides [s|0=(n|1*2)]
...
Additional arguments passed to randomForest
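
The roles of cbf and sf can be illustrated with a small base-R sketch. The helper class_balance_plan below is hypothetical and is not part of rfUtilities. It assumes that a problem counts as imbalanced when the majority class is larger than cbf times the minority class, and that the majority class is subsampled to sf times the minority class size; the package's internal test may differ.

```r
# Hypothetical helper (not part of rfUtilities): summarises class sizes and
# the majority subsample size implied by sf.
class_balance_plan <- function(y, cbf = 3, sf = 2) {
  y <- as.factor(y)
  counts <- table(y)
  n_min <- min(counts)   # size of the smallest class
  n_maj <- max(counts)   # size of the largest class
  list(
    counts = counts,
    ratio = n_maj / n_min,
    # Assumption: imbalance is flagged when majority > cbf * minority
    imbalanced = n_maj > n_min * cbf,
    # Majority subsample size, capped at the number of available observations
    majority_sample = min(n_maj, n_min * sf)
  )
}

# Two-class response built from iris: "setosa" is merged into "virginica"
data(iris)
y <- ifelse(iris$Species == "setosa", "virginica", as.character(iris$Species))

plan <- class_balance_plan(y, cbf = 1, sf = 2)
plan$counts           # 50 versicolor, 100 virginica
plan$ratio            # 100 / 50
plan$majority_sample  # 50 * 2
```

With sf = 1 the same helper would return a majority sample equal to the minority class size, which is the perfectly balanced case described for sf.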

Value

A list object with the following components:

model
Final combined Random Forests ensemble
oob.error
Median out-of-bag error
confusion
Confusion matrix (summed across models)
pcc
Percent correctly classified
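
As a rough illustration of how the confusion and pcc components relate, the base-R sketch below sums two made-up confusion matrices and computes percent correctly classified using the standard definition (diagonal total divided by grand total). The matrices are toy values, and it is an assumption that rf.classBalance computes pcc in exactly this way.

```r
# Toy confusion matrices from two hypothetical models
# (rows = observed class, columns = predicted class)
classes <- c("versicolor", "virginica")
cm1 <- matrix(c(45, 5, 10, 90), nrow = 2, byrow = TRUE,
              dimnames = list(observed = classes, predicted = classes))
cm2 <- matrix(c(48, 2, 13, 87), nrow = 2, byrow = TRUE,
              dimnames = list(observed = classes, predicted = classes))

# Sum the matrices across models
confusion <- cm1 + cm2

# Percent correctly classified: correct predictions lie on the diagonal
pcc <- sum(diag(confusion)) / sum(confusion) * 100
pcc
```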

References

Evans, J.S. and S.A. Cushman (2009). Gradient modeling of conifer species using Random Forests. Landscape Ecology 24(5):673-683.

Evans, J.S., M.A. Murphy, Z.A. Holden, and S.A. Cushman (2011). Modeling species distribution and change using Random Forests. Chapter 8 in Predictive Modeling in Landscape Ecology, eds. C.A. Drew, F. Huettmann, and Y. Wiersma. Springer.

Examples

library(randomForest)
library(rfUtilities)

# Merge "setosa" into "virginica" to create an imbalanced two-class response
data(iris)
iris$Species <- as.character(iris$Species)
iris$Species <- ifelse(iris$Species == "setosa", "virginica", iris$Species)
iris$Species <- as.factor(iris$Species)

# Percent of "virginica" observations
sum(iris$Species == "virginica") / nrow(iris) * 100

rf.classBalance(ydata = iris[, "Species"], xdata = iris[, 1:4], cbf = 1)