LogicForest (version 2.1.0)

logforest: Logic Forest

Description

Constructs an ensemble of logic regression models using bagging, for classification and for identifying important predictors and predictor interactions.
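The bagging step named in the description can be sketched in base R. This is a minimal illustration of bootstrap resampling only, not the package's internal code; the helper name bootstrap_split is hypothetical.

```r
# Hypothetical helper: draw one bootstrap sample of row indices and
# report which rows are in-bag and which are out-of-bag.
bootstrap_split <- function(n) {
  in_bag <- sample.int(n, size = n, replace = TRUE)  # rows drawn with replacement
  out_of_bag <- setdiff(seq_len(n), in_bag)          # rows never drawn
  list(in_bag = in_bag, out_of_bag = out_of_bag)
}

set.seed(1)
bs <- bootstrap_split(10)
bs$in_bag      # indices that would be used to fit one tree
bs$out_of_bag  # indices held out for that tree
```

Each tree in a bagged ensemble is fit to its own in-bag sample, and the held-out rows give an error estimate that needs no separate test set.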

Usage

logforest(resp, Xs, nBSXVars, anneal.params, nBS = 100, h = 0.5, norm = TRUE, numout = 5)

Arguments

resp
numeric vector of binary response values.
Xs
matrix or dataframe of zeros and ones for all predictor variables.
nBSXVars
integer for the number of predictors used to construct each logic regression model. The default value is all predictors in the data.
anneal.params
a list containing the parameters for simulated annealing. See the help file for the function logreg.anneal.control in the LogicReg package. If missing, default annealing parameters are set at start=1, end=-2, and iter=50000.
nBS
number of logic regression trees to be fit in the logic forest model.
h
a number between 0 and 1 for the minimum proportion of trees in the logic forest that must predict a 1 for the prediction to be one.
norm
logical. If FALSE, predictor and interaction scores in model output are not normalized to range between zero and one.
numout
number of predictors and interactions to be included in model output.
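To make the role of h concrete, the following base R sketch turns per-tree votes into forest predictions. The helper forest_vote and the vote matrix are hypothetical, and treating a proportion exactly equal to h as a prediction of one is an assumption, not something this help page states.

```r
# Hypothetical helper: rows are observations, columns are trees,
# entries are each tree's 0/1 prediction.
forest_vote <- function(votes, h = 0.5) {
  prop_one <- rowMeans(votes)   # proportion of trees predicting 1
  as.integer(prop_one >= h)     # assumed: a proportion equal to h counts as 1
}

votes <- matrix(c(1, 1, 0,
                  0, 0, 1,
                  1, 0, 0,
                  1, 1, 1), nrow = 4, byrow = TRUE)

forest_vote(votes, h = 0.5)  # simple majority vote
forest_vote(votes, h = 0.9)  # near-unanimous agreement required
```

Raising h makes the forest more conservative about predicting a one.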

Value

An object of class "logforest", which is a list including the following values:
AllFits
A list of all logic regression fits in the logic forest model.
Top5.PI
a vector of the 5 interactions with the largest magnitude variable importance score.
Predictor.importance
a vector of importance scores for all predictors that occur in the logic forest.
PI.importance
a vector of importance scores for all interactions that occur in the logic forest.
Predictor.frequency
a vector giving the frequency with which each predictor occurs in the individual logic regression models in the logic forest.
PI.frequency
a vector giving the frequency with which each interaction occurs in the individual logic regression models in the logic forest.
ModelPI.import
a list of interaction importance measures for each logic regression model in the logic forest.
OOBmisclass
out-of-bag error estimate for the logic forest.
OOBprediction
a matrix. Column one is the out-of-bag prediction for each response in the original data. Column two is the proportion of out-of-bag trees that predicted the class value to be one.
IBdata
a list of all in-bag data sets for the logic forest model.
OOBdata
a list of all out-of-bag data sets for the logic forest model.
norm
logical. If TRUE, the normalized predictor and interaction importance scores are returned.
numout
the number of predictors and interactions (based on the variable importance measure) to be returned by logforest.
predictors
number of predictor variables in the data used to construct the logic forest.
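The relationship between OOBprediction and OOBmisclass can be illustrated with a small base R sketch. It assumes the out-of-bag error is the proportion of observations whose out-of-bag prediction differs from the observed response; the helper oob_error and the toy data are hypothetical and do not come from the package.

```r
# Hypothetical helper: share of observations whose out-of-bag
# prediction disagrees with the observed binary response.
oob_error <- function(resp, oob_pred) {
  mean(resp != oob_pred)
}

resp <- c(1, 0, 1, 1, 0)

# Toy matrix shaped like OOBprediction: column one is the predicted class,
# column two is the proportion of out-of-bag trees predicting one.
oob_prediction <- cbind(prediction = c(1, 0, 0, 1, 1),
                        prop_one   = c(0.8, 0.2, 0.4, 0.9, 0.6))

oob_error(resp, oob_prediction[, "prediction"])
```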

References

Wolf, B.J., Slate, E.H., Hill, E.G. (2010) Logic Forest: An ensemble classifier for discovering logical combinations of binary markers. Bioinformatics.

See Also

print.logforest, predict.logforest, vimp.plot, submatch.plot, persistence.plot

Examples

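library(LogicReg)     #provides logreg.anneal.control
library(LogicForest)  #provides logforest and LF.data
data(LF.data)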

#Set annealing parameters using the logreg.anneal.control
#function from the LogicReg package

newanneal<-logreg.anneal.control(start=1, end=-2, iter=2500)

#Typically many more than 2500 iterations (iter>25000) would be used
#for the annealing algorithm. A typical forest also contains at
#least 100 trees. The smaller values here were chosen to allow for
#faster run times

#The data set LF.data contains 50 binary predictors and a binary
#response Ybin
LF.fit1<-logforest(resp=LF.data$Ybin, Xs=LF.data[,1:50], nBS=20,
anneal.params=newanneal)
print(LF.fit1)
predict(LF.fit1)

#Changing print parameters
LF.fit2<-logforest(resp=LF.data$Ybin, Xs=LF.data[,1:50], nBS=20,
anneal.params=newanneal, norm=TRUE, numout=10)
print(LF.fit2)
