LogicForest (version 2.1.0)

LBoost: LBoost

Description

Constructs an ensemble of logic regression models using boosting, for classification and for identification of important predictors and predictor interactions.

Usage

LBoost(resp, Xs, anneal.params, nBS = 100, kfold = 5, nperm = 1, PI.imp = NULL, pred.imp = FALSE)

Arguments

resp
numeric vector of binary response values.
Xs
matrix or data frame of zeros and ones for all predictor variables.
anneal.params
a list containing the parameters for simulated annealing. See the help file for the function logreg.anneal.control in the LogicReg package. If missing, default annealing parameters are set at start=1, end=-2, and iter=50000.
nBS
number of logic regression trees to be fit in the LBoost model.
kfold
The number of times the data are to be split in constructing the ensemble.
nperm
If measuring predictor importance or interaction importance using the permutation based measure, nperm is the number of permutations to be done in determining predictor or interaction importance.
PI.imp
A character string describing which measure of interaction importance will be used. Possible values include "Permutation", "AddRemove", and "Both". Using "Permutation" will provide the permutation based measure of interaction importance, "AddRemove" will provide the add-in/leave-out based measure of interaction importance, and "Both" provides both measures of importance.
pred.imp
logical. If FALSE, predictor importance scores will not be measured.
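
The input shapes described above can be checked before fitting. The following is a minimal sketch on made-up toy data (the variable names, sample size, and noise level are assumptions for illustration only). The call to LBoost is guarded so that the block still runs when LogicForest or LogicReg is not installed.

```r
# Hypothetical toy data: 200 observations, 6 binary predictors.
set.seed(1)
n <- 200
Xs <- matrix(rbinom(n * 6, size = 1, prob = 0.5), nrow = n,
             dimnames = list(NULL, paste0("X", 1:6)))

# Response driven by the interaction X1 AND (NOT X2), with about 10% label noise.
signal <- as.numeric(Xs[, "X1"] == 1 & Xs[, "X2"] == 0)
flip <- rbinom(n, size = 1, prob = 0.1)
resp <- abs(signal - flip)

# Check the shapes described in the Arguments section:
# resp is a numeric 0/1 vector, Xs holds only zeros and ones, lengths agree.
inputs_ok <- is.numeric(resp) && all(resp %in% c(0, 1)) &&
  all(as.matrix(Xs) %in% c(0, 1)) && length(resp) == nrow(Xs)

# Fit only if both packages are available; small settings chosen for speed.
if (inputs_ok &&
    requireNamespace("LogicForest", quietly = TRUE) &&
    requireNamespace("LogicReg", quietly = TRUE)) {
  anneal <- LogicReg::logreg.anneal.control(start = 1, end = -2, iter = 2000)
  fit <- LogicForest::LBoost(resp = resp, Xs = Xs, anneal.params = anneal,
                             nBS = 10, kfold = 2, PI.imp = "AddRemove")
  print(fit)
}
```

The small values of iter, nBS, and kfold are chosen only to keep the run short; they are not recommended settings.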

Value

An object of class "LBoost", which is a list including the following values:
CVmod
A list of all logic regression (LR) fits and the associated information in the LBoost model. Each item in the list contains the list of LR fits for a specific kfold data set, a matrix of weights given to each LR fit for that kfold data set, and a matrix of the kfold training data used to construct the list of fits.
CVmisclass
a list including the mean cross-validation misclassification rate for the models and a list of vectors giving the predictions for each of the kfold test data sets.
AddRemove.PIimport
If PI.imp is specified as either "AddRemove" or "Both", this is a vector of add-in/leave-out importance scores for all interactions that occur in the LBoost model. If PI.imp is not specified or is "Permutation", this will state "Not measured".
Perm.PIimport
If PI.imp is specified as either "Permutation" or "Both", this is a vector of permutation based importance scores for all interactions that occur in the LBoost model. If PI.imp is not specified or is "AddRemove", this will state "Not measured".
Pred.import
If pred.imp is specified as TRUE, a vector of importance scores for all predictors in the data.
Pred.freq
a vector giving the frequency with which each predictor occurs in the individual logic regression trees in the LBoost model.
PI.frequency
a vector giving the frequency with which each interaction occurs in the individual logic regression trees in the LBoost model.
wt.mat
a list containing kfold matrices of observation weights for each tree for the kfold training data sets.
alphas
a list containing kfold vectors of tree specific weights for trees constructed from each of the kfold training data sets.
data
A matrix of the original data used to construct the LBoost model.
PIimp
A character string describing which interaction importance measure was used.
PredImp
logical. If TRUE, predictor importance was measured.
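
Because the importance components hold the string "Not measured" when a measure was not requested, it can help to check them before use. The sketch below assumes only the component names listed above. The helper importance_status and the mock_fit object are hypothetical stand-ins for illustration, not part of LogicForest and not real LBoost output.

```r
# Report which optional importance measures a fitted object contains,
# relying on the documented "Not measured" placeholder.
importance_status <- function(fit) {
  measured <- function(x) !is.null(x) && !identical(x, "Not measured")
  c(AddRemove   = measured(fit$AddRemove.PIimport),
    Permutation = measured(fit$Perm.PIimport),
    Predictor   = isTRUE(fit$PredImp))
}

# Hypothetical stand-in using the documented component names.
# The interaction label and score are made up.
mock_fit <- list(AddRemove.PIimport = c("X1 & !X2" = 0.12),
                 Perm.PIimport = "Not measured",
                 PredImp = FALSE)

status <- importance_status(mock_fit)
print(status)
```

With a real fit, the same helper could be called as importance_status(LBfit.1), assuming the returned list uses the component names documented here.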

References

Wolf, B.J., Hill, E.G., Slate, E.H., Neumann, C.A., Kistner-Griffin, E. (2012). LBoost: A boosting algorithm with applications for epistasis discovery. PLoS One.

See Also

print.LBoost, predict.LBoost, BoostVimp.plot, submatch.plot, persistence.plot

Examples

library(LogicForest)
data(LF.data)

#Set annealing parameters using the logreg.anneal.control
#function from the LogicReg package
newanneal<-logreg.anneal.control(start=1, end=-2, iter=2000)

#Typically more than 2000 iterations (>25000) would be used for
#the annealing algorithm. A typical LBoost model also contains at
#least 100 trees. These parameters were set to allow for faster
#run time.

#The data set LF.data contains 50 binary predictors and a binary response Ybin
#Looking at only the Permutation Measure
LBfit.1<-LBoost(resp=LF.data$Ybin, Xs=LF.data[,1:50], nBS=10, kfold=2,
anneal.params=newanneal, nperm=2, PI.imp="Permutation")
print(LBfit.1)

#Looking at only the Add-in/Leave-out importance measure
LBfit.2<-LBoost(resp=LF.data$Ybin, Xs=LF.data[,1:50], nBS=10, kfold=2,
anneal.params=newanneal, PI.imp="AddRemove")
print(LBfit.2)

#Looking at both measures of importance plus predictor importance
LBfit.3<-LBoost(resp=LF.data$Ybin, Xs=LF.data[,1:50], nBS=10, kfold=2,
anneal.params=newanneal, nperm=2, PI.imp="Both", pred.imp=TRUE)
print(LBfit.3)
