This function prunes back the maximal tree using either the BIC or the AIC criterion.
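As a reminder of what the two criteria measure, the sketch below computes AIC and BIC by hand for a logistic GLM and checks them against base R's `AIC()` and `BIC()`. The `mtcars` data and the model `am ~ wt` are stand-ins chosen only for illustration; they are not part of this package or its examples.

```r
# Logistic regression on a built-in dataset (illustrative stand-in only)
fit <- glm(am ~ wt, data = mtcars, family = "binomial")

ll <- logLik(fit)
k  <- attr(ll, "df")   # number of estimated parameters
n  <- nobs(fit)        # number of observations used in the fit

# AIC = -2 logLik + 2 k ;  BIC = -2 logLik + log(n) k
aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Both comparisons should agree with the built-in extractors
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))
```

Because the BIC penalty grows with log(n), it tends to favor smaller models than AIC once n exceeds about 8 observations.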
best.tree.BIC.AIC(xtree, xdata, Y.name, X.names,
family = "binomial", verbose = TRUE)
A list of four elements:
The sizes of the trees selected by BIC and AIC
The trees selected by BIC and AIC
The fitted pltr models selected by BIC and AIC
The execution time of the selection procedure
xtree: a tree to prune
xdata: the dataset used to build the tree
Y.name: the name of the dependent variable
X.names: the names of the independent confounding variables to consider in the linear part of the glm
family: the glm family to use, depending on the type of the dependent variable
verbose: Logical; TRUE to print progress during the computation (helpful for debugging)
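To make the roles of the tree, the linear part, and the family more concrete, here is a rough sketch of criterion-based subtree selection written with `rpart` and base R only. It is not the package's implementation: the simulated data, the variable names `x`, `g1`, `g2`, and the helper `score` are all hypothetical, and the tree is grown without adjusting for the linear part, which a partially linear tree procedure would normally do.

```r
library(rpart)

set.seed(1)
n <- 400
d <- data.frame(x = rnorm(n), g1 = runif(n), g2 = runif(n))
# Simulated outcome: linear effect of x plus a step effect of g1
p   <- plogis(-0.5 + 0.8 * d$x + 1.2 * (d$g1 > 0.5))
d$y <- rbinom(n, 1, p)

# Grow a deliberately large tree on the candidate splitting variables
big <- rpart(y ~ g1 + g2, data = d, method = "anova",
             control = rpart.control(cp = 0, minbucket = 40, xval = 0))

# Hypothetical helper: prune at a given cp, then fit a logistic GLM with
# the confounder x in the linear part and leaf membership as a factor
score <- function(cp) {
  sub    <- prune(big, cp = cp)
  d$leaf <- factor(sub$where)
  terms  <- if (nlevels(d$leaf) > 1) c("x", "leaf") else "x"
  fit    <- glm(reformulate(terms, response = "y"), data = d,
                family = "binomial")
  c(leaves = nlevels(d$leaf), AIC = AIC(fit), BIC = BIC(fit))
}

# One row of criteria per cp value in the cp table
crit <- t(sapply(big$cptable[, "CP"], score))

# Candidate subtrees minimizing each criterion
crit[which.min(crit[, "BIC"]), ]
crit[which.min(crit[, "AIC"]), ]
```

The two criteria need not pick the same subtree, which is why the function returns a BIC result and an AIC result side by side.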
Cyprien Mbogning and Wilson Toussile
Mbogning, C., Perdry, H., Toussile, W., Broet, P.: A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities. Journal of Clinical Bioinformatics 4:6 (2014)
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Automat. Control AC-19, 716-723 (1974)
Schwarz, G.: Estimating the dimension of a model. The Annals of Statistics 6, 461-464 (1978)
best.tree.CV, pltr.glm
data(burn)
args.rpart <- list(minbucket = 10, maxdepth = 4, cp = 0, maxcompete = 0,
maxsurrogate = 0)
family <- "binomial"
X.names <- "Z2"
Y.name <- "D2"
G.names <- c('Z1','Z3','Z4','Z5','Z6','Z7','Z8','Z9','Z10','Z11')
pltr.burn <- pltr.glm(burn, Y.name, X.names, G.names, args.rpart = args.rpart,
family = family, iterMax = 4, iterMin = 3, verbose = FALSE)
## Prune back the maximal tree using either the BIC or the AIC criterion
pltr.burn_prun <- best.tree.BIC.AIC(xtree = pltr.burn$tree, burn, Y.name,
X.names, family = family)
## plot the BIC selected tree
plot(pltr.burn_prun$tree$BIC, main = 'BIC selected tree')
text(pltr.burn_prun$tree$BIC, xpd = TRUE, cex = .6, col = 'blue')
if (FALSE) {
## Load the data set
data(data_pltr)
## Set the parameters
args.rpart <- list(minbucket = 40, maxdepth = 10, cp = 0)
family <- "binomial"
Y.name <- "Y"
X.names <- "G1"
G.names <- paste("G", 2:15, sep="")
## Build a maximal tree
fit_pltr <- pltr.glm(data_pltr, Y.name, X.names, G.names, args.rpart = args.rpart,
family = family, iterMax = 5, iterMin = 3)
## Prune back the maximal tree using the BIC or the AIC criterion
tree_select <- best.tree.BIC.AIC(xtree = fit_pltr$tree, data_pltr, Y.name,
X.names, family = family)
plot(tree_select$tree$BIC, main = 'BIC TREE')
text(tree_select$tree$BIC, minlength = 0L, xpd = TRUE, cex = .6)
}