Model-based recursive partitioning based on generalized linear models.
glmtree(formula, data, subset, na.action, weights, offset, cluster,
  family = gaussian, epsilon = 1e-8, maxit = 25, ...)
formula: symbolic description of the model, either of type y ~ z1 + ... + zl (an intercept-only model that is partitioned with respect to z1, ..., zl) or y ~ x1 + ... + xk | z1 + ... + zl (a model with regressors x1, ..., xk, partitioned with respect to z1, ..., zl); for details see below and the brief sketch following this argument list.
data, subset, na.action: arguments controlling formula processing via model.frame.
weights: optional numeric vector of weights. By default these are treated as case weights, but the default can be changed in mob_control.
offset: optional numeric vector with an a priori known component to be included in the model y ~ x1 + ... + xk (i.e., only when x variables are specified).
cluster: optional vector (typically numeric or factor) with a cluster ID to be employed for clustered covariances in the parameter stability tests.
family: specification of a family for glm.
epsilon, maxit: control parameters passed to glm.control.
...: optional control parameters passed to mob_control.
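To make the two formula types concrete, here is a minimal sketch with simulated data (the data frame d, the objects tr1/tr2/tr3, and all variable names are illustrative, not part of the package documentation):

library("partykit")
set.seed(1)
d <- data.frame(x1 = rnorm(200), z1 = rnorm(200),
  z2 = factor(sample(c("a", "b"), 200, replace = TRUE)))
d$y <- rbinom(200, size = 1, prob = plogis(ifelse(d$z2 == "a", -1, 1) * d$x1))
## type y ~ z1 + ... + zl: intercept-only logistic model in each node,
## partitioned with respect to z1 and z2
tr1 <- glmtree(y ~ z1 + z2, data = d, family = binomial)
## type y ~ x1 + ... + xk | z1 + ... + zl: logistic regression of y on x1
## in each node, partitioned with respect to z1 and z2
tr2 <- glmtree(y ~ x1 | z1 + z2, data = d, family = binomial)
## further arguments in ... are passed on to mob_control(), e.g. the
## significance level of the parameter stability tests
tr3 <- glmtree(y ~ x1 | z1 + z2, data = d, family = binomial, alpha = 0.01)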
glmtree returns an object of class glmtree inheriting from modelparty. The info element of the overall party and of the individual nodes contains various information about the fitted models.
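As a hedged illustration of where this information is stored (assuming the fitted tree pid_tree2 from the examples below, and using the partykit utilities nodeids, nodeapply, and info_node):

## names of the components kept in the info slot of each terminal node,
## e.g. estimated coefficients, objective function value, number of observations
nodeapply(pid_tree2, ids = nodeids(pid_tree2, terminal = TRUE),
  FUN = function(node) names(info_node(node)))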
Convenience interface for fitting MOBs (model-based recursive partitions) via the mob function. glmtree internally sets up a model fit function for mob, using glm.fit. Then mob is called using the negative log-likelihood as the objective function.

Compared to calling mob by hand, the implementation tries to avoid unnecessary computations while growing the tree. Also, it provides a more elaborate plotting function.
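For comparison, a by-hand call to mob roughly equivalent to the logistic-regression example below might look like the following sketch (the fit function name logit is illustrative; weights and offset are accepted only to match the fit-function interface that mob expects):

library("partykit")
data("PimaIndiansDiabetes", package = "mlbench")
## glm-based fit function: mob() minimizes the negative log-likelihood
## reported by logLik() and uses the empirical scores (estfun) of the
## fitted glm for the parameter stability tests
logit <- function(y, x, start = NULL, weights = NULL, offset = NULL, ...) {
  glm(y ~ 0 + x, family = binomial, start = start, ...)
}
pid_tree_mob <- mob(diabetes ~ glucose | pregnant + pressure + triceps +
  insulin + mass + pedigree + age, data = PimaIndiansDiabetes, fit = logit)

This should grow essentially the same tree as the glmtree call in the examples below, but without the computational shortcuts and the more elaborate plotting method mentioned above.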
Zeileis A, Hothorn T, Hornik K (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492-514.
if(require("mlbench")) {
  ## Pima Indians diabetes data
  data("PimaIndiansDiabetes", package = "mlbench")
  ## recursive partitioning of a logistic regression model
  pid_tree2 <- glmtree(diabetes ~ glucose | pregnant +
    pressure + triceps + insulin + mass + pedigree + age,
    data = PimaIndiansDiabetes, family = binomial)
  ## printing whole tree or individual nodes
  print(pid_tree2)
  print(pid_tree2, node = 1)
  ## visualization
  plot(pid_tree2)
  plot(pid_tree2, tp_args = list(cdplot = TRUE))
  plot(pid_tree2, terminal_panel = NULL)
  ## estimated parameters
  coef(pid_tree2)
  coef(pid_tree2, node = 5)
  summary(pid_tree2, node = 5)
  ## deviance, log-likelihood and information criteria
  deviance(pid_tree2)
  logLik(pid_tree2)
  AIC(pid_tree2)
  BIC(pid_tree2)
  ## different types of predictions
  pid <- head(PimaIndiansDiabetes)
  predict(pid_tree2, newdata = pid, type = "node")
  predict(pid_tree2, newdata = pid, type = "response")
  predict(pid_tree2, newdata = pid, type = "link")
}