# Conditional Trees

Recursive partitioning for continuous, censored, ordered, nominal and multivariate response variables in a conditional inference framework.

- Keywords
- tree

##### Usage

```
ctree(formula, data, subset = NULL, weights = NULL,
      controls = ctree_control(), xtrafo = NULL, ytrafo = NULL,
      scores = NULL)
```

##### Arguments

- formula
- a symbolic description of the model to be fit.
- data
- a data frame containing the variables in the model.
- subset
- an optional vector specifying a subset of observations to be used in the fitting process.
- weights
- an optional vector of weights to be used in the fitting process. Only non-negative integer valued weights are allowed.
- controls
- an object of class `TreeControl`, which can be obtained using `ctree_control`.
- xtrafo
- an optional function to be applied to all input variables.
- ytrafo
- an optional function to be applied to all response variables.
- scores
- an optional named list of scores to be attached to ordered factors.

##### Details

Conditional trees estimate a regression relationship by binary recursive partitioning in a conditional inference framework. Roughly, the algorithm works as follows:

1. Test the global null hypothesis of independence between any of the input variables and the response (which may be multivariate as well). Stop if this hypothesis cannot be rejected. Otherwise select the input variable with the strongest association to the response. This association is measured by a p-value corresponding to a test of the partial null hypothesis of independence between a single input variable and the response.
2. Implement a binary split in the selected input variable.
3. Recursively repeat steps 1 and 2.
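Step 1 can be illustrated with a small base R sketch. This is not the code used by the package: the helper names `perm_pvalue` and `select_variable` are hypothetical, the test statistic (absolute correlation) and the simulated data are assumptions chosen for brevity, and the p-values are approximated by Monte Carlo permutation rather than by the Strasser and Weber framework.

```r
# Sketch of step 1 only: test each input for independence from the response,
# stop if nothing is significant, otherwise pick the strongest association.
set.seed(1)

# Monte Carlo permutation p-value for independence between numeric x and y,
# using the absolute correlation as the test statistic (an assumption).
perm_pvalue <- function(x, y, nperm = 999) {
  observed <- abs(cor(x, y))
  permuted <- replicate(nperm, abs(cor(x, sample(y))))
  # the add-one correction keeps the p-value strictly positive
  (1 + sum(permuted >= observed)) / (nperm + 1)
}

# Returns the name of the selected input, or NULL if the global null
# hypothesis cannot be rejected at level alpha.
select_variable <- function(inputs, y, alpha = 0.05, nperm = 999) {
  pvals <- vapply(inputs, perm_pvalue, numeric(1), y = y, nperm = nperm)
  adjusted <- pmin(1, pvals * length(pvals))  # simple Bonferroni adjustment
  if (min(adjusted) > alpha) return(NULL)
  names(which.min(adjusted))
}

# Simulated data: the response depends on x1 only
n <- 200
inputs <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 2 * inputs$x1 + rnorm(n)

selected <- select_variable(inputs, y)
selected
```

A full tree would then search for a binary split in the selected variable and apply the same procedure to each resulting subset.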

The implementation utilizes a unified framework for conditional inference,
or permutation tests, developed by Strasser and Weber (1999). The stop
criterion in step 1 is either based on a p-value
(`teststattype = "Bonferroni"` or `teststattype = "MonteCarlo"` in
`ctree_control`) or on the raw (standardized) test statistic
(`teststattype = "Raw"`). In both cases, the criterion is maximized; for
p-values, 1 - p-value is used. A split is implemented when the criterion
exceeds the value given by `mincriterion` as specified in `ctree_control`.
For example, when `mincriterion = 0.95`, the p-value must be smaller than
$0.05$ in order to split this node. This statistical approach ensures that
a tree of the right size is grown, so no pruning or cross-validation is
needed. The selection of the input variable to split in is based on the
univariate p-values, which avoids a variable selection bias towards input
variables with many possible cutpoints.
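The relationship between p-values, the criterion and `mincriterion` can be sketched in base R. The helper `should_split` and the p-values below are hypothetical and made up for illustration, and the multiplicative Bonferroni correction is an assumption that may differ from what the package computes internally.

```r
# Hypothetical helper: decide whether a node is split, given one p-value
# per input variable.
should_split <- function(pvalues, mincriterion = 0.95, bonferroni = TRUE) {
  if (bonferroni) pvalues <- pmin(1, pvalues * length(pvalues))
  criterion <- 1 - pvalues  # larger values mean stronger evidence
  list(split = max(criterion) > mincriterion,
       variable = names(which.max(criterion)),
       criterion = max(criterion))
}

# Illustrative p-values, not computed from real data
pvals <- c(Solar.R = 0.2, Wind = 0.004, Temp = 0.0001)
res <- should_split(pvals)
res$split     # TRUE: the best criterion exceeds 0.95
res$variable  # "Temp"

# The adjustment can change the decision near the threshold
borderline <- c(a = 0.03, b = 0.5)
adj <- should_split(borderline)                      # adjusted p = 0.06
raw <- should_split(borderline, bonferroni = FALSE)  # unadjusted p = 0.03
```

Raising `mincriterion` makes the stopping rule stricter and therefore tends to produce smaller trees.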

By default, the scores for each ordinal factor `x` are `1:length(x)`; this may be changed using `scores = list(x = c(1,5,6))`, for example.
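What attaching scores means can be sketched in base R. The sketch assumes one score per factor level, and the factor and its level names are hypothetical; it only shows the mapping from levels to numeric values, not how the package uses the scores internally.

```r
# An ordered factor with three levels (hypothetical example data)
x <- ordered(c("low", "high", "mid", "low"),
             levels = c("low", "mid", "high"))

# Default scores: equally spaced integers, one per level
default_scores <- seq_along(levels(x))
default_values <- default_scores[as.integer(x)]

# Custom scores as in scores = list(x = c(1, 5, 6)):
# the distance from "low" to "mid" is now larger than from "mid" to "high"
custom_scores <- c(1, 5, 6)
custom_values <- custom_scores[as.integer(x)]

data.frame(x, default_values, custom_values)
```

Unequal scores are useful when the levels of an ordinal variable are not equally far apart on the underlying scale.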

For a general description of the methodology see Hothorn, Hornik and Zeileis (2004).

##### Value

- An object of class `BinaryTree`.

##### References

Torsten Hothorn, Kurt Hornik and Achim Zeileis (2004). Unbiased Recursive
Partitioning: A Conditional Inference Framework. Technical Report Nr. 8,
Research Report Series / Department of Statistics and Mathematics, WU Wien.

Helmut Strasser and Christian Weber (1999). On the asymptotic theory of permutation
statistics. *Mathematical Methods of Statistics*, **8**, 220–250.

##### Examples

```
library(party)

### regression
data(airquality)
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))
airct
plot(airct)
### mean squared error on the learning sample
mean((airq$Ozone - predict(airct))^2)

### classification
irisct <- ctree(Species ~ ., data = iris)
irisct
plot(irisct)
### confusion matrix on the learning sample
table(predict(irisct), iris$Species)
### estimated class probabilities, a list
tr <- treeresponse(irisct, newdata = iris[1:10, ])

### ordinal regression
data(mammoexp)
mammoct <- ctree(ME ~ ., data = mammoexp)
plot(mammoct)
### estimated class probabilities
treeresponse(mammoct, newdata = mammoexp[1:10, ])

### survival analysis (the GBSG2 data come from package ipred)
if (require(ipred)) {
    data(GBSG2, package = "ipred")
    GBSG2ct <- ctree(Surv(time, cens) ~ ., data = GBSG2)
    plot(GBSG2ct)
    treeresponse(GBSG2ct, newdata = GBSG2[1:2, ])
}
```

*Documentation reproduced from package party, version 0.2-1, License: GPL*