# Control Forest Hyper Parameters

##### Control for Conditional Tree Forests

Various parameters that control aspects of the `cforest` fit via its `control` argument.

- Keywords
- misc

##### Usage

```
cforest_unbiased(...)
cforest_classical(...)
cforest_control(teststat = "max",
                testtype = "Teststatistic",
                mincriterion = qnorm(0.9),
                savesplitstats = FALSE,
                ntree = 500, mtry = 5, replace = TRUE,
                fraction = 0.632, trace = FALSE, ...)
```

##### Arguments

- teststat
a character specifying the type of the test statistic to be applied.

- testtype
a character specifying how to compute the distribution of the test statistic.

- mincriterion
the value of the test statistic (for `testtype == "Teststatistic"`), or 1 - p-value (for other values of `testtype`) that must be exceeded in order to implement a split.

- mtry
number of input variables randomly sampled as candidates at each node for random forest like algorithms. Bagging, as special case of a random forest without random input variable sampling, can be performed by setting `mtry` either equal to `NULL` or manually equal to the number of input variables.

- savesplitstats
a logical determining whether the process of standardized two-sample statistics for split point estimate is saved for each primary split.

- ntree
number of trees to grow in a forest.

- replace
a logical indicating whether sampling of observations is done with or without replacement.

- fraction
fraction of number of observations to draw without replacement (only relevant if `replace = FALSE`).

- trace
a logical indicating if a progress bar shall be printed while the forest grows.

- ...
additional arguments to be passed to `ctree_control`.
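The arguments above might be combined as in the following minimal sketch. It assumes the party package is installed; the built-in `iris` data and the particular values of `ntree` and `mtry` are illustrative choices only, not recommendations from this page.

```r
# Sketch only: runs the party calls when the package is available.
if (requireNamespace("party", quietly = TRUE)) {
  library(party)

  # Settings of cforest_unbiased(), with a smaller forest and an mtry
  # chosen for the 4 input variables in iris (illustrative values).
  ctrl <- cforest_unbiased(ntree = 25, mtry = 2)

  # Pass the control object to cforest via its control argument.
  fit <- cforest(Species ~ ., data = iris, control = ctrl)
}
```

The same pattern applies to `cforest_classical()` and `cforest_control()`; only the defaults they start from differ.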

##### Details

All three functions return an object of class `ForestControl-class` defining hyper parameters to be specified via the `control` argument of `cforest`.

The arguments `teststat`, `testtype` and `mincriterion` determine how the global null hypothesis of independence between all input variables and the response is tested (see `ctree`). The argument `nresample` is the number of Monte-Carlo replications to be used when `testtype = "MonteCarlo"`.
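To make the two scales of `mincriterion` concrete, here is a small base R sketch. The 5% level and the helper `would_split()` are hypothetical illustrations and not part of the package.

```r
# Default threshold, interpreted on the test-statistic scale
# (testtype = "Teststatistic"): the 90% standard normal quantile.
stat_threshold <- qnorm(0.9)
print(round(stat_threshold, 4))   # about 1.2816

# For other values of testtype, mincriterion is compared with 1 - p-value.
# A hypothetical 5% significance level corresponds to:
alpha <- 0.05
pvalue_threshold <- 1 - alpha

# Hypothetical helper: a split is implemented only if the criterion
# exceeds mincriterion.
would_split <- function(criterion, mincriterion) criterion > mincriterion

would_split(1 - 0.01, pvalue_threshold)  # p = 0.01: criterion 0.99 exceeds 0.95
would_split(1 - 0.20, pvalue_threshold)  # p = 0.20: criterion 0.80 does not
```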

A split is established when the sum of the weights in both daughter nodes is larger than `minsplit`; this avoids pathological splits at the borders. When `stump = TRUE`, a tree with at most two terminal nodes is computed.

The `mtry` argument regulates a random selection of `mtry` input variables in each node. Note that here `mtry` is fixed to the value 5 by default for merely technical reasons, while in `randomForest` the default values for classification and regression vary with the number of input variables. Make sure that `mtry` is defined properly before using `cforest`.
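One way to choose `mtry` from the data rather than relying on the fixed default of 5 is to mirror the defaults of `randomForest` (square root of the number of inputs for classification, one third for regression). The helper `rf_style_mtry()` below is a hypothetical name introduced for illustration; it is not part of party.

```r
# Hypothetical helper mirroring the randomForest defaults for mtry.
# p: number of input variables; classification: TRUE for a factor response.
rf_style_mtry <- function(p, classification = TRUE) {
  if (classification) floor(sqrt(p)) else max(floor(p / 3), 1)
}

p <- ncol(iris) - 1                                  # 4 input variables
mtry_class <- rf_style_mtry(p)                       # classification: 2
mtry_reg   <- rf_style_mtry(p, classification = FALSE)  # regression: 1

# Bagging corresponds to using all input variables at every node.
mtry_bagging <- p
```

The resulting value could then be supplied as `mtry` to `cforest_unbiased()`, `cforest_classical()` or `cforest_control()`.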

It might be informative to look at scatterplots of input variables against the standardized two-sample split statistics; these are available when `savesplitstats = TRUE`. Each node is then associated with a vector whose length is determined by the number of observations in the learning sample, and thus much more memory is required.

The number of trees `ntree` can be increased for large numbers of input variables.

Function `cforest_unbiased` returns the settings suggested for the construction of unbiased random forests (`teststat = "quad"`, `testtype = "Univ"`, `replace = FALSE`) by Strobl et al. (2007) and is the default since version 0.9-90. Hyper parameter settings mimicking the behaviour of `randomForest` are available in `cforest_classical`, which have been used as default up to version 0.9-14.

Please note that `cforest`, in contrast to `randomForest`, doesn't grow trees of maximal depth. To grow large trees, set `mincriterion = 0`.
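A sketch of requesting large trees, assuming the party package is installed. It calls `cforest_control()` directly with the settings listed above for unbiased forests; the `iris` data, `ntree = 10` and `mtry = 2` are illustrative values only.

```r
# Sketch only: runs the party calls when the package is available.
if (requireNamespace("party", quietly = TRUE)) {
  library(party)

  # mincriterion = 0 removes the stopping threshold on the split criterion,
  # so trees are grown larger than under the default threshold.
  deep_ctrl <- cforest_control(teststat = "quad", testtype = "Univ",
                               mincriterion = 0, replace = FALSE,
                               ntree = 10, mtry = 2)

  deep_fit <- cforest(Species ~ ., data = iris, control = deep_ctrl)
}
```

Tree size is still limited by the remaining `ctree_control` settings passed through `...`, such as `minsplit`.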

##### Value

An object of class `ForestControl-class`.

##### References

Carolin Strobl, Anne-Laure Boulesteix, Achim Zeileis and Torsten Hothorn (2007).
Bias in Random Forest Variable Importance Measures: Illustrations, Sources and
a Solution. *BMC Bioinformatics*, **8**, 25. DOI: 10.1186/1471-2105-8-25

*Documentation reproduced from package party, version 1.3-3, License: GPL-2*