Control for Conditional Inference Trees
Various parameters that control aspects of the `ctree' fit.
ctree_control(teststat = c("quad", "max"), testtype = c("Bonferroni", "MonteCarlo", "Univariate", "Teststatistic"), mincriterion = 0.95, minsplit = 20, minbucket = 7, stump = FALSE, nresample = 9999, maxsurrogate = 0, mtry = 0, savesplitstats = TRUE, maxdepth = 0, remove_weights = FALSE)
a character specifying the type of the test statistic to be applied.
a character specifying how to compute the distribution of the test statistic.
the value of the test statistic (for
testtype == "Teststatistic"), or 1 - p-value (for other values of
testtype) that must be exceeded in order to implement a split.
the minimum sum of weights in a node in order to be considered for splitting.
the minimum sum of weights in a terminal node.
a logical determining whether a stump (a tree with three nodes only) is to be computed.
number of Monte-Carlo replications to use when the distribution of the test statistic is simulated.
number of surrogate splits to evaluate. Note that currently only surrogate splits in ordered covariables are implemented.
number of input variables randomly sampled as candidates at each node for random forest like algorithms. The default
mtry = 0 means that no random selection takes place.
a logical determining whether the standardized two-sample statistics used for split point estimation are saved for each primary split.
maximum depth of the tree. The default
maxdepth = 0 means that no restrictions are applied to tree sizes.
a logical determining if weights attached to nodes shall be removed after fitting the tree.
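As a minimal sketch of how these settings might be combined, the following builds a control object and passes it to a tree fit. It assumes the party package is installed and that the fit is requested through the `controls` argument of `ctree`; the `iris` data set and the use of `where()` to count terminal nodes are illustrative choices, not part of this page.

```r
library(party)

# Spell out a few settings explicitly; apart from stump, the values are
# the defaults shown in the usage line.
ctrl <- ctree_control(
  teststat     = "quad",
  testtype     = "Bonferroni",
  mincriterion = 0.95,
  minsplit     = 20,
  minbucket    = 7,
  stump        = TRUE
)

# iris is only a convenient built-in data set (an assumption of this sketch).
fit <- ctree(Species ~ ., data = iris, controls = ctrl)

# where() gives the terminal node id of every observation, so the number
# of distinct ids is the number of terminal nodes.
n_terminal <- length(unique(where(fit)))
```

With stump = TRUE the fitted tree should have at most two terminal nodes, so `n_terminal` is expected to be 1 or 2.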
determine how the global null hypothesis of independence between all input
variables and the response is tested (see
nresample is the number of Monte-Carlo replications to be used when
testtype = "MonteCarlo".
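The role of mincriterion as a threshold on 1 - p-value can be illustrated in base R. This is only a sketch of the decision rule: the p-values are made-up numbers, and the plain Bonferroni correction from `p.adjust` is an assumption that may differ in detail from the adjustment used inside the package.

```r
# Hypothetical raw p-values, one per input variable (illustrative only).
p_raw <- c(x1 = 0.001, x2 = 0.20, x3 = 0.04)

# Assumption: a plain Bonferroni correction; the package's own adjustment
# may not be identical.
p_adj <- p.adjust(p_raw, method = "bonferroni")

# The criterion compared against mincriterion is 1 - p-value.
criterion    <- 1 - p_adj
mincriterion <- 0.95

# A split is only implemented if the criterion exceeds mincriterion.
split_allowed <- max(criterion) > mincriterion
best_variable <- names(which.max(criterion))
```

Here only `x1` clears the threshold after adjustment, so a split would be allowed and `x1` would be the variable with the strongest association.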
A split is established when the sum of the weights in both daughter nodes
is larger than
minsplit; this avoids pathological splits at the
stump = TRUE, a tree with at most two terminal nodes is computed.
mtry > 0 means that a random forest like `variable
selection', i.e., a random selection of
mtry input variables, is
performed in each node.
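The random selection of input variables can be sketched in base R as follows. The helper `pick_candidates` is hypothetical and only mimics the described behaviour; it is not the package's internal code.

```r
# Hypothetical helper: choose the candidate input variables for one node.
# mtry = 0 keeps every input; mtry > 0 draws a random subset of that size.
pick_candidates <- function(inputs, mtry = 0) {
  if (mtry <= 0) {
    return(inputs)
  }
  sample(inputs, size = min(mtry, length(inputs)))
}

set.seed(1)
inputs <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

all_inputs <- pick_candidates(inputs, mtry = 0)  # no random selection
candidates <- pick_candidates(inputs, mtry = 2)  # two inputs drawn at random
```

A fresh draw would be made in every node, so different nodes of the same tree can see different candidate variables.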
It might be informative to look at scatterplots of input variables against
the standardized two-sample split statistics; these are available when
savesplitstats = TRUE. Each node is then associated with a vector
whose length is determined by the number of observations in the learning
sample and thus much more memory is required.
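A scatterplot of this kind could be drawn roughly as follows. The sketch assumes the party package is installed, that the root node of the fitted tree can be retrieved with `nodes()`, and that the saved statistics sit in the `splitstatistic` element of its primary split; the `iris` data set is an illustrative choice.

```r
library(party)

# Ask for the split statistics to be kept (this is also the default).
ctrl <- ctree_control(savesplitstats = TRUE)
fit  <- ctree(Species ~ ., data = iris, controls = ctrl)

# Assumption: nodes(fit, 1) returns a list holding the root node, whose
# primary split stores the statistics and the name of the split variable.
root  <- nodes(fit, 1)[[1]]
stat  <- root$psplit$splitstatistic
xname <- root$psplit$variableName
x     <- iris[[xname]]

# Plot the statistic against the split variable, keeping only positions
# where a statistic was actually evaluated (values greater than zero).
keep <- stat > 0
plot(x[keep], stat[keep], xlab = xname, ylab = "Split statistic")
```

The vector `stat` has one entry per observation in the learning sample, which is why storing it for every node increases memory use.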
An object of class