Usage
rpart.control(minsplit = 20, minbucket = round(minsplit/3), cp = 0.01,
maxcompete = 4, maxsurrogate = 5, usesurrogate = 2, xval = 10,
surrogatestyle = 0, maxdepth = 30, ...)
Arguments
minsplit
the minimum number of observations that must exist in a node in order for
a split to be attempted.
minbucket
the minimum number of observations in any terminal
node.
If only one of minbucket or minsplit is specified, the code either sets
minsplit to minbucket*3 or minbucket to minsplit/3, as appropriate.
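
The defaulting rule can be checked directly on the returned control list. This is a minimal sketch, assuming the rpart package is installed; the values 30 and 5 are arbitrary illustrations.

```r
library(rpart)

# Only minsplit is given, so minbucket falls back to round(minsplit / 3).
ctl_a <- rpart.control(minsplit = 30)
ctl_a$minbucket

# Only minbucket is given, so minsplit is set to minbucket * 3.
ctl_b <- rpart.control(minbucket = 5)
ctl_b$minsplit
```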
cp
complexity parameter. Any split that does not decrease the overall
lack of fit by a factor of cp is not attempted. For instance, with anova
splitting, this means that the overall R-squared must increase by cp
at each step.
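
To illustrate the effect of cp, the sketch below fits the same regression tree twice with different thresholds. The built-in mtcars data and the two cp values are assumptions chosen only for illustration; the exact tree sizes depend on the data.

```r
library(rpart)

# Same anova problem, two complexity thresholds (illustrative values).
fit_strict <- rpart(mpg ~ ., data = mtcars, method = "anova",
                    control = rpart.control(cp = 0.10))
fit_loose  <- rpart(mpg ~ ., data = mtcars, method = "anova",
                    control = rpart.control(cp = 0.001))

# Each row of the frame is one node of the fitted tree.
n_strict <- nrow(fit_strict$frame)
n_loose  <- nrow(fit_loose$frame)
c(strict = n_strict, loose = n_loose)
```

A larger cp stops splitting earlier, so the strict fit should not have more nodes than the loose one.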
maxcompete
the number of competitor splits retained in the output. It is useful to
know not just which split was chosen, but which variable came in second,
third, etc.
maxsurrogate
the number of surrogate splits retained in the output. If this is set to
zero the compute time will be reduced, since approximately half of the
computational time (other than setup) is used in the search for surrogate
splits.
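
As a rough sketch of switching off the surrogate search, the example below uses the kyphosis data shipped with rpart; the formula is an assumption for illustration. The nsurrogate column of the fitted frame records how many surrogates were kept at each node.

```r
library(rpart)

# Fit a classification tree without searching for surrogate splits.
fit_nosurr <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                    control = rpart.control(maxsurrogate = 0))

# No node should have any surrogate splits recorded.
no_surrogates <- all(fit_nosurr$frame$nsurrogate == 0)
no_surrogates
```

How much time this saves in practice will depend on the size of the data and the number of predictors.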
usesurrogate
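how to use surrogates in the splitting process. 0 means display only;
an observation with a missing value for the primary split rule is not
sent further down the tree. 1 means use surrogates, in order, to split
subjects missing the primary variable; if all surrogates are missing the
observation is not split. For value 2, if all surrogates are missing,
then send the observation in the majority direction.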
xval
number of cross-validations.
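
One way to see what xval controls is to compare the complexity table with and without cross-validation. This is a sketch on the built-in mtcars data (an assumption for illustration); the seed only makes the random fold assignment repeatable.

```r
library(rpart)

set.seed(1)  # cross-validation folds are assigned at random

# Default-style fit with 10-fold cross-validation.
fit_xval <- rpart(mpg ~ ., data = mtcars, method = "anova",
                  control = rpart.control(xval = 10))

# No cross-validation: faster, but no cross-validated error estimates.
fit_noxval <- rpart(mpg ~ ., data = mtcars, method = "anova",
                    control = rpart.control(xval = 0))

colnames(fit_xval$cptable)
colnames(fit_noxval$cptable)
```

With xval = 0 the cross-validated error columns are expected to be absent from the table, so pruning based on xerror is not available for that fit.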
surrogatestyle
controls the selection of a best surrogate. If set to 0 (default) the
program uses the total number of correct classifications for a potential
surrogate variable; if set to 1 it uses the percent correct, calculated
over the non-missing values of the surrogate.
maxdepth
Set the maximum depth of any node of the final tree, with the root
node counted as depth 0. With values greater than 30, rpart will give
nonsense results on 32-bit machines.
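
The depth limit can be verified from the node numbers of a fitted tree. The sketch below assumes the built-in mtcars data and sets cp = 0 so that only maxdepth (together with minsplit and minbucket) limits growth. In rpart's numbering, the children of node k are 2k and 2k + 1, so node k sits at depth floor(log2(k)).

```r
library(rpart)

# Grow a tree that is at most two levels below the root.
fit_shallow <- rpart(mpg ~ ., data = mtcars, method = "anova",
                     control = rpart.control(maxdepth = 2, cp = 0))

# Node numbers are stored as the row names of the frame.
node_ids <- as.numeric(rownames(fit_shallow$frame))
depths <- floor(log2(node_ids))
max(depths)
```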
...
mop up other arguments.