Uses `xval`-fold cross-validation of a sequence of trees to derive
estimates of the mean squared error and Somers' `Dxy` rank correlation
between predicted and observed responses. In the case of a binary response
variable, the mean squared error is the Brier accuracy score. For
survival trees, `Dxy` is negated so that larger is better.
There are `print` and `plot` methods for
objects created by `validate.rpart`.
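To make the two accuracy measures concrete, here is a minimal base-R sketch that computes the Brier score and Somers' `Dxy` for a binary response. The vectors `p` and `y` and the helper `concordance` are hypothetical illustrations, not part of the package, and this is not the code `validate.rpart` uses internally.

```r
# Hypothetical toy data: predicted probabilities and observed 0/1 outcomes
p <- c(0.1, 0.4, 0.35, 0.8)
y <- c(0, 0, 1, 1)

# Brier score: mean squared error between predicted probability and outcome
brier <- mean((p - y)^2)

# Concordance probability C over all pairs with differing outcomes
# (ties in the predictions count one half)
concordance <- function(p, y) {
  d <- outer(p[y == 1], p[y == 0], "-")
  (sum(d > 0) + 0.5 * sum(d == 0)) / length(d)
}

# For a binary response, Somers' Dxy is a rescaling of C to [-1, 1]
dxy <- 2 * (concordance(p, y) - 0.5)

brier
dxy
```

A `Dxy` of 0 corresponds to random predictions (C = 0.5) and 1 to perfect discrimination.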

```
# f <- rpart(formula=y ~ x1 + x2 + ...) # or rpart
# S3 method for rpart
validate(fit, method, B, bw, rule, type, sls, aics,
         force, estimates, pr=TRUE,
         k, rand, xval=10, FUN, ...)
# S3 method for validate.rpart
print(x, ...)
# S3 method for validate.rpart
plot(x, what=c("mse","dxy"), legendloc=locator, ...)
```

fit

an object created by `rpart`. You must have specified the
`model=TRUE` argument to `rpart`.

method,B,bw,rule,type,sls,aics,force,estimates

are there only for consistency with the generic `validate`
function; these are ignored

x

the result of `validate.rpart`

k

a sequence of cost/complexity values. By default these are obtained
from calling `FUN` with no optional arguments or
from the `rpart` `cptable` object in the original fit object.
You may also specify a scalar or vector.

rand

a random sample (usually omitted)

xval

number of cross-validation splits (folds)

FUN

the name of a function which produces a sequence of trees, such as
`prune`.

...

additional arguments to `FUN` (ignored by `print` and `plot`).

pr

set to `FALSE` to prevent intermediate results for each `k`
from being printed

what

a vector of things to plot. By default, 2 plots will be done, one for
`mse` and one for `dxy`.

legendloc

a function that is evaluated with a single argument equal to `1` to
generate a list with components `x, y` specifying coordinates of the
upper left corner of a legend, or a 2-vector. For the latter,
`legendloc` specifies the relative fraction of the plot at which to
center the legend.
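As an illustration of the `k` and `FUN` arguments described above, the following sketch pulls complexity values from the `cptable` of a fitted tree and prunes the tree once per value with `prune`. It assumes the `CP` column of `cptable` is the relevant source of complexity values; the simulated data and the names `d`, `k`, and `trees` are hypothetical, and this is not necessarily how `validate.rpart` builds its sequence internally.

```r
library(rpart)

# Hypothetical simulated data with a binary 0/1 response
set.seed(1)
n <- 100
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$y <- 1 * (d$x1 + d$x2 + rnorm(n) > 1)

f <- rpart(y ~ x1 + x2, data = d, model = TRUE)

# Complexity values recorded while growing the tree (assumed source for k)
k <- f$cptable[, "CP"]

# One pruned tree per complexity value: a "sequence of trees"
trees <- lapply(k, function(cp) prune(f, cp = cp))

# Number of nodes in each tree of the sequence
sapply(trees, function(tr) nrow(tr$frame))
```

A user-supplied scalar or vector for `k` would take the place of the `cptable` values in a sketch like this.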

Returns a list of class `"validate.rpart"` with components named
`k, size, dxy.app, dxy.val, mse.app, mse.val, binary, xval`.
`size` is the number of nodes, `dxy` refers to Somers' `Dxy`,
`mse` refers to mean squared error of prediction,
`app` means apparent accuracy on training samples, `val` means validated
accuracy on test samples, and `binary` is a logical variable indicating whether
or not the response variable was binary (a logical or 0/1 variable is
binary). `size` will not be present if the user specifies `k`.
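The distinction between apparent (`app`) and validated (`val`) accuracy can be sketched by hand for a single tree. The example below is a rough illustration under stated assumptions: the simulated data, the random fold assignment, and the variable names are hypothetical, and the loop is not the package's own implementation.

```r
library(rpart)

# Hypothetical simulated data with a binary 0/1 response
set.seed(1)
n <- 100
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$y <- 1 * (d$x1 + d$x2 + rnorm(n) > 1)

xval <- 10
fold <- sample(rep(seq_len(xval), length.out = n))  # random fold labels

# Apparent MSE: the tree is fitted and evaluated on the same observations
f <- rpart(y ~ x1 + x2, data = d)
mse_apparent <- mean((predict(f, d) - d$y)^2)

# Validated MSE: each fold is predicted by a tree fitted to the other folds
pred <- numeric(n)
for (i in seq_len(xval)) {
  test <- fold == i
  fi <- rpart(y ~ x1 + x2, data = d[!test, ])
  pred[test] <- predict(fi, d[test, ])
}
mse_validated <- mean((pred - d$y)^2)

c(apparent = mse_apparent, validated = mse_validated)
```

Because the response is 0/1, both quantities here are Brier scores; the apparent value is typically the more optimistic of the two.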

Side effect: prints intermediate results for each `k` if `pr=TRUE`

    # NOT RUN {
    n <- 100
    set.seed(1)
    x1 <- runif(n)
    x2 <- runif(n)
    x3 <- runif(n)
    y <- 1*(x1+x2+rnorm(n) > 1)
    table(y)
    require(rpart)
    f <- rpart(y ~ x1 + x2 + x3, model=TRUE)
    v <- validate(f)
    v    # note the poor validation
    par(mfrow=c(1,2))
    plot(v, legendloc=c(.2,.5))
    par(mfrow=c(1,1))
    # }