# validate.rpart

##### Dxy and Mean Squared Error by Cross-validating a Tree Sequence

Uses `xval`-fold cross-validation of a sequence of trees to derive
estimates of the mean squared error and Somers' `Dxy` rank correlation
between predicted and observed responses. In the case of a binary response
variable, the mean squared error is the Brier accuracy score. For
survival trees, `Dxy` is negated so that larger is better.
There are `print` and `plot` methods for
objects created by `validate.rpart`.
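The two accuracy measures can be illustrated in base R. The following is a minimal sketch, not the code used by `validate.rpart`: the helper names `brier_score` and `somers_dxy` and the toy data are assumptions made for illustration. It computes the mean squared error of predicted probabilities against a 0/1 response (the Brier score) and Somers' `Dxy` as the difference between the proportions of concordant and discordant pairs among pairs with different observed responses.

```r
# Hypothetical helpers for illustration only; they are not part of rms.

# Mean squared error of predictions; for a 0/1 response this is the Brier score
brier_score <- function(p, y) mean((p - y)^2)

# Somers' Dxy: (concordant - discordant) / (pairs whose observed responses differ)
somers_dxy <- function(p, y) {
  d_p <- outer(p, p, "-")   # pairwise differences in predictions
  d_y <- outer(y, y, "-")   # pairwise differences in observed responses
  usable <- d_y != 0        # keep only pairs with different responses
  concordant <- sum(sign(d_p[usable]) ==  sign(d_y[usable]))
  discordant <- sum(sign(d_p[usable]) == -sign(d_y[usable]))
  (concordant - discordant) / sum(usable)
}

y <- c(0, 0, 1, 1)            # toy binary response
p <- c(0.1, 0.4, 0.35, 0.8)   # toy predicted probabilities

brier_score(p, y)   # 0.158125
somers_dxy(p, y)    # 0.5
```

Pairs tied on the prediction count as neither concordant nor discordant, so they pull `Dxy` toward zero.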

##### Usage

```
# f <- rpart(formula=y ~ x1 + x2 + ...)
# S3 method for rpart
validate(fit, method, B, bw, rule, type, sls, aics,
         force, estimates, pr=TRUE,
         k, rand, xval=10, FUN, ...)
# S3 method for validate.rpart
print(x, ...)
# S3 method for validate.rpart
plot(x, what=c("mse","dxy"), legendloc=locator, ...)
```

##### Arguments

- fit
an object created by `rpart`. You must have specified the `model=TRUE` argument to `rpart`.

- method,B,bw,rule,type,sls,aics,force,estimates
are there only for consistency with the generic `validate` function; these are ignored

- x
the result of `validate.rpart`

- k
a sequence of cost/complexity values. By default these are obtained from calling `FUN` with no optional arguments or from the `rpart` `cptable` object in the original fit object. You may also specify a scalar or vector.

- rand
a random sample (usually omitted)

- xval
number of splits (folds) used in the cross-validation

- FUN
the name of a function which produces a sequence of trees, such as `prune`.

- ...
additional arguments to `FUN` (ignored by `print` and `plot`).

- pr
set to `FALSE` to prevent intermediate results for each `k` from being printed

- what
a vector of things to plot. By default, 2 plots will be done, one for `mse` and one for `Dxy`.

- legendloc
a function that is evaluated with a single argument equal to `1` to generate a list with components `x, y` specifying coordinates of the upper left corner of a legend, or a 2-vector. For the latter, `legendloc` specifies the relative fraction of the plot at which to center the legend.
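To make the roles of `k` and `xval` concrete, here is a rough sketch of `xval`-fold cross-validation over a cost/complexity sequence, written directly against `rpart` and `prune`. It is an illustration under assumptions, not the implementation in `rms`: the data, the variable names (`fold`, `mse_val`), and the choice to pool squared errors over all held-out observations are made up for this example.

```r
library(rpart)

set.seed(1)
n <- 100
d <- data.frame(x1 = runif(n), x2 = runif(n))
d$y <- 1 * (d$x1 + d$x2 + rnorm(n) > 1)          # 0/1 response

xval <- 10                                        # number of folds
fold <- sample(rep(1:xval, length.out = n))       # random fold label per row

full <- rpart(y ~ x1 + x2, data = d)
k <- full$cptable[, "CP"]                         # cost/complexity sequence of the full fit

# For each complexity value, refit on the training folds, prune to that
# value, and score the held-out fold by squared prediction error
mse_val <- sapply(k, function(cp) {
  sq_err <- numeric(n)
  for (i in 1:xval) {
    test <- fold == i
    fit_i <- prune(rpart(y ~ x1 + x2, data = d[!test, ]), cp = cp)
    sq_err[test] <- (predict(fit_i, d[test, ]) - d$y[test])^2
  }
  mean(sq_err)
})

data.frame(k = k, mse_val = mse_val)
```

Because the response is 0/1, each `mse_val` entry is a cross-validated Brier score for the tree pruned at the corresponding `k`.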

##### Value

a list of class `"validate.rpart"` with components named
`k, size, dxy.app, dxy.val, mse.app, mse.val, binary, xval`.
`size` is the number of nodes, `dxy` refers to Somers' `D`,
`mse` refers to mean squared error of prediction,
`app` means apparent accuracy on training samples, `val` means validated
accuracy on test samples, `binary` is a logical variable indicating whether
or not the response variable was binary (a logical or 0/1 variable is
binary). `size` will not be present if the user specifies `k`.

##### Side Effects

prints if `pr=TRUE`

##### Examples

```
# NOT RUN {
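library(rms)     # provides validate() and its method for rpart fits
library(rpart)
n <- 100
set.seed(1)
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)
y <- 1*(x1+x2+rnorm(n) > 1)   # binary response; x3 is unrelated to y
table(y)
f <- rpart(y ~ x1 + x2 + x3, model=TRUE)   # model=TRUE is required by validate
v <- validate(f)
v # note the poor validation
par(mfrow=c(1,2))   # side-by-side panels, one per plotted measure
plot(v, legendloc=c(.2,.5))
par(mfrow=c(1,1))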
# }
```

*Documentation reproduced from package rms, version 5.1-4, License: GPL (>= 2)*