# Predict

##### Compute Predicted Values and Confidence Limits

`Predict`

allows the user to easily specify which predictors are to
vary. When the vector of values over which a predictor should vary is a
single value that is `NA`

(usually specified as a single dot), the
range will be all levels of a categorical predictor or equally-spaced
points between the `datadist`

`"Low:prediction"`

and
`"High:prediction"`

values for the variable (`datadist`

by
default uses the 10th smallest and 10th largest predictor values in the
dataset). Predicted values are
the linear predictor (X beta), a user-specified transformation of that
scale, or estimated probability of surviving past a fixed single time
point given the linear predictor. `Predict`

is usually used for
plotting predicted values but there is also a `print`

method.

There is a `plot`

method for `Predict`

objects that makes it
easy to show predicted values and confidence bands.

The `rbind`

method for `Predict`

objects allows you to create
separate sets of predictions under different situations and to combine
them into one set for feeding to `plot.Predict`

. For example you
might want to plot confidence intervals for means and for individuals
using `ols`

, and have the two types of confidence bands be
superposed onto one plot or placed into two panels. Another use for
`rbind`

is to combine predictions from quantile regression models
that predicted three different quantiles.

- Keywords
- models

##### Usage

```
Predict(x, ..., fun,
type = c("predictions", "model.frame", "x"),
np = 200, conf.int = 0.95,
conf.type = c("mean", "individual"),
adj.zero = FALSE, ref.zero = FALSE,
non.slopes, time = NULL, loglog = FALSE, digits=4, name)
```## S3 method for class 'Predict':
print(x, \dots)

## S3 method for class 'Predict':
rbind(\dots)

##### Arguments

- x
- an
`rms`

fit object, or for`print`

the result of`Predict`

.`options(datadist="d")`

must have been specified (where`d`

was created by`datadist`

), or it must have been in effect when - ...
- One or more variables to vary, or single-valued adjustment values.
Specify
`x=.`

to use the default display range, or any range you choose (e.g.`seq(0,100,by=2),c(2,3,7,14)`

). The default list of values for which predictions are - fun
- an optional transformation of the linear predictor
- type
- defaults to providing predictions. Set to
`"model.frame"`

to return a data frame of predictor settings used. Set to`"x"`

to return the corresponding design matrix constructed from the predictor settings. - np
- the number of equally-spaced points computed for continuous
predictors that vary, i.e., when the specified value is
`.`

or`NA`

- conf.int
- confidence level. Default is 0.95. Specify
`FALSE`

to suppress. - conf.type
- type of confidence interval. Default is
`"mean"`

which applies to all models. For models containing a residual variance (e.g,`ols`

), you can specify`conf.type="individual"`

instead, to obtain limits on the predicted - adj.zero
- Set to
`TRUE`

to adjust all non-plotted variables to 0 (or reference cell for categorical variables) and to omit intercept(s) from consideration. Default is`FALSE`

. - ref.zero
- Set to
`TRUE`

to subtract a constant from $X\beta$ before plotting so that the reference value of the`x`

-variable yields`y=0`

. This is done before applying function`fun`

. - non.slopes
- This is only useful in a multiple intercept model such as the ordinal
logistic model. There to use to second of three intercepts, for example,
specify
`non.slopes=c(0,1,0)`

. The default is`non.slopes=rep(0,k)`

if`adj.zero=T`

- time
- Specify a single time
`u`

to cause function`survest`

to be invoked to plot the probability of surviving until time`u`

when the fit is from`cph`

or`psm`

. - loglog
- Specify
`loglog=TRUE`

to plot`log[-log(survival)]`

instead of survival, when`time`

is given. - digits
- Controls how ``adjust-to'' values are plotted. The default is 4 significant digits.
- name
- Instead of specifying the variables to vary in the
`variables`

(...) list, you can specify one or more variables by specifying a vector of character string variable names in the`name`

argument. Using this mode you cannot specify

##### Details

When there are no intercepts in the fitted model, plot subtracts adjustment values from each factor while computing variances for confidence limits.

Specifying `time`

will not work for Cox models with time-dependent
covariables. Use `survest`

or `survfit`

for that purpose.

##### Value

- a data frame containing all model predictors and the computed values
`yhat`

,`lower`

,`upper`

, the latter two if confidence intervals were requested. The data frame has an additional`class`

`"Predict"`

. If`name`

is specified or no predictors are specified in ..., the resulting data frame has an additional variable called`.predictor.`

specifying which predictor is currently being varied.`.predictor.`

is handy for use as a paneling variable in`lattice`

or`ggplot2`

graphics.

##### See Also

`datadist`

, `predictrms`

,
`contrast.rms`

, `summary.rms`

,
`rms`

, `rms.trans`

, `survest`

,
`survplot`

, `rmsMisc`

,
`smearingEst`

, `rbind`

##### Examples

```
n <- 1000 # define sample size
set.seed(17) # so can reproduce the results
age <- rnorm(n, 50, 10)
blood.pressure <- rnorm(n, 120, 15)
cholesterol <- rnorm(n, 200, 25)
sex <- factor(sample(c('female','male'), n,TRUE))
label(age) <- 'Age' # label is in Hmisc
label(cholesterol) <- 'Total Cholesterol'
label(blood.pressure) <- 'Systolic Blood Pressure'
label(sex) <- 'Sex'
units(cholesterol) <- 'mg/dl' # uses units.default in Hmisc
units(blood.pressure) <- 'mmHg'
# Specify population model for log odds that Y=1
L <- .4*(sex=='male') + .045*(age-50) +
(log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
# Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
y <- ifelse(runif(n) < plogis(L), 1, 0)
ddist <- datadist(age, blood.pressure, cholesterol, sex)
options(datadist='ddist')
fit <- lrm(y ~ blood.pressure + sex * (age + rcs(cholesterol,4)))
Predict(fit, age=., cholesterol=., np=4)
Predict(fit, age=seq(20,80,by=10), sex=., conf.int=FALSE)
Predict(fit, age=seq(20,80,by=10), sex='male') # works if datadist not used
ddist$limits$age[2] <- 30 # make 30 the reference value for age
# Could also do: ddist$limits["Adjust to","age"] <- 30
fit <- update(fit) # make new reference value take effect
Predict(fit, age=., ref.zero=TRUE, fun=exp)
# Make two curves, and plot the predicted curves as two trellis panels
w <- Predict(fit, age=., sex=.)
require(lattice)
xyplot(yhat ~ age | sex, data=w, type='l')
# To add confidence bands we need to use the Hmisc xYplot function in
# place of xyplot
xYplot(Cbind(yhat,lower,upper) ~ age | sex, data=w,
method='filled bands', type='l', col.fill=gray(.95))
# If non-displayed variables were in the model, add a subtitle to show
# their settings using title(sub=paste('Adjusted to',attr(w,'info')$adjust),adj=0)
# Easier: feed w into plot.Predict
# Predictions form a parametric survival model
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
t <- -log(runif(n))/h
label(t) <- 'Follow-up Time'
e <- ifelse(t<=cens,1,0)
t <- pmin(t, cens)
units(t) <- "Year"
ddist <- datadist(age, sex)
Srv <- Surv(t,e)
# Fit log-normal survival model and plot median survival time vs. age
f <- psm(Surv(t, e) ~ rcs(age), dist=if(.R.)'lognormal' else 'gaussian')
med <- Quantile(f) # Creates function to compute quantiles
# (median by default)
Predict(f, age=., fun=function(x)med(lp=x))
# Note: This works because med() expects the linear predictor (X*beta)
# as an argument. Would not work if use
# ref.zero=TRUE or adj.zero=TRUE.
# Also, confidence intervals from this method are approximate since
# they don't take into account estimation of scale parameter
# Fit an ols model to log(y) and plot the relationship between x1
# and the predicted mean(y) on the original scale without assuming
# normality of residuals; use the smearing estimator. Before doing
# that, show confidence intervals for mean and individual log(y)
set.seed(1)
x1 <- runif(300)
x2 <- runif(300)
ddist <- datadist(x1,x2); options(datadist='ddist')
y <- exp(x1+ x2 - 1 + rnorm(300))
f <- ols(log(y) ~ pol(x1,2) + x2)
p1 <- Predict(f, x1=., conf.type='mean')
p2 <- Predict(f, x1=., conf.type='individual')
p <- rbind(mean=p1, individual=p2)
plot(p, label.curve=FALSE) # uses superposition
plot(p, ~x1 | .set.) # 2 panels
r <- resid(f)
smean <- function(yhat)smearingEst(yhat, exp, res, statistic='mean')
formals(smean) <- list(yhat=numeric(0), res=r[!is.na(r)])
#smean$res <- r[!is.na(r)] # define default res argument to function
Predict(f, x1=., fun=smean)
options(datadist=NULL)
```

*Documentation reproduced from package rms, version 2.0-2, License: GPL (>= 2)*