predict: Prediction from a model fit.

Description

Prediction of the response variable by its expected value obtained as (the inverse link transformation of) the linear predictor ($\eta$) and more generally for terms of the form X[n]$\beta$+Z[n]v, for possibly new design matrices X[n] and Z[n].

Usage

## S3 method for class 'HLfit':
predict(object,newdata = newX, newX=NULL, re.form= NULL,
                 variances=list(fixef=FALSE, linPred=FALSE,dispVar=FALSE, 
                                resid=FALSE, sum=FALSE, cov=FALSE),
                 predVar=variances$linPred,residVar=variances$resid,
                 binding = FALSE,...)

Arguments

object

The return object of an HLfit or similar function.

newdata

Either NULL, a matrix or data frame containing all required variables for evaluating fixed and random effects (including an offset), or a numeric vector, whose names (if any) are ignored. If NULL, the original data are reused.

newX

equivalent to newdata, available for back-compatibility

re.form

formula for random effects to include. By default, it is NULL, in which case all random effects are included. If it is NA, no random effect is included. If it is a formula, only the random effects it contains are retained. The other variance components a

variances

A list whose elements control the computation of different estimated variances. fixef=TRUE will provide the variances of X$\beta$; linPred=TRUE will provide the variance of the linear predictor $\eta$ for given dispersion

predVar

(for back-compatibility: variances should now be used) predVar=TRUE corresponds to variances=list(linPred=TRUE), and predVar="Cov" corresponds to variances=list(linPred=TRUE,cov=TRUE).

residVar

(for back-compatibility: variances should now be used) residVar=TRUE corresponds to variances=list(resid=TRUE).

binding

If binding is a character string, the predicted values are bound with the newdata and the result is returned as a data frame. The predicted values column name is the given binding, or a name based on it, if the

...

further arguments passed to or from other methods.

Value

A matrix or data frame (according to the binding argument), optionally with one or more prediction variance vectors or (co)variance matrices as attributes. The further attribute fittedName contains the binding name, if any.

Details

If newdata is NULL, predict returns the fitted responses, including random effects, from the object. Otherwise it computes new predictions including random effects as far as possible. For spatial random effects, it constructs a correlation matrix C between the new locations and the locations in the original fit, then infers the random effects at the new locations as $C (L')^{-1} v$ (see spaMM for notation). For non-spatial random effects, it checks whether any group (i.e., level of a random effect) in the new data was represented in the original data, and it adds the inferred random effect for this group to the prediction for individuals in this group.

fixefVar is the (co)variance of X$\beta$ (or X[n]$\beta$), deduced from the asymptotic covariance matrix of the $\beta$ estimates.

predVar is the prediction (co)variance of $\eta$=X$\beta$+ZLv (see HLfit Details for notation), or more generally of X[n]$\beta$+Z[n]L[n]v, by default computed for given dispersion parameters. For levels of the random effects present in the original data, the predVar computation assumes that the covariance matrix of the $\beta$ and v estimates is the inverse of the expected Hessian matrix (for given dispersion parameters) of the augmented linear model for $\beta$ and v. It thus takes into account the joint uncertainty in the estimation of $\beta$ and the prediction of v. For new levels of the random effects, the predVar computation additionally takes into account the uncertainty in the prediction of v for these new levels. For prediction covariance with a new Z[n], it matters whether a single or multiple new levels are used: see Examples.

If variances$dispVar is TRUE, the prediction variance may also include a term accounting for uncertainty in $\phi$ and $\lambda$, computed following Booth and Hobert (1998, eq. 19). This computation is currently implemented for models with a single random effect, and ignores uncertainties in spatial correlation parameters.
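
The spatial step described above, where random effects at new locations are inferred as C (L')^{-1} v, can be illustrated with a minimal base R sketch. It does not reproduce spaMM's internal code. The coordinates, the exponential correlation function, the value of rho, the use of a Cholesky factor for L, and the simulated v are all assumptions made for illustration.

```r
## Illustrative sketch only: base R, no spaMM internals are reproduced here.
set.seed(1)

## Hypothetical coordinates: 5 locations from the original fit, 2 new ones
old_xy <- cbind(x = c(0, 1, 2, 0, 1), y = c(0, 0, 1, 2, 2))
new_xy <- cbind(x = c(0.5, 1.5), y = c(0.5, 1.5))

## Assumed correlation model: exponential decay with distance
rho <- 0.4
corr_fn <- function(d) exp(-rho * d)

## Euclidean distances between the rows of a and the rows of b
cross_dist <- function(a, b) {
  sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
}

Corr <- corr_fn(cross_dist(old_xy, old_xy))  # correlations among original locations
L <- t(chol(Corr))                           # lower-triangular factor: Corr = L %*% t(L)
v <- rnorm(nrow(old_xy))                     # stand-in for the fitted v

C <- corr_fn(cross_dist(new_xy, old_xy))     # new-by-original correlations
pred_new <- C %*% solve(t(L), v)             # C (L')^{-1} v at the new locations

## Sanity check: if the "new" locations are the original ones, C equals Corr,
## so C (L')^{-1} v reduces to L v
pred_old <- Corr %*% solve(t(L), v)
```

When the new locations coincide with the original ones, the two expressions agree algebraically, since L L' (L')^{-1} v = L v. With a numerically singular correlation matrix they can differ in practice, which is what the last spatial example in the Examples section explores.
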
For models with non-Gaussian response, the prediction covariance of the response is approximated by the prediction covariance of the linear predictor, pre- and post-multiplied by $\partial\mu/\partial\eta$. These variance calculations are approximate except for LMMs, and cannot be guaranteed to give accurate results. In the point prediction of the linear predictor, the unconditional expected value of $u$ is assigned to the realizations of $u$ for unobserved levels of non-spatial random effects (it is zero in GLMMs but not for non-Gaussian random effects), and the inferred value of $u$ is assigned in all other cases. Corresponding values of $v$ are then deduced. This computation yields the classical BLUP or empirical Bayes predictor in LMMs, but otherwise it may yield less well characterized predictors, where the unconditional $v$ may not be its expected value when the rand.family link is not identity.
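
The approximation for non-Gaussian responses described above can be sketched in base R as follows. The linear predictor values, the covariance matrix, and the choice of a binomial family with logit link are hypothetical and only illustrate the pre- and post-multiplication by $\partial\mu/\partial\eta$. This is not spaMM's own code.

```r
## Illustrative sketch only: made-up numbers, not spaMM output.
eta <- c(-1, 0, 0.5)                            # hypothetical linear predictor values
V_eta <- matrix(c(0.20, 0.05, 0.02,
                  0.05, 0.30, 0.04,
                  0.02, 0.04, 0.25), nrow = 3)  # hypothetical prediction covariance of eta

fam <- binomial(link = "logit")                 # assumed response family and link
dmu_deta <- fam$mu.eta(eta)                     # d mu / d eta evaluated at eta
D <- diag(dmu_deta)

V_mu <- D %*% V_eta %*% D                       # approximate prediction covariance of the response
mu <- fam$linkinv(eta)                          # point predictions on the response scale
```

The diagonal of V_mu is then the diagonal of V_eta scaled by the squared derivative, so prediction variances shrink where the inverse link is flat.
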

References

Booth, J.G., Hobert, J.P. (1998) Standard errors of prediction in generalized linear mixed models. J. Am. Stat. Assoc. 93: 262-272.

Examples


data(blackcap)
fitobject <- corrHLfit(migStatus ~ 1 + Matern(1|latitude+longitude),data=blackcap,
                       ranFix=list(nu=4,rho=0.4,phi=0.05))
predict(fitobject)
getDistMat(fitobject)

#### multiple controls of prediction variances
## (1) fit with an additional random effect
grouped <- cbind(blackcap,grp=c(rep(1,7),rep(2,7))) 
fitobject <- corrHLfit(migStatus ~ 1 +  (1|grp) +Matern(1|latitude+longitude),
                       data=grouped,  ranFix=list(nu=4,rho=0.4,phi=0.05))

## (2) re.form usage to remove a random effect from point prediction and variances: 
predict(fitobject,re.form= ~ 1 +  Matern(1|latitude+longitude))

## (3) comparison of covariance matrices for two types of new data
moregroups <- grouped[1:5,]
rownames(moregroups) <- paste("newloc",1:5,sep="")
moregroups$grp <- rep(3,5) ## all new data belong to an unobserved third group 
cov1 <- attr(predict(fitobject,newdata=moregroups,
                     variances=list(linPred=TRUE,cov=TRUE)),"predVar")
moregroups$grp <- 3:7 ## all new data belong to distinct unobserved groups
cov2 <- attr(predict(fitobject,newdata=moregroups,
                     variances=list(linPred=TRUE,cov=TRUE)),"predVar")
cov1-cov2 ## the expected off-diagonal covariance due to the common group in the first prediction.

## Effects of numerically singular correlation matrix C:
fitobject <- corrHLfit(migStatus ~ 1 + Matern(1|latitude+longitude),data=blackcap,
                       ranFix=list(nu=10,rho=0.001)) ## numerically singular C
predict(fitobject) ## predicted mu computed as X beta + L v 
predict(fitobject,newdata=blackcap) ## predicted mu computed as X beta + C (L')^{-1} v (see Details)
## point predictions and variances with new X and Z
if(require("rsae", quietly = TRUE)) {
  data(landsat)
  fitobject <- HLfit(HACorn ~ PixelsCorn + PixelsSoybeans + (1|CountyName),
                     data=landsat[-33,],HLmethod="ML")
  newXandZ <- unique(data.frame(PixelsCorn=landsat$MeanPixelsCorn,
                                PixelsSoybeans=landsat$MeanPixelsSoybeans,
                                CountyName=landsat$CountyName))
  predict(fitobject,newdata=newXandZ,variances = list(linPred=TRUE,dispVar=TRUE))
}
