Learn R Programming

aster (version 0.7-4)

predict.aster: Predict Method for Aster Model Fits

Description

Obtains predictions and optionally estimates standard errors of those predictions from a fitted Aster model object.

Usage

## S3 method for class 'aster':
predict(object, x, root, modmat, amat,
    parm.type = c("mean.value", "canonical"),
    model.type = c("unconditional", "conditional"),
    se.fit = FALSE, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), ...)

## S3 method for class 'aster.formula': predict(object, newdata, varvar, idvar, root, amat, parm.type = c("mean.value", "canonical"), model.type = c("unconditional", "conditional"), se.fit = FALSE, info = c("expected", "observed"), info.tol = sqrt(.Machine$double.eps), ...)

Arguments

object
a fitted object of class inheriting from "aster" or "aster.formula".
modmat
a model matrix to use instead of object$modmat. Must have the same structure (three-dimensional array, first index runs over individuals, second over nodes of the graphical model, third over covariates. Must have the same second
x
response. Ignored and may be missing unless parm.type == "mean.value" && model.type == "conditional". Even then may be missing when modmat is missing, in which case object$x is used. A matrix whose firs
root
root data. Ignored and may be missing unless parm.type == "mean.value". Even then may be missing when modmat is missing, in which case object$root is used. A matrix of the same form as x
amat
if zeta is the requested prediction (mean value or canonical, unconditional or conditional, depending on parm.type and model.type), then we predict the linear function t(amat) %*% zeta. May
parm.type
the type of parameter to predict. The default is mean value parameters (the opposite of the default for predict.glm), the expected value of a linear function of the response under the MLE p
model.type
the type of model in which to predict. The default is "unconditional" in which case the parameters (either mean value or canonical, depending on the value of parm.type) are those of an unconditional model. The al
se.fit
logical switch indicating if standard errors are required.
info
the type of Fisher information use to compute standard errors.
info.tol
tolerance for eigenvalues of Fisher information. If eval is the vector of eigenvalues of the information matrix, then eval < cond.tol * max(eval) are considered zero. Hence the corresponding eigenvectors are directio
newdata
optionally, a data frame in which to look for variables with which to predict. If omitted, see modmat above. See also details section below.
varvar
a variable of length nrow(newdata), typically a variable in newdata that is a factor whose levels are character strings treated as variable names. The number of variable names is nnode. Must be of th
idvar
a variable of length nrow(newdata), typically a variable in newdata that indexes individuals. The number of individuals is nind. Must be of the form rep(inds, times = nnode) where
...
further arguments passed to or from other methods.

Value

  • If se.fit = FALSE, a vector of predictions. If se.fit = TRUE, a list with components
  • fitPredictions
  • se.fitEstimated standard errors

concept

  • regression
  • exponential family
  • graphical model

Details

Note that model.type need have nothing to do with the type of the fitted aster model, which is object$type.

Whether the fitted model is conditional or unconditional, one typically wants unconditional mean value parameters, because conditional mean value parameters for hypothetical individuals depend on the hypothetical data x, which usually makes no scientific sense.

If one does ask for conditional mean value parameters, generally the data should satisfy all(x == 1) and all(root == 1), so that the mean value parameters are per unit of predecessor variable, that is we predict $\psi''(\theta_{i j})$ rather than this multiplied by $X_{i p(j)}$, where $p(j)$ is the mathematical function defined by the R expression pred[j].

Similarly, if object$type == "conditional", then the conditional canonical parameters are a linear function of the regression coefficients $\theta = M \beta$, where $M$ is the model matrix, but one can predict either $\theta$ or the unconditional canonical parameters $\varphi$, as selected by model.type. Similarly, if object$type == "unconditional", so $\varphi = M \beta$, one can predict either $\theta$ or $\varphi$ as selected by model.type.

The specification of the prediction model is confusing because there are so many possibilities. First the usual case. The fit was done using a formula, found in object$formula. A data frame newdata that has the same variables as object$data, the data frame used in the fit, but may have different rows (representing hypothetical individuals) is supplied. But newdata must specify all nodes of the graphical model for each (hypothetical, new) individual, just like object$data did for real observed individuals. Hence newdata is typically constructed using reshape. See also the details section of aster.

In this usual case we need varvar and idvar to tell us what rows of newdata correspond to which individuals and nodes (the same role they played in the original fit by aster). If we are predicting canonical parameters, then we do not need root or x. If we are predicting unconditional mean value parameters, then we also need root but not x. If we are predicting conditional mean value parameters, then we also need both root and x. In the usual case, these are found in newdata and not supplied as arguments to predict. Moreover, x is not named "x" but is the response in out$formula.

The next case, predict(object) with no other arguments, is often used with linear models (predict.lm), but we expect will be little used for aster models. As for linear models, this predicts the observed data. In this case modmat, x, and root are found in object and nothing is supplied as an argument to predict.aster, except perhaps amat if one wants a function of predictions for the observed data.

The final case, also perhaps little used, is a fail-safe mode for problems in which the R formula language just cannot be bludgeoned into doing what you want. This is the same reason aster.default exists. Then a model matrix can be constructed by hand, and the function predict.aster is used instead of predict.aster.formula.

Note that it is possible to use a constructed by hand model matrix even if object was produced by aster.formula. Simply explicitly call predict.aster rather than predict to override the R method dispatch (which would call predict.aster.formula in this case).

Examples

Run this code
### see package vignette for explanation ###
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grep("hdct", as.character(redata$varb))
hdct <- is.element(seq(along = redata$varb), hdct)
redata <- data.frame(redata, hdct = as.integer(hdct))
aout4 <- aster(resp ~ varb + nsloc + ewloc + pop * hdct - pop,
    pred, fam, varb, id, root, data = redata)
newdata <- data.frame(pop = levels(echinacea$pop))
for (v in vars)
    newdata[[v]] <- 1
newdata$root <- 1
newdata$ewloc <- 0
newdata$nsloc <- 0
renewdata <- reshape(newdata, varying = list(vars),
     direction = "long", timevar = "varb", times = as.factor(vars),
     v.names = "resp")
hdct <- grep("hdct", as.character(renewdata$varb))
hdct <- is.element(seq(along = renewdata$varb), hdct)
renewdata <- data.frame(renewdata, hdct = as.integer(hdct))
nind <- nrow(newdata)
nnode <- length(vars)
amat <- array(0, c(nind, nnode, nind))
for (i in 1:nind)
    amat[i , grep("hdct", vars), i] <- 1
foo <- predict(aout4, varvar = varb, idvar = id, root = root,
    newdata = renewdata, se.fit = TRUE, amat = amat)
bar <- cbind(foo$fit, foo$se.fit)
dimnames(bar) <- list(as.character(newdata$pop), c("Estimate", "Std. Error"))
print(bar)

Run the code above in your browser using DataLab