vglm: Fitting Vector Generalized Linear Models

Description

vglm is used to fit vector generalized linear models (VGLMs). This is a very large class of models that includes generalized linear models (GLMs) as a special case.

Usage

vglm(formula, family, data = list(), weights = NULL, subset = NULL, 
     na.action = na.fail, etastart = NULL, mustart = NULL, 
     coefstart = NULL, control = vglm.control(...), offset = NULL, 
     method = "vglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE, 
     contrasts = NULL, constraints = NULL, extra = list(), 
     form2 = NULL, qr.arg = FALSE, smart = TRUE, ...)

Arguments

formula

a symbolic description of the model to be fit. The RHS of the formula is applied to each linear predictor. Different variables in each linear predictor can be chosen by specifying constraint matrices.

family

a function of class "vglmff" (see vglmff-class) describing what statistical model is to be fitted. This is called a "VGAM family function". See

data

an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which vglm is called.

weights

an optional vector or matrix of (prior) weights to be used in the fitting process. If weights is a matrix, then it must be in matrix-band form, whereby the first $M$ columns of the matrix are the diagonals, followed by th

subset

an optional logical vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is u

etastart

starting values for the linear predictors. It is an $M$-column matrix with the same number of rows as the response. If $M=1$ then it may be a vector. Note that etastart and the output of predict(fit) should be comparable.

mustart

starting values for the fitted values. It can be a vector or a matrix; if a matrix, then it has the same number of rows as the response. Usually mustart and the output of fitted(fit) should be comparable. Some family fu

coefstart

starting values for the coefficient vector. The length and order must match that of coef(fit).
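
As a rough illustration of how the starting-value arguments relate to a fitted object, the sketch below refits a model from values taken from an earlier fit. It assumes the VGAM package is attached and reuses the pneumo data and the propodds family that appear in the Examples section; the object names fit0, eta0, fit.eta and fit.coef are made up for illustration.

```r
library(VGAM)

pneumo <- transform(pneumo, let = log(exposure.time))
fit0 <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo)

# predict() returns the n x M matrix of linear predictors, which is the
# shape that etastart expects
eta0 <- predict(fit0)
dim(eta0)

# Restart from the linear predictors of the earlier fit
fit.eta <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo,
                etastart = eta0)

# Restart from the coefficient vector; length and order match coef(fit0)
fit.coef <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo,
                 coefstart = coef(fit0))

# Both restarts should land on essentially the same estimates
all.equal(coef(fit0), coef(fit.eta), tolerance = 1e-4)
all.equal(coef(fit0), coef(fit.coef), tolerance = 1e-4)
```

Supplying good starting values mainly matters for models where the family function's own initial values lead to slow or failed convergence; for a well-behaved fit like this one it should only reduce the iteration count.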

control

a list of parameters for controlling the fitting process. See vglm.control for details.

offset

a vector or $M$-column matrix of offset values. These are a priori known and are added to the linear predictors during fitting.

method

the method to be used in fitting the model. The default (and presently only) method vglm.fit uses iteratively reweighted least squares (IRLS).

model

a logical value indicating whether the model frame should be assigned in the model slot.

x.arg, y.arg

logical values indicating whether the model matrix and response vector/matrix used in the fitting process should be assigned in the x and y slots. Note the model matrix is the LM model matrix; to get the VGLM model matrix

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

constraints

an optional list of constraint matrices. Each component of the list must be named with the term it corresponds to (and the name must match in character format exactly). There are two types of input: "lm"-type and "vlm"-type. T

extra

an optional list with any extra information that might be needed by the VGAM family function.

form2

The second (optional) formula. If argument xij is used (see vglm.control) then form2 needs to have all terms in the model. Also, some VGAM family fun

qr.arg

logical value indicating whether the QR decomposition of the VLM model matrix is stored in the qr slot of the returned object.

smart

logical value indicating whether smart prediction (smartpred) will be used.

...

further arguments passed into vglm.control.
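
The interplay between the formula and constraints arguments can be sketched as follows. The example assumes the VGAM package is attached and reuses the pneumo data from the Examples section; cumulative() is a VGAM family function for ordinal responses, and the names clist and fit.con are hypothetical. The identity matrix gives each linear predictor its own intercept, while the single column of ones forces one shared slope for let, so the result should correspond to a proportional odds fit.

```r
library(VGAM)

pneumo <- transform(pneumo, let = log(exposure.time))

# One constraint matrix per term, named exactly as the term.
# Three response categories give M = 2 linear predictors.
clist <- list("(Intercept)" = diag(2),          # separate intercepts
              let           = matrix(1, 2, 1))  # common slope

fit.con <- vglm(cbind(normal, mild, severe) ~ let,
                cumulative(reverse = TRUE),
                data = pneumo, constraints = clist)

coef(fit.con, matrix = TRUE)  # the 'let' row has equal entries
constraints(fit.con)          # the matrices actually used in the fit
```

Writing the constraint matrices out by hand is mostly useful when a family function offers no argument (such as parallel or zero) that produces the desired structure.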

Value

An object of class "vglm", which has the following slots. Some of these may not be assigned to save space, and will be recreated if necessary later.

extra

the list extra at the end of fitting.

family

the family function (of class "vglmff").

iter

the number of IRLS iterations used.

predictors

an $M$-column matrix of linear predictors.

assign

a named list which matches the columns and the (LM) model matrix terms.

call

the matched call.

coefficients

a named vector of coefficients.

constraints

a named list of constraint matrices used in the fitting.

contrasts

the contrasts used (if any).

control

list of control parameters used in the fitting.

criterion

list of convergence criteria evaluated at the final IRLS iteration.

df.residual

the residual degrees of freedom.

df.total

the total degrees of freedom.

dispersion

the scaling parameter.

effects

the effects.

fitted.values

the fitted values, as a matrix. This is often the mean but may be quantiles, or the location parameter, e.g., in the Cauchy model.

misc

a list to hold miscellaneous parameters.

model

the model frame.

na.action

a list holding information about missing values.

offset

if non-zero, an $M$-column matrix of offsets.

post

a list where post-analysis results may be put.

preplot

used by plotvgam, the plotting parameters may be put here.

prior.weights

initially supplied weights (the weights argument). Also see weightsvglm.

qr

the QR decomposition used in the fitting.

R

the R matrix in the QR decomposition used in the fitting.

rank

numerical rank of the fitted model.

residuals

the working residuals at the final IRLS iteration.

rss

residual sum of squares at the final IRLS iteration with the adjusted dependent vectors and weight matrices.

smart.prediction

a list of data-dependent parameters (if any) that are used by smart prediction.

terms

the terms object used.

weights

the working weight matrices at the final IRLS iteration. This is in matrix-band form.

x

the model matrix (linear model LM, not VGLM).

xlevels

the levels of the factors, if any, used in fitting.

y

the response, in matrix form.

This slot information is repeated at vglm-class.
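
A brief sketch of how these slots can be inspected on a fitted object is given below. It assumes the VGAM package is attached and reuses the pneumo data from the Examples section; the object name fit is arbitrary. Slots of an S4 object are read with the @ operator, though extractor functions such as coef() and fitted() are usually the safer route because some slots may be left unassigned.

```r
library(VGAM)

pneumo <- transform(pneumo, let = log(exposure.time))
fit <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo)

slotNames(fit)         # names of all slots on the fitted object

fit@iter               # number of IRLS iterations used
fit@df.residual        # residual degrees of freedom
dim(fit@predictors)    # n x M matrix of linear predictors

# Extractor functions are usually preferable to direct slot access
coef(fit)              # same values as fit@coefficients
head(fitted(fit))      # fitted values, as a matrix
```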

Details

A vector generalized linear model (VGLM) is loosely defined as a statistical model that is a function of $M$ linear predictors. The central formula is given by $$\eta_j = \beta_j^T x$$ where $x$ is a vector of explanatory variables (sometimes just a 1 for an intercept), and $\beta_j$ is a vector of regression coefficients to be estimated. Here, $j=1,\ldots,M$, where $M$ is finite. Then one can write $\eta=(\eta_1,\ldots,\eta_M)^T$ as a vector of linear predictors.
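
To make the central formula concrete, the following base R sketch computes the $M$ linear predictors for a single observation. The coefficient values and the names B, x and eta are invented for illustration; no model fitting is involved.

```r
# M = 2 linear predictors, p = 3 explanatory variables (incl. intercept)
x <- c(1, 0.5, -2)                 # x[1] = 1 carries the intercept

# Column j of B holds beta_j, so B is a p x M matrix
B <- cbind(beta1 = c(0.2, 1.0, 0.0),
           beta2 = c(-1.0, 0.5, 0.3))

# eta_j = beta_j^T x for j = 1, ..., M, collected into one vector
eta <- drop(t(B) %*% x)
eta   # eta_1 = 0.7, eta_2 = -1.35
```

Stacking the $\beta_j$ as columns of one matrix mirrors what coef(fit, matrix = TRUE) displays for a fitted VGLM: one column per linear predictor.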

Most users will find vglm similar in flavour to glm. The function vglm.fit actually does the work.
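
The similarity to glm can be checked directly on a model that both functions can fit. The sketch below assumes the VGAM package is attached and uses the Poisson regression data from help(glm), which also appear in Example 1; the names fit.glm and fit.vglm are arbitrary.

```r
library(VGAM)

counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
d.AD <- data.frame(treatment, outcome, counts)

fit.glm  <- glm(counts ~ outcome + treatment, family = poisson(), data = d.AD)
fit.vglm <- vglm(counts ~ outcome + treatment, family = poissonff, data = d.AD)

# The two sets of estimates should agree up to convergence tolerance
cbind(glm = coef(fit.glm), vglm = coef(fit.vglm))
```

Here $M = 1$, so the VGLM reduces to an ordinary GLM; the main visible differences are the family function (poissonff rather than poisson) and the S4 class of the result.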

References

Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.

Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.

Yee, T. W. (2008) The VGAM Package. R News, 8, 28–39.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

Examples

library(VGAM)  # provides vglm(), the family functions and the example data sets

# Example 1. See help(glm)
counts = c(18,17,15,20,10,20,25,13,12)
outcome = gl(3,1,9)
treatment = gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
vglm.D93 = vglm(counts ~ outcome + treatment, family=poissonff)
summary(vglm.D93)


# Example 2. Multinomial logit model
pneumo = transform(pneumo, let=log(exposure.time))
vglm(cbind(normal, mild, severe) ~ let, multinomial, pneumo)


# Example 3. Proportional odds model
fit3 = vglm(cbind(normal,mild,severe) ~ let, propodds, pneumo, trace = TRUE)
coef(fit3, matrix = TRUE) 
constraints(fit3)
model.matrix(fit3, type="lm") # LM model matrix
model.matrix(fit3)            # Larger VGLM (or VLM) model matrix


# Example 4. Bivariate logistic model 
fit4 = vglm(cbind(nBnW, nBW, BnW, BW) ~ age, binom2.or, coalminers)
coef(fit4, matrix = TRUE)
fit4@y  # The response is stored as proportions
weights(fit4, type="prior")


# Example 5. The use of the xij argument (simple case).
# The constraint matrix for 'op' has one column.
nn = 1000
eyesdat = round(data.frame(lop = runif(nn),
                           rop = runif(nn),
                            op = runif(nn)), digits = 2)
# eta2 uses rop so that the simulated data match the xij specification below
eyesdat = transform(eyesdat, eta1 = -1+2*lop,
                             eta2 = -1+2*rop)
eyesdat = transform(eyesdat,
                    leye = rbinom(nn, size=1, prob=logit(eta1, inv = TRUE)),
                    reye = rbinom(nn, size=1, prob=logit(eta2, inv = TRUE)))
head(eyesdat)
fit5 = vglm(cbind(leye,reye) ~ op,
            binom2.or(exchangeable = TRUE, zero=3),
            data=eyesdat, trace = TRUE,
            xij = list(op ~ lop + rop + fill(lop)),
            form2 = ~  op + lop + rop + fill(lop))
coef(fit5)
coef(fit5, matrix = TRUE)
constraints(fit5)