fastml: Rapidly converging algorithm for maximum-likelihood (ML) estimation in linear mixed models

Description

Computes ML estimates of parameters in linear mixed models using the rapidly converging procedure described by Schafer (1998), which combines Fisher scoring with an ECME algorithm.

For a description of the model, see the "Details" section below.

Usage

fastml(y, subj, pred, xcol, zcol, vmax, occ, start,
   maxits=50, eps=0.0001)

Arguments

vector of responses. This is simply the individual yi vectors stacked upon one another. Each element of y represents the observed response for a particular subject-occasion, or for a particular unit within a cluster.

subj

vector of same length as y, giving the subject (or cluster) indicators i for the elements of y. For example, suppose that y is c(y1,y2,y3,y4) where length(y1)=2, length(y2)=3, length(y3)=2, and length(y4)=7. Then subj should be c(1,1,2,2,2,3,3,4,4,4,4,4,

pred

matrix of covariates used to predict y. The number of rows should be length(y). The first column will typically be constant (one), and the remaining columns correspond to other variables appearing in Xi and Zi.

xcol

vector of integers indicating which columns of pred will be used in Xi. That is, pred[,xcol] is the Xi matrices (stacked upon one another.

zcol

vector of integers indicating which columns of pred will be used in Zi. That is, pred[,zcol] is the Zi matrices (stacked upon one another).

vmax

optional matrix of dimension c(max(occ),max(occ)) from which the Vi matrices will be extracted. In a longitudinal dataset, vmax would represent the Vi matrix for an individual with responses at all possible occasions 1,2,...,nmax=max(occ); for individual

occ

vector of same length as y indicating the "occasions" for the elements of y. This argument is relevant only if a non-identity vmax is specified. In a longitudinal dataset where each individual is measured on at most nmax distinct occasions, each element o

start

optional starting values of the parameters. If this argument is not given then the function chooses its own starting values. This argument should be a list of three elements named "beta", "psi", and "sigma2". Note that "beta" should be a vector of the sam

maxits

maximum number of cycles to be performed. The algorithm runs to convergence or until "maxits" iterations, whichever comes first.

eps

convergence criterion. The algorithm is considered to have converged if the relative differences in all parameters from one iteration to the next are less than eps--that is, if all(abs(new-old)

Value

a list containing the following components.
betavector of same length as "xcol" containing estimated fixed effects.
sigma2estimate of residual error variance.
psimatrix of dimension c(length(zcol),length(zcol)) containing estimated variances and covariances of the random effects.
convergedT if the algorithm converged, F if it did not.
iternumber of iterations actually performed. Will be equal to "maxits" if converged=F.
rejecta logical vector of length iter indicating, for each iteration, whether the scoring estimates were rejected and replaced by ECME estimates (T), or whether the scoring estimates were accepted (F). Scoring estimates are rejected if they do not increase the loglikelihood.
loglikvector of length "iter" reporting the value of the loglikelihood at each iteration.
cov.betamatrix of dimension c(length(xcol),length(xcol)) containing estimated variances and covariances for elements of "beta". These are conventional estimates which regard the variance parameters (sigma2 and psi) as fixed at their ML estimates.
b.hata matrix with length(zcol) rows and m columns, where b.hat[,i] is an empirical Bayes estimate of bi.
cov.ban array of dimension length(zcol) by length(zcol) by m, where cov.b[,,i] is an empirical Bayes estimate of the covariance matrix associated with bi. These are conventional estimates which regard the variance parameters (sigma2 and psi) as fixed at their ML estimates. (An improved version which incorporates variance-parameter uncertainty is available from the function "fastrml".)

Details

A full description of the algorithm is given in Section 3 of Schafer (1998). Scoring is carried out on log(sigma2) and the nonredundant elements of the inverse of psi/sigma2, taking logs of the diagonal elements.

The model, which is typically applied to longitudinal or clustered responses, is

yi = Xi%*%beta + Zi%*%bi + ei , i=1,...,m,

where

yi = (ni x 1) response vector for subject or cluster i; Xi = (ni x p) matrix of covariates; Zi = (ni x q) matrix of covariates; beta = (p x 1) vector of coefficients common to the population (fixed effects); bi = (q x 1) vector of coefficients specific to subject or cluster i (random effects); and ei = (ni x 1) vector of residual errors.

The vector bi is assumed to be normally distributed with mean zero and unstructured covariance matrix psi,

bi $\sim$ N(0,psi) independently for i=1,...,m.

The residual vector ei is assumed to be

ei $\sim$ N(0,sigma2*Vi)

where Vi is a known (ni x ni) matrix. In most applications, Vi is the identity matrix.

References

Schafer, J.L. (1998) Some improved procedures for linear mixed models. Submitted to Journal of the American Statistical Association.

Examples

Run this code

For a detailed example, see the file "example.R" distributed
with this library.

Run the code above in your browser using DataLab