hglm2: Fitting Hierarchical Generalized Linear Models

Description

hglm2 is used to fit hierarchical generalized linear models. hglm2 is used to fit hierarchical generalized linear models. It extends the hglm function by allowing for several random effects, where the model is specified in lme4 convension, and also by implementing sparse matrix techniques using the Matrix library.

Usage

hglm2(meanmodel, data = NULL, family = gaussian(link = identity),
      rand.family = gaussian(link = identity), method = "EQL", 
      conv = 1e-6, maxit = 50, startval = NULL,
      X.disp = NULL, disp = NULL, link.disp = "log", 
      weights = NULL, fix.disp = NULL, offset = NULL, 
      sparse = TRUE, vcovmat = FALSE, calc.like = FALSE, 
      bigRR = FALSE, verbose = FALSE, ...)

Arguments

meanmodel

formula. A two sided formula specifying the fixed and random terms in lme4 convention, e.g. y ~ x1 + (1|id) indicates y as response, x1 as the fixed effect and (1|id) represent

data

data.frame. An optional data frame from where the variables in the meanmodel (and possibly disp) are to be obtained. It is expected that the data frame does not contain any missing value.

family

family. The description of the error distribution and link function to be used in the mean part of the model. (See family for details of family functions.)

rand.family

family. The description of the distribution and link function to be used for the random effect.

method

character. Estimation method where EQL is the method of interconnected GLMs presented in Lee et al (2006). Currently only EQL is available as the estimation method.

conv

numeric. The convergence criteria (change in linear predictor between iterations).

maxit

numeric. Maximum number of iterations in the hglm algorithm.

startval

numeric. A vector of starting values in the following order: fixed effects, random effect, variance of random effects, variance of residuals.

X.disp

matrix. The design matrix for the fixed effects in the dispersion part of the model.

disp

formula. A one-sided formula specifying the fixed effects in the dispersion part of the model.

link.disp

character. The link function for the dispersion part of the model.

weights

numeric. Prior weights to be specified in weighted regression.

fix.disp

numeric. A numeric value if the dispersion parameter of the mean model is known, e.g., 1 for binomial and Poisson model.

offset

An offset for the linear predictor of the mean model.

sparse

logical. If TRUE, the computation is to be carried out by using sparse matrix technique.

vcovmat

logical. If TRUE, the variance-covariance matrix is exported.

calc.like

logical. If TRUE, likelihoods will be computed at convergence and will be shown via the print or summary methods on the output object.

bigRR

logical. If TRUE, and only for the Gaussian model with one random effect term, a specific algorithm will be used for fast fitting high-dimensional (p >> n) problems. See Shen et al. (2013) for more details of the method.

verbose

logical. If TRUE, more information is printed during model fitting process.

...

not used.

Value

It returns an object of class hglm consiting of the following values.
fixeffixed effect estimates.
ranefrandom effect estimates.
RandCintegers (possibly a vector) specified the number of column of Z to be used for each of the random-effect terms.
varFixdispersion parameter of the mean model (residual variance for LMM).
varRanefdispersion parameter of the random effects (variance of random effects for GLMM).
iternumber of iterations used.
Convergespecifies if the algorithm converged.
SeFestandard errors of fixed effects.
SeRestandard errors of random effects.
dfReFedeviance degrees of freedom for the mean part of the model.
SummVC1estimates and standard errors of the linear predictor in the dispersion model.
SummVC2estimates and standard errors of the linear predictor for the dispersion parameter of the random effects.
devindividual deviances for the mean part of the model.
hvhatvalues for the mean part of the model.
residstudentized residuals for the mean part of the model.
fvfitted values for the mean part of the model.
disp.fvfitted values for the dispersion part of the model.
disp.residstandardized deviance residuals for the dispersion part of the model.
link.displink function for the dispersion part of the model.
vcovthe variance-covariance matrix.
likelihooda list of likelihood values for model selection purposes, where $hlik is the h-likelihood, $pvh the adjusted profile likelihood profiled over random effects, $pbvh the adjusted profile likelihood profiled over fixed and random effects, $cAIC the conditional AIC.

Details

Models for hglm are either specified symbolically using formula or by specifying the design matrices ( X, Z and X.disp). Currently, only the extended quasi likelihood (EQL) method is available for the estimation of the model parameters. Only for the Gaussian-Gaussina linear mixed models, it is REML. It should be noted that the EQL estimator can be biased and inconsistent in some special cases e.g. binary pair matched response. A higher order correction might be useful to correct the bias of EQL (Lee et al. 2006). But, those currections are not implemented in the current version. By default, the dispersion parameter is always estimated via EQL. If the dispersion parameter of the mean model is to be held constant, for example if it is desired to be 1 for binomial and Poisson family, then fix.disp=value where, value=1 for the above example, should be used.

References

Lars Ronnegard, Xia Shen and Moudud Alam (2010). hglm: A Package for Fitting Hierarchical Generalized Linear Models. The R Journal, 2(2), 20-28. Youngjo Lee, John A Nelder and Yudi Pawitan (2006) Generalized Linear Models with Random Effect: a unified analysis via h-likelihood. Chapman and Hall/CRC. Xia Shen, Moudud Alam, Freddy Fikse and Lars Ronnegard (2013). A novel generalized ridge regression method for quantitative genetics. Genetics. Moudud Alam, Lars Ronnegard, Xia Shen (2014). Fitting spatial models in hglm. Submitted.

Examples

Run this code

# Find more examples and instructions in the package vignette:
# vignette('hglm')

require(hglm)

# --------------------- #
# semiconductor example #
# --------------------- #

data(semiconductor)

m11 <- hglm(fixed = y ~ x1 + x3 + x5 + x6,
            random = ~ 1|Device,
            family = Gamma(link = log),
            disp = ~ x2 + x3, data = semiconductor)
summary(m11)
plot(m11, cex = .6, pch = 1,
     cex.axis = 1/.6, cex.lab = 1/.6,
     cex.main = 1/.6, mar = c(3, 4.5, 0, 1.5))

# ------------------- #
# redo it using hglm2 #
# ------------------- #

m12 <- hglm2(y ~ x1 + x3 + x5 + x6 + (1|Device),
             family = Gamma(link = log),
             disp = ~ x2 + x3, data = semiconductor)
summary(m12)
     
# -------------------------- #
# redo it using matrix input #
# -------------------------- #

attach(semiconductor)
m13 <- hglm(y = y, X = model.matrix(~ x1 + x3 + x5 + x6),
            Z = kronecker(diag(16), rep(1, 4)),
            X.disp = model.matrix(~ x2 + x3), 
            family = Gamma(link = log))
summary(m13)
     
# --------------------- #
# verbose & likelihoods #
# --------------------- #
     
m14 <- hglm(fixed = y ~ x1 + x3 + x5 + x6,
            random = ~ 1|Device,
            family = Gamma(link = log),
            disp = ~ x2 + x3, data = semiconductor,
            verbose = TRUE, calc.like = TRUE)
summary(m14)

# --------------------------------------------- #  
# simulated example with 2 random effects terms #
# --------------------------------------------- #  
set.seed(911)
x1 <- rnorm(100)
x2 <- rnorm(100)
x3 <- rnorm(100)
z1 <- factor(rep(LETTERS[1:10], rep(10, 10)))
z2 <- factor(rep(letters[1:5], rep(20, 5)))
Z1 <- model.matrix(~ 0 + z1)
Z2 <- model.matrix(~ 0 + z2)
u1 <- rnorm(10, 0, sqrt(2))
u2 <- rnorm(5, 0, sqrt(3))
y <- 1 + 2*x1 + 3*x2 + Z1%*%u1 + Z2%*%u2 + rnorm(100, 0, sqrt(exp(x3)))
dd <- data.frame(x1 = x1, x2 = x2, x3 = x3, z1 = z1, z2 = z2, y = y)

(m21 <- hglm(X = cbind(rep(1, 100), x1, x2), y = y, Z = cbind(Z1, Z2), 
              RandC = c(10, 5)))
summary(m21)
plot(m21)

(m22 <- hglm2(y ~ x1 + x2 + (1|z1) + (1|z2), data = dd, vcovmat = TRUE))
image(m22$vcov, main = 'Variance-covariance Matrix')
summary(m22)
plot(m22)

m31 <- hglm2(y ~ x1 + x2 + (1|z1) + (1|z2), disp = ~ x3, data = dd)
print (m31)
summary(m31)
plot(m31)

Run the code above in your browser using DataLab