threeboost (version 1.1)

geeboost: GEEBoost

Description

Thresholded boosting for correlated data via GEE

Usage

geeboost(Y, X, id = 1:length(Y), family = "gaussian", corstr = "ind", traceplot = FALSE, ...)

Arguments

Y
Vector of (presumably correlated) outcomes
X
Matrix of predictors
id
Index indicating clusters of correlated observations
family
Outcome distribution to be used. "gaussian" (the default), "binomial", and "poisson" are currently implemented.
corstr
Working correlation structure to use. "ind" (for independence, the default) and "exch" (for exchangeable) are currently implemented.
traceplot
Whether to produce a traceplot of the coefficient values (default FALSE). See coef_traceplot for details.
...
Additional arguments to be passed to the threeboost function. See the threeboost help page for details.
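
To make the two corstr options concrete, the sketch below builds the working correlation matrix that each structure implies for a single cluster. The helper name working_corr and the value alpha = 0.3 are illustrative assumptions and are not part of threeboost; how the package itself constructs or estimates these matrices is not shown here.

```r
# Hypothetical helper (not part of threeboost): working correlation matrix
# for one cluster of size m under the two structures named above.
working_corr <- function(m, corstr = c("ind", "exch"), alpha = 0) {
  corstr <- match.arg(corstr)
  if (corstr == "ind") {
    diag(m)                                          # identity: no within-cluster correlation
  } else {
    (1 - alpha) * diag(m) + alpha * matrix(1, m, m)  # 1 on the diagonal, alpha elsewhere
  }
}

R.ind  <- working_corr(4, "ind")
R.exch <- working_corr(4, "exch", alpha = 0.3)
print(R.exch)
```

Under "exch", every pair of observations in a cluster shares the same correlation, whereas "ind" treats them as uncorrelated when forming the estimating equations.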

Value

A list with three entries:
  • coefmat: A matrix with maxit rows and ncol(X) columns, with each row containing the parameter vector from an iteration of EEBoost.
  • QICs: A vector of QICs computed from the coefficients.
  • final.model: The coefficients corresponding to the model (set of coefficients) yielding the smallest QIC.
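
How the three entries relate can be illustrated with a mocked-up result. The list res below is made up for illustration only (it is not output from geeboost), and the sketch assumes that row i of coefmat pairs with element i of QICs.

```r
# Mock object with the same three entries as a geeboost() result.
# The numbers are invented purely to show how the pieces relate.
coefmat <- rbind(c(0.0, 0.0, 0.0),
                 c(0.1, 0.0, 0.0),
                 c(0.2, 0.0, 0.1),
                 c(0.3, 0.1, 0.1))
QICs <- c(40.2, 31.5, 27.9, 29.4)
res <- list(coefmat = coefmat,
            QICs = QICs,
            final.model = coefmat[which.min(QICs), ])

# Recover the selected iteration and the predictors with nonzero coefficients
best.iter <- which.min(res$QICs)
selected  <- which(res$final.model != 0)
print(best.iter)
print(selected)
```

With a real fit, the same two lines applied to the returned list would identify the iteration chosen by QIC and the variables retained in final.model, provided the row-to-QIC pairing assumed above holds.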

Details

This function implements thresholded EEBoost for Generalized Estimating Equations (GEE). The arguments are consistent with those used by geepack.
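
As a rough illustration of the idea behind thresholded boosting, the sketch below handles the simplest case: a Gaussian outcome with an independence working correlation, where the estimating equation reduces to X'(Y - X beta). It is not the threeboost implementation. The function name thr_boost_sketch and the arguments eps (step size) and threshold are assumptions chosen for this example, and the update rule (step every coordinate whose estimating-equation magnitude is at least threshold times the largest) is a simplified reading of the thresholding idea.

```r
# Minimal sketch of thresholded boosting (Gaussian outcome, independence
# working correlation). Not the threeboost implementation; 'eps' and
# 'threshold' are illustrative names and values.
thr_boost_sketch <- function(Y, X, maxit = 200, eps = 0.01, threshold = 1) {
  beta <- rep(0, ncol(X))
  coefmat <- matrix(0, nrow = maxit, ncol = ncol(X))
  for (it in seq_len(maxit)) {
    g <- drop(crossprod(X, Y - X %*% beta))      # estimating equation at current beta
    upd <- abs(g) >= threshold * max(abs(g))     # coordinates close to the largest |g|
    beta[upd] <- beta[upd] + eps * sign(g[upd])  # small step in the direction of g
    coefmat[it, ] <- beta                        # store the coefficient path
  }
  coefmat
}

set.seed(1)
X.demo <- matrix(rnorm(200 * 5), 200, 5)
Y.demo <- drop(X.demo %*% c(2, 0, 0, 0, 0)) + rnorm(200, sd = 0.5)
path <- thr_boost_sketch(Y.demo, X.demo, maxit = 300)
print(round(path[300, ], 2))
```

In this sketch, threshold = 1 updates only the coordinate with the largest estimating-equation magnitude at each step, while smaller values update several coordinates at once.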

See Also

threeboost

References

Wolfson, J. EEBoost: A general method for prediction and variable selection using estimating equations. Journal of the American Statistical Association, 2011.

Examples

# Generate some test data
library(threeboost) # Provides geeboost()
library(mvtnorm)
library(Matrix)
n <- 30
n.var <- 50
clust.size <- 4

B <- c(rep(2,5),rep(0.2,5),rep(0.05,10),rep(0,n.var-20))
mn.X <- rep(0,n.var)
sd.X <- 0.5
rho.X <- 0.3
cov.sig.X <- sd.X^2*((1-rho.X)*diag(rep(1,10)) + rho.X*matrix(data=1,nrow=10,ncol=10))
sig.X <- as.matrix( Matrix::bdiag(lapply(1:(n.var/10),function(x) { cov.sig.X } ) ) )
sd.Y <- 0.5
rho.Y <- 0.3
indiv.Sig <- sd.Y^2*( (1-rho.Y)*diag(clust.size) + rho.Y*matrix(data=1,nrow=clust.size,ncol=clust.size) )
sig.list <- replicate(n, indiv.Sig, simplify=FALSE) ## One covariance block per cluster
Sig <- Matrix::bdiag(sig.list)
indiv.index <- rep(1:n,each=clust.size)
sig.Y <- as.matrix(Sig)

if(require(mvtnorm)) {
X <- mvtnorm::rmvnorm(n*clust.size,mean=mn.X,sigma=sig.X)
mn.Y <- X %*% B
Y <- drop(mvtnorm::rmvnorm(1,mean=mn.Y,sigma=sig.Y)) ## Correlated continuous outcomes (as a vector)
expit <- function(x) { exp(x) / (1 + exp(x)) }
## Correlated binary outcomes
Y.bin <- rbinom(n*clust.size,1,prob=expit(mvtnorm::rmvnorm(1,mean=mn.Y,sigma=sig.Y)))
Y.pois <- rpois(length(Y),lambda=exp(mn.Y)) ## Poisson outcomes (independent draws given X)
} else { stop('Need mvtnorm package to generate correlated data.')}

## Run EEBoost (w/ indep working correlation)
results.lin <- geeboost(Y,X,id=indiv.index,maxit=1000)
## Not run:
# results.bin <- geeboost(Y.bin,X,id=indiv.index,family="binomial",maxit=1000)
# results.pois <- geeboost(Y.pois,X,id=indiv.index,family="poisson",maxit=1000,traceplot=TRUE)
## End(Not run)

print(results.lin$final.model)