svyglm: Survey-weighted generalised linear models.

Description

Fit a generalised linear model to data from a complex survey design, with inverse-probability weighting and with standard errors corrected for cluster sampling.

Usage

svyglm(formula, design, subset=NULL, ...)
svrepglm(formula, design, subset=NULL, ..., rho=NULL,
return.replicates=FALSE, na.action)
## S3 method for class 'svyglm':
summary(object, correlation = FALSE,  ...)

Arguments

formula

Model formula

design

Survey design from svydesign or svrepdesign. Must contain all variables in the formula

subset

Expression to select a subpopulation

...

Other arguments passed to glm or summary.glm

rho

For replicate BRR designs, specifies the parameter for Fay's variance method

return.replicates

Return the replicates as a component of the result?

object

A svyglm object

correlation

Include the correlation matrix of parameters?

na.action

Handling of NAs

Value

An object of class svyglm.

Details

In svyglm, standard errors for cluster-sampled designs are computed using a linearisation estimate (in the absence of strata this is equivalent to the Huber/White sandwich formula in GEEs). Most of these computations are done in svyCprod. In svrepglm, replicate weight methods are used.
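The sandwich form mentioned above can be sketched in base R for a weighted linear model with a single stage of clusters and no strata. This is an illustration only, not the code in svyCprod: the data are simulated, the weights `w` are hypothetical, and no finite-sample or finite-population corrections are applied.

```r
set.seed(1)

# Simulated clustered data: 50 clusters of 4 observations each
n_clusters <- 50
id <- rep(seq_len(n_clusters), each = 4)
x  <- rnorm(length(id))
u  <- rnorm(n_clusters)[id]            # effect shared within a cluster
y  <- 1 + 2 * x + u + rnorm(length(id))
w  <- runif(length(id), 1, 3)          # hypothetical sampling weights

# Weighted point estimates
fit <- lm(y ~ x, weights = w)
X   <- model.matrix(fit)
r   <- residuals(fit)                  # unweighted residuals y - X %*% beta

# "Bread": inverse of the weighted cross-product matrix
bread <- solve(crossprod(X, w * X))

# "Meat": cross-product of the cluster totals of the weighted scores
scores         <- X * (w * r)          # one row per observation
cluster_totals <- rowsum(scores, id)   # one row per cluster
meat           <- crossprod(cluster_totals)

# Sandwich variance and the resulting standard errors
V           <- bread %*% meat %*% bread
se_sandwich <- sqrt(diag(V))
se_naive    <- sqrt(diag(vcov(fit)))   # treats observations as independent

print(cbind(naive = se_naive, sandwich = se_sandwich))
```

Summing the scores within each cluster before taking the cross-product is what lets observations in the same cluster be correlated.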

There is no anova method for svyglm as the models are not fitted by maximum likelihood. The function regTermTest may be useful for testing sets of regression terms.
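A minimal sketch of testing a set of terms with regTermTest, assuming the survey package is installed and that regTermTest accepts a fitted model followed by a one-sided formula naming the terms to test. The simulated data mirror the independent-sampling example in this page; the object names `pop`, `samp`, `des`, `fit` and `tt` are illustrative.

```r
library(survey)

set.seed(42)

# Simulated population with selection probability depending on x
pop   <- data.frame(x = rnorm(1000), z = rep(0:4, 200))
pop$y <- with(pop, 3 + 3 * x * z)
pop$p <- with(pop, exp(x) / (1 + exp(x)))

# Draw the sample and describe the design (no clusters, known probabilities)
samp <- pop[rbinom(1000, 1, pop$p) == 1, ]
des  <- svydesign(ids = ~0, probs = ~p, data = samp)

# Fit the weighted model, then test x and z jointly
fit <- svyglm(y ~ x + z, design = des)
tt  <- regTermTest(fit, ~ x + z)
print(tt)
```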

Examples


## Independent sampling
  df<-data.frame(x=rnorm(1000),z=rep(0:4,200))
  df$y<-with(df, 3+3*x*z)
  df$p<-with(df, exp(x)/(1+exp(x)))
  mpop<-lm(y~x+z, data=df) 
  xi<-rbinom(1000,1,df$p)
  sdf<-df[xi==1,]

  dxi<-svydesign(~0,~p,data=sdf)
  m<-svyglm(y~x+z,family="gaussian",design=dxi)
  m1<-lm(y~x+z, data=sdf)

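  summary(m1)   ## wrong: unweighted fit ignores the sampling probabilities
  summary(m)    ## right: weighted fit with design-based standard errors
  summary(mpop) ## whole population, for comparison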

  ##cluster sampling
  df$id<-rep(1:250,each=4)
  df$clustp<-by(df,list(df$id),function(d) min(exp(d$x*d$z)/(1+exp(d$x*d$z))))[df$id]
  mpop<-lm(y~x+z, data=df) 

  xi<-rbinom(250,1,df$clustp[4*(1:250)])
  sdf<-df[xi[df$id]==1,]

  dxi<-svydesign(~id,~clustp,data=sdf)
  drep<-as.svrepdesign(dxi)
  m<-svyglm(y~x+z,family="gaussian",design=dxi)
  mr<-svrepglm(y~x+z, family="gaussian", design=drep)
  m1<-lm(y~x+z,data=sdf)
 
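  summary(m1)   ## wrong: ignores both the weights and the clustering
  summary(m)    ## right: weighted fit with cluster-corrected standard errors
  summary(mpop) ## whole population, for comparison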

  ## subsets
  msub<-svyglm(y~x+z,family="gaussian",design=dxi,subset=x>1)
  summary(msub)
  subdxi<-subset(dxi,x>1)
  msub<-svyglm(y~x+z,family="gaussian",design=subdxi)
  summary(msub)
