survey (version 0.9-1)

svydesign: Survey sample analysis.

Description

Specify a complex survey design.

Usage

svydesign(ids, probs, strata = NULL, variables = NULL, data = NULL, 
    nest = FALSE, check.strata = TRUE)

Arguments

ids
Formula or data frame specifying cluster ids from largest level to smallest level; ~0 is a formula for no clusters.
probs
Formula or data frame specifying cluster sampling probabilities
strata
Formula or factor specifying strata; use NULL for no strata
variables
Formula or data frame specifying the variables measured in the survey. If NULL, the data argument is used.
data
Data frame to look up variables in the formula arguments
nest
If TRUE, relabel cluster ids to enforce nesting
check.strata
If TRUE, check that clusters are nested in strata
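The nesting that nest and check.strata refer to can be illustrated in base R. This is a sketch of the idea only, using a hypothetical data frame smp with columns stratum and id; it is not the package's own relabelling code, which this page does not show.

```r
# Hypothetical sample: cluster ids restart at 1 within each stratum.
smp <- data.frame(
  stratum = rep(c("A", "B"), each = 4),
  id      = rep(c(1, 2, 1, 2), each = 2)
)

# As labelled, cluster 1 appears in both strata, so clusters are not nested.
strata_per_cluster <- tapply(smp$stratum, smp$id,
                             function(s) length(unique(s)))
any(strata_per_cluster > 1)

# Combining stratum and id gives every cluster a label tied to one stratum.
smp$nested_id <- interaction(smp$stratum, smp$id, drop = TRUE)
nested_check <- tapply(smp$stratum, smp$nested_id,
                       function(s) length(unique(s)))
all(nested_check == 1)
```

When the ids are already unique across strata, no relabelling of this kind is needed.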

Value

  • An object of class survey.design.

Details

When analysing data from a complex survey, observations must be weighted inversely to their sampling probabilities, and the effects of stratification and of correlation induced by cluster sampling must be incorporated in standard errors.
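The weighting requirement can be seen in a small base R simulation. This is a minimal sketch under assumed data: the population, the variable names and the logistic selection probabilities are all hypothetical, and the weighted mean is computed by hand rather than with the package's functions.

```r
# Hypothetical population: y depends on x, and so does the chance of selection.
set.seed(1)
pop <- data.frame(x = rnorm(10000))
pop$y <- 3 + 2 * pop$x
pop$p <- plogis(pop$x)               # selection probability exp(x)/(1+exp(x))

# Independent sampling: each unit enters the sample with its own probability p.
smp <- pop[rbinom(nrow(pop), 1, pop$p) == 1, ]

w <- 1 / smp$p                       # weights inverse to sampling probability
naive    <- mean(smp$y)              # ignores the design
weighted <- sum(w * smp$y) / sum(w)  # weighted mean with weights 1/p

c(population = mean(pop$y), naive = naive, weighted = weighted)
```

Because units with large x are over-represented in the sample, the unweighted mean drifts away from the population mean, while the weighted mean should stay close to it. The sketch says nothing about standard errors, which also need the stratification and clustering information.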

The object returned by svydesign combines a data frame and all the survey design information needed to analyse it. These objects are used by the survey modelling and summary functions.

The dim, "[", "[<-" and na.action methods for survey.design objects operate on the data frame specified by variables and ensure that the design information is properly updated to correspond to the new data frame. With the "[<-" method the new value can be a survey.design object instead of a data frame, but only the data frame is used.
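A short sketch of the dim and "[" methods follows. It assumes the survey package is installed and uses a hypothetical data frame smp with made-up selection probabilities; printed output may differ between package versions.

```r
library(survey)

# Hypothetical sample with known selection probabilities p.
set.seed(2)
smp <- data.frame(x = rnorm(50), p = runif(50, 0.2, 0.8))

des <- svydesign(~0, ~p, data = smp)  # no clusters, probabilities taken from p
dim(des)                              # dimensions of the underlying data frame

sub <- des[1:10, ]                    # subsetting returns a design object
dim(sub)
svymean(~x, sub)
```

Subsetting the design object rather than the raw data frame keeps the design information aligned with the rows that remain.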

See Also

svyglm, svymean, svyvar, svytable, svyquantile

Examples

library(survey)

# population
df <- data.frame(x = rnorm(1000), z = rep(0:4, 200))
df$y <- with(df, 3 + 3*x*z)
# sampling probability for each unit
df$p <- with(df, exp(x)/(1 + exp(x)))
# sample: unit i is included with probability p[i]
xi <- rbinom(1000, 1, df$p)
sdf <- df[xi == 1, ]

# survey design object: independent sampling, no clusters
dxi <- svydesign(~0, ~p, data = sdf)

dxi
summary(dxi)
svymean(sdf$x, dxi)
svymean(~x, dxi)
svytable(~z, dxi)

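# cluster sampling: population of 250 clusters of 4 units each
df$id <- rep(1:250, each = 4)
# cluster sampling probability: smallest unit-level value within the cluster
df$clustp <- by(df, list(df$id), function(d) min(exp(d$x*d$z)/(1 + exp(d$x*d$z))))[df$id]
# draw clusters; 4*(1:250) picks one row per cluster
xi <- rbinom(250, 1, df$clustp[4*(1:250)])
sdf <- df[xi[df$id] == 1, ]

# cluster sampling design
dxi <- svydesign(~id, ~clustp, data = sdf)

dxi
summary(dxi)
svymean(~x + z, dxi)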

## stratification: 4 strata of 200 units, 20 sampled from each
df <- data.frame(z = rep(1:4, each = 200), y = rnorm(800, rep(1:4, each = 200)))
xi <- c(sample(1:200, 20), sample(201:400, 20), sample(401:600, 20), sample(601:800, 20))
sdf <- df[xi, ]
# ~0 for both id and prob: no cluster variable and no probability variable
stratdx <- svydesign(id = ~0, prob = ~0, strata = ~z, data = sdf)
unstrat <- svydesign(id = ~0, prob = ~0, data = sdf)
stratdx
unstrat
summary(stratdx)

svymean(~y, stratdx)  ## higher precision
svymean(~y, unstrat)
