surveysummary: Summary statistics for sample surveys

Description

Compute means, variances, ratios and totals for data from complex surveys.

Usage

## S3 method for class 'survey.design':
svymean(x, design, na.rm=FALSE,deff=FALSE,...) 
## S3 method for class 'twophase':
svymean(x, design, na.rm=FALSE,deff=FALSE,...) 
## S3 method for class 'svyrep.design':
svymean(x, design, na.rm=FALSE, rho=NULL,
  return.replicates=FALSE, deff=FALSE,...) 
## S3 method for class 'survey.design':
svyvar(x, design, na.rm=FALSE,...) 
## S3 method for class 'svyrep.design':
svyvar(x, design, na.rm=FALSE, rho=NULL,
   return.replicates=FALSE,...,estimate.only=FALSE) 
## S3 method for class 'survey.design':
svytotal(x, design, na.rm=FALSE,deff=FALSE,...) 
## S3 method for class 'twophase':
svytotal(x, design, na.rm=FALSE,deff=FALSE,...) 
## S3 method for class 'svyrep.design':
svytotal(x, design, na.rm=FALSE, rho=NULL,
   return.replicates=FALSE, deff=FALSE,...)
## S3 method for class 'svystat':
coef(object,...)
## S3 method for class 'svrepstat':
coef(object,...)
## S3 method for class 'svystat':
vcov(object,...)
## S3 method for class 'svrepstat':
vcov(object,...)
cv(object,...)
deff(object, quietly=FALSE,...)
make.formula(names)

Arguments

x

A formula, vector or matrix

design

survey.design or svyrep.design object

na.rm

Should cases with missing values be dropped?

rho

parameter for Fay's variance estimator in a BRR design

return.replicates

Return the replicate means?

deff

Return the design effect (see below)

object

The result of one of the other survey summary functions

quietly

Don't warn when there is no design effect computed

estimate.only

Don't compute standard errors (useful when svyvar is used to estimate the design effect)

...

additional arguments to cv methods, not currently used

names

vector of character strings

Value

Objects of class "svystat" or "svrepstat", which are vectors with a "var" attribute giving the variance and a "statistic" attribute giving the name of the statistic.

Details

These functions perform weighted estimation, with each observation being weighted by the inverse of its sampling probability. Except for the table functions, these also give precision estimates that incorporate the effects of stratification and clustering.
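The weighting idea can be illustrated in base R. This is a minimal sketch on a hypothetical toy sample (the vectors y and p are made up for illustration), not the package's implementation, and it produces point estimates only; the precision estimates described above additionally need the stratification and clustering information held in a design object.

```r
# Hypothetical toy sample: y is the measured value, p the sampling probability
y <- c(10, 20, 30, 40)
p <- c(0.1, 0.1, 0.5, 0.5)

w <- 1 / p                         # weight = inverse of the sampling probability

total_hat <- sum(w * y)            # weighted estimate of the population total
mean_hat  <- sum(w * y) / sum(w)   # weighted estimate of the population mean

total_hat
mean_hat
```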

Factor variables are converted to sets of indicator variables for each category in computing means and totals. Combining this with the interaction function allows crosstabulations. See ftable.svystat for formatting the output.
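As a rough base-R illustration of the indicator-variable idea (a sketch on hypothetical data, not the code the package runs), the weighted mean of each indicator column is the estimated proportion in that category, and interaction builds one category per cell of a crosstabulation:

```r
# Hypothetical toy data: two factors and a vector of sampling weights
stype <- factor(c("E", "E", "H", "M", "E", "M"))
comp  <- factor(c("No", "Yes", "Yes", "No", "Yes", "No"))
w     <- c(2, 2, 5, 3, 2, 3)

# One 0/1 indicator column per level of the factor
ind <- model.matrix(~ stype - 1)

# Weighted mean of each indicator = estimated proportion in that level
props <- colSums(w * ind) / sum(w)

# interaction() creates one level per (stype, comp) cell,
# so the same calculation yields a weighted crosstabulation
cell       <- interaction(stype, comp)
cell_ind   <- model.matrix(~ cell - 1)
cell_props <- colSums(w * cell_ind) / sum(w)

props
cell_props
```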

With na.rm=TRUE, all cases with missing data are removed. With na.rm=FALSE, cases with missing data are not removed and so will produce missing results. When using replicate weights and na.rm=FALSE, it may be useful to set options(na.action="na.pass"); otherwise all replicates with any missing results will be discarded.
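The effect of na.rm on a point estimate can be mimicked in base R. The helper wmean below is hypothetical and exists only for this illustration; it does not reproduce how the package handles missing data in its variance calculations.

```r
# Hypothetical helper: weighted mean with an na.rm switch
wmean <- function(y, w, na.rm = FALSE) {
  if (na.rm) {
    keep <- !is.na(y)      # drop the cases with a missing value
    y <- y[keep]
    w <- w[keep]
  }
  sum(w * y) / sum(w)
}

y <- c(10, NA, 30, 40)     # toy data with one missing value
w <- c(10, 10, 2, 2)       # toy sampling weights

res_keep <- wmean(y, w)                # missing value propagates to the result
res_drop <- wmean(y, w, na.rm = TRUE)  # computed from the complete cases only
```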

The svytotal and svreptotal functions estimate a population total. Use predict on svyratio and svyglm to get ratio or regression estimates of totals.

The design effect compares the variance of a mean or total to the variance from a study of the same size using simple random sampling without replacement. Note that the design effect will be incorrect if the weights have been rescaled so that they are not reciprocals of sampling probabilities. To obtain an estimate of the design effect comparing to simple random sampling with replacement, which does not have this requirement, use deff="replace". This with-replacement design effect is the square of Kish's "deft".
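The arithmetic behind the two design effects for a mean can be sketched as follows. All numbers are hypothetical, and the textbook simple-random-sampling variance formulas are used as an assumption about the comparison; this is not the package's internal computation.

```r
# Hypothetical inputs
n        <- 200      # sample size
N        <- 6000     # population size
v_design <- 95       # design-based variance of the estimated mean
s2       <- 16000    # estimated population variance of the variable

# Variance of the mean under simple random sampling of the same size
v_srs_wor <- (1 - n / N) * s2 / n   # without replacement
v_srs_wr  <- s2 / n                 # with replacement

deff_wor <- v_design / v_srs_wor    # analogue of deff=TRUE
deff_wr  <- v_design / v_srs_wr     # analogue of deff="replace"

deft <- sqrt(deff_wr)               # Kish's "deft"
```

The without-replacement comparison needs N, which is why it depends on the weights being reciprocals of sampling probabilities; the with-replacement version uses only s2 and n.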

The cv function computes the coefficient of variation of a statistic such as ratio, mean or total. The default method is for any object with methods for SE and coef.
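The coefficient of variation is the standard error divided by the estimate. The sketch below uses a hypothetical helper, cv_sketch, that takes the standard error from vcov rather than from the package's SE generic (an assumption made so that the example runs in base R), applied to an intercept-only lm fit on the built-in cars data.

```r
# Hypothetical helper: standard error divided by estimate,
# for any object that has coef() and vcov() methods
cv_sketch <- function(object) {
  se <- sqrt(diag(as.matrix(vcov(object))))
  se / coef(object)
}

# Intercept-only model: the coefficient is the sample mean of dist
fit <- lm(dist ~ 1, data = cars)
cv_fit <- cv_sketch(fit)
cv_fit
```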

make.formula makes a formula from a vector of names. This is useful because formulas are the best way to specify variables to the survey functions.
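For comparison, base R can build the same kind of one-sided formula from a character vector. This sketch uses reformulate and as.formula from the stats package, not make.formula itself; the variable names are taken from the api examples.

```r
vars <- c("api00", "api99", "enroll")

# One-sided formula from a character vector of names
f1 <- reformulate(vars)

# The same formula assembled by pasting the names together
f2 <- as.formula(paste("~", paste(vars, collapse = " + ")))

f1
f2
```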

Examples


library(survey)
data(api)

  ## one-stage cluster sample
  dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
  summary(dclus1)
  svymean(~api00, dclus1, deff=TRUE)
  svymean(~factor(stype),dclus1)
  svymean(~interaction(stype, comp.imp), dclus1)
  svyquantile(~api00, dclus1, c(.25,.5,.75))
  svyvar(~api00, dclus1)
  svytotal(~enroll, dclus1, deff=TRUE)
  svyratio(~api.stu, ~enroll, dclus1)

  # stratified sample
  dstrat<-svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
  summary(dstrat)
  svymean(~api00, dstrat)
  svyquantile(~api00, dstrat, c(.25,.5,.75))
  svyvar(~api00, dstrat)
  svytotal(~enroll, dstrat)
  svyratio(~api.stu, ~enroll, dstrat)
  
  # replicate weights - jackknife (this is slow)
  jkstrat<-as.svrepdesign(dstrat)
  summary(jkstrat)
  svymean(~api00, jkstrat)
  svymean(~factor(stype),jkstrat)
  svyvar(~api00,jkstrat)
  svyquantile(~api00, jkstrat, c(.25,.5,.75))
  svytotal(~enroll, jkstrat)
  svyratio(~api.stu, ~enroll, jkstrat)

  # coefficients of variation
  cv(svytotal(~enroll,dstrat))
  cv(svyratio(~api.stu, ~enroll, jkstrat))

  # extracting statistic and variance
  coef(svytotal(~enroll,dstrat))
  vcov(svymean(~api00+api99,jkstrat))

  # Design effect
  svymean(~api00, dstrat, deff=TRUE)
  svymean(~api00, dstrat, deff="replace")
  svymean(~api00, jkstrat, deff=TRUE)
  svymean(~api00, jkstrat, deff="replace")
  (a<-svytotal(~enroll, dclus1, deff=TRUE))
  deff(a)
