svyquantile: Summary statistics for sample surveys

Description

Compute means, variances, quantiles, and cross-tabulations for data from complex surveys.

Usage

svyquantile(x, design, quantiles, method = "linear", f = 1)
svymean(x, design, na.rm=FALSE) 
svrepmean(x, design, na.rm=FALSE, rho=NULL, return.replicates=FALSE) 
svyvar(x, design, na.rm=FALSE) 
svytotal(x, design, na.rm=FALSE) 
svreptotal(x, design, na.rm=FALSE, rho=NULL, return.replicates=FALSE) 
svytable(formula, design, Ntotal = design$fpc, round = FALSE)
svreptable(formula, design, Ntotal = sum(weights(design, "sampling"))), round = FALSE)

Arguments

A formula, vector or matrix

design

survey.design object

quantiles

Quantiles to estimate

method

see approxfun

na.rm

Should missing values be removed?

formula

A one-sided formula specifying variables to be tabulated

Ntotal

A population total or set of population stratum totals to normalise to.

round

Should the table entries be rounded to the nearest integer?

rho

parameter for Fay's variance estimator in a BRR design

return.replicates

Return the replicate means?

Value

The first three functions return vectors, the last returns an xtabs object.

Details

These functions perform weighted estimation, with each observation being weighted by the inverse of its sampling probability. The svymean and svyvar functions also give precision estimates that incorporate the effects of stratification and clustering. The first four functions are similar to the standard functions whose names do not begin with svy.

The svytotal and svreptotal functions estimate a population total. Use predict on svyratio, svrepratio, svyglm, svrepglm to get ratio or regression estimates of totals.

The svytable and svreptable function computes a weighted crosstabulation. If the sampling probabilities supplied to svydesign were actual probabilities (rather than relative probabilities) this estimates a full population crosstabulation. Otherwise it estimates only relative proportions and should be normalised to some convenient total such as 100 or 1.0 by specifying the Ntotal argument.

The Ntotal argument can be either a single number or a data frame whose first column is the sampling strata and second column the population size in each stratum. In this second case the svytable command performs `post-stratification': tabulating and scaling to the population within strata and then adding up the strata.

As with other xtabs objects, the output of svytable can be processed by ftable for more attractive display.

Examples

Run this code

#population
  df<-data.frame(x=rnorm(1000),z=rep(0:4,200))
  df$y<-with(df, 3+3*x*z)
  #sampling fraction
  df$p<-with(df, exp(x)/(1+exp(x)))
  #sample
  xi<-rbinom(1000,1,df$p)
  sdf<-df[xi==1,]
  
  #survey design object: independent sampling, 
  dxi<-svydesign(~0,~p,data=sdf)
  dxi

  mean(df$x)		#right
  mean(sdf$x)		#wrong
  svymean(~x,dxi)	#right

  var(df$x)		#right
  var(sdf$x)		#wrong
  svyvar(~x,dxi)	#right

  quantile(df$x,c(0.025,0.5,0.975))  #right
  quantile(sdf$x,c(0.025,0.5,0.975))  #wrong
  svyquantile(~x,design=dxi,c(0.025,0.5,0.975))  #right

  table(sdf$z)  # sample table
  svytable(~z, dxi, round=TRUE) # estimated population table

  data(scd)
  repweights<-2*cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
              c(0,1,0,1,1,0))
  scdrep<-svrepdesign(data=scd, type="BRR", repweights=repweights)
  svrepmean(~arrests+alive, design=scdrep)

Run the code above in your browser using DataLab