survey (version 3.31-2)

svyby: Survey statistics on subsets

Description

Compute survey statistics on subsets of a survey defined by factors.

Usage

svyby(formula, by, design, ...)
## Default S3 method:
svyby(formula, by, design, FUN, ..., deff=FALSE, keep.var=TRUE, keep.names=TRUE,
      verbose=FALSE, vartype=c("se","ci","ci","cv","cvpct","var"),
      drop.empty.groups=TRUE, covmat=FALSE, return.replicates=FALSE,
      na.rm.by=FALSE, na.rm.all=FALSE, multicore=getOption("survey.multicore"))
## S3 method for class 'svyby'
SE(object, ...)
## S3 method for class 'svyby'
deff(object, ...)
## S3 method for class 'svyby'
coef(object, ...)
## S3 method for class 'svyby'
confint(object, parm, level=0.95, df=Inf, ...)
unwtd.count(x, design, ...)

Arguments

formula,x
A formula specifying the variables to pass to FUN (or a matrix, data frame, or vector)
by
A formula specifying factors that define subsets, or a list of factors.
design
A svydesign or svrepdesign object
FUN
A function taking a formula and survey design object as its first two arguments.
...
Other arguments to FUN
deff
Request a design effect from FUN
keep.var
If FUN returns a svystat object, extract standard errors from it
keep.names
Define row names based on the subsets
verbose
If TRUE, print a label for each subset as it is processed.
vartype
Report variability as one or more of standard error, confidence interval, coefficient of variation, percent coefficient of variation, or variance
drop.empty.groups
If FALSE, report NA for empty groups; if TRUE, drop them from the output.
na.rm.by
If TRUE, omit groups defined by NA values of the by variables.
na.rm.all
If TRUE, check for groups with no non-missing observations for the variables defined by formula, and treat these groups as empty.
covmat
If TRUE, compute covariances between estimates for different subsets (currently only for replicate-weight designs). Allows svycontrast to be used on output.
return.replicates
Only for replicate-weight designs. If TRUE, return all the replicates as the "replicates" attribute of the result
multicore
Use multicore package to distribute subsets over multiple processors?
parm
A specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered.
level
The confidence level required.
df
Degrees of freedom for the t-distribution used in the confidence interval. Use degf(design) for the number of PSUs minus the number of strata.
object
An object of class "svyby"

Value

An object of class "svyby": a data frame showing the factors and the results of FUN.

For unwtd.count, the unweighted number of non-missing observations in the data matrix specified by x for the design.
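As a minimal sketch of what the returned object looks like, the code below builds the same cluster design used in the Examples and inspects the result. The object names by_type and n_total are illustrative only and do not come from the package.

```r
library(survey)
data(api)

# Same one-stage cluster design as in the Examples section
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Mean of api99 within each school type
by_type <- svyby(~api99, ~stype, dclus1, svymean)

class(by_type)   # includes "svyby" and "data.frame"
coef(by_type)    # one estimate per subset
SE(by_type)      # matching standard errors

# unwtd.count can also be called directly on the whole design
n_total <- unwtd.count(~api99, dclus1)
coef(n_total)
```

Because the result is also a data frame, ordinary data frame tools such as subsetting and merging can be applied to it.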

Details

The variance type "ci" asks for confidence intervals, which are produced by confint. In some cases additional options to FUN will be needed to produce confidence intervals; for example, svyquantile needs ci=TRUE or keep.var=FALSE.
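The two routes to confidence intervals described above can be sketched as follows: requesting them inside svyby with vartype="ci", or calling confint on the result afterwards, optionally with t-based degrees of freedom. The names means_ci, z_ci and t_ci are illustrative, and the design is the one from the Examples.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Route 1: ask svyby to report intervals as extra columns
means_ci <- svyby(~api99, ~stype, dclus1, svymean, vartype = "ci")
names(means_ci)   # the interval limits appear as ci_l and ci_u

# Route 2: keep standard errors and call confint afterwards
means_se <- svyby(~api99, ~stype, dclus1, svymean)
z_ci <- confint(means_se)                      # default df = Inf (normal-based)
t_ci <- confint(means_se, df = degf(dclus1))   # t-based, using design df

# The t-based intervals are wider than the normal-based ones
cbind(z_width = z_ci[, 2] - z_ci[, 1],
      t_width = t_ci[, 2] - t_ci[, 1])
```

With only a handful of clusters, as in this design, the t-based intervals from the second route are noticeably wider, which is the reason for passing df=degf(design).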

unwtd.count is designed to be passed to svyby to report the number of non-missing observations in each subset. Observations with exactly zero weight will also be counted as missing, since that's how subsets are implemented for some designs.
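To illustrate the "non-missing" part of this behaviour, the sketch below blanks out a few values in a copy of the example data and compares the counts from svyby with a manual tally. The copy api_na, the choice of rows 1 to 5, and the other object names are assumptions made only for this illustration.

```r
library(survey)
data(api)

# Copy the example data and set a few api99 values to missing
api_na <- apiclus1
api_na$api99[1:5] <- NA

d_na <- svydesign(id = ~dnum, weights = ~pw, data = api_na, fpc = ~fpc)

# Unweighted number of non-missing api99 values per school type
counts <- svyby(~api99, ~stype, d_na, unwtd.count)
counts

# Manual tally of non-missing values per school type, for comparison
expected <- tapply(!is.na(api_na$api99), api_na$stype, sum)
expected
```

The counts are sample sizes, not population estimates, so they are useful for spotting subsets that are too small to support reliable estimates.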

Parallel processing with multicore=TRUE is useful only for fairly large problems and on computers with sufficient memory. The multicore package is incompatible with some GUIs, although the Mac Aqua GUI appears to be safe.

See Also

svytable and ftable.svystat for contingency tables, ftable.svyby for pretty-printing of svyby output.

Examples

library(survey)
data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svyby(~api99, ~stype, dclus1, svymean)
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5,ci=TRUE,vartype="ci")
## without ci=TRUE svyquantile does not compute standard errors
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5, keep.var=FALSE)
svyby(~api99, list(school.type=apiclus1$stype), dclus1, svymean)
svyby(~api99+api00, ~stype, dclus1, svymean, deff=TRUE,vartype="ci")
svyby(~api99+api00, ~stype+sch.wide, dclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, dclus1, unwtd.count, keep.var=FALSE)

rclus1<-as.svrepdesign(dclus1)

svyby(~api99, ~stype, rclus1, svymean)
svyby(~api99, ~stype, rclus1, svyquantile, quantiles=0.5)
svyby(~api99, list(school.type=apiclus1$stype), rclus1, svymean, vartype="cv")
svyby(~enroll,~stype, rclus1,svytotal, deff=TRUE)
svyby(~api99+api00, ~stype+sch.wide, rclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, rclus1, unwtd.count, keep.var=FALSE)

## comparing subgroups using covmat=TRUE
mns<-svyby(~api99, ~stype, rclus1, svymean,covmat=TRUE)
vcov(mns)
svycontrast(mns, c(E = 1, M = -1))

str(svyby(~api99, ~stype, rclus1, svymean,return.replicates=TRUE))


## extractor functions
(a<-svyby(~enroll, ~stype, rclus1, svytotal, deff=TRUE, verbose=TRUE, 
  vartype=c("se","cv","cvpct","var")))
deff(a)
SE(a)
cv(a)
coef(a)
confint(a, df=degf(rclus1))

## ratio estimates
svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio)

## empty groups
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean)
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean,drop.empty.groups=FALSE)