survey (version 4.0)

# svyby: Survey statistics on subsets

## Description

Compute survey statistics on subsets of a survey defined by factors.

## Usage

```
svyby(formula, by, design, ...)
# S3 method for default
svyby(formula, by, design, FUN, ..., deff=FALSE, keep.var=TRUE,
      keep.names=TRUE, verbose=FALSE, vartype=c("se","ci","ci","cv","cvpct","var"),
      drop.empty.groups=TRUE, covmat=FALSE, return.replicates=FALSE,
      na.rm.by=FALSE, na.rm.all=FALSE,
      multicore=getOption("survey.multicore"))
# S3 method for survey.design2
svyby(formula, by, design, FUN, ..., deff=FALSE, keep.var=TRUE,
      keep.names=TRUE, verbose=FALSE, vartype=c("se","ci","ci","cv","cvpct","var"),
      drop.empty.groups=TRUE, covmat=FALSE, influence=covmat,
      na.rm.by=FALSE, na.rm.all=FALSE, multicore=getOption("survey.multicore"))
# S3 method for svyby
SE(object, ...)
# S3 method for svyby
deff(object, ...)
# S3 method for svyby
coef(object, ...)
# S3 method for svyby
confint(object, parm, level=0.95, df=Inf, ...)
unwtd.count(x, design, ...)
```

## Arguments

formula,x

A formula specifying the variables to pass to `FUN` (or a matrix, data frame, or vector)

by

A formula specifying factors that define subsets, or a list of factors.

design

A `svydesign` or `svrepdesign` object

FUN

A function taking a formula and survey design object as its first two arguments.

...

Other arguments to `FUN`

deff

Request a design effect from `FUN`

keep.var

If `FUN` returns a `svystat` object, extract standard errors from it

keep.names

Define row names based on the subsets

verbose

If `TRUE`, print a label for each subset as it is processed.

vartype

Report variability as one or more of standard error, confidence interval, coefficient of variation, percent coefficient of variation, or variance

drop.empty.groups

If `FALSE`, report `NA` for empty groups; if `TRUE`, drop them from the output.

na.rm.by

If `TRUE`, omit groups defined by `NA` values of the `by` variables

na.rm.all

If `TRUE`, check for groups with no non-missing observations for the variables defined by `formula` and treat these groups as empty

covmat

If `TRUE`, compute covariances between estimates for different subsets. Allows `svycontrast` to be used on output. Requires that `FUN` supports either `return.replicates=TRUE` or `influence=TRUE`

return.replicates

Only for replicate-weight designs. If `TRUE`, return all the replicates as the "replicates" attribute of the result

influence

Return the influence functions of the result

multicore

Use the `multicore` package to distribute subsets over multiple processors?

parm

A specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered.

level

The confidence level required.

df

Degrees of freedom for the t-distribution used in the confidence interval. Use `degf(design)` for the number of PSUs minus the number of strata.

object

An object of class `"svyby"`

## Value

An object of class `"svyby"`: a data frame showing the factors and the results of `FUN`.

For `unwtd.count`, the unweighted number of non-missing observations in the data matrix specified by `x` for the design.
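As a minimal sketch of what the returned object looks like, the following uses the `api` data shipped with the package and the same one-stage cluster design (`dclus1`) as the Examples section. The object name `means` is only illustrative.

```r
library(survey)
data(api)

# One-stage cluster sample of school districts, as in the Examples section
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Mean of api99 within each school type
means <- svyby(~api99, ~stype, dclus1, svymean)

class(means)   # includes both "svyby" and "data.frame"
names(means)   # the by-factor, the estimate, and its standard error
means$api99    # columns can be read like any data frame column
coef(means)    # the same estimates, named by subset
SE(means)      # the matching standard errors
```

Because the result is a data frame, it can be passed to ordinary data-frame tools, while `coef` and `SE` remain the more robust way to pull out estimates and standard errors.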

## Details

The variance type "ci" asks for confidence intervals, which are produced by `confint`. In some cases additional options to `FUN` will be needed to produce confidence intervals, for example, `svyquantile` needs `ci=TRUE` or `keep.var=FALSE`.
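The relationship between `vartype="ci"` and `confint` can be sketched as follows, again assuming the `api` data and the `dclus1` design from the Examples section. The object names `est_ci` and `est_se` are illustrative. The comparison relies on both routes using the default `df=Inf`, which is an assumption about the defaults rather than something stated above.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Ask svyby for confidence limits directly
est_ci <- svyby(~api99, ~stype, dclus1, svymean, vartype = "ci")
est_ci

# Or keep the standard errors and call confint() afterwards
est_se <- svyby(~api99, ~stype, dclus1, svymean)
confint(est_se)

# A t-based interval using the design degrees of freedom is wider
confint(est_se, df = degf(dclus1))
```

Requesting standard errors and calling `confint` afterwards keeps the choice of `level` and `df` open, which is convenient when several interval types are needed from one set of estimates.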

`unwtd.count` is designed to be passed to `svyby` to report the number of non-missing observations in each subset. Observations with exactly zero weight will also be counted as missing, since that's how subsets are implemented for some designs.
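A small check of this behaviour, assuming the `api` data and the `dclus1` design from the Examples section: the per-subset counts from `unwtd.count` should agree with a plain tabulation of the non-missing rows. The names `counts` and `raw` are illustrative, and the count is read from the second column by position because the column name is not specified above.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Unweighted number of non-missing api99 values within each school type
counts <- svyby(~api99, ~stype, dclus1, unwtd.count, keep.var = FALSE)
counts

# The same counts computed directly from the underlying data frame
raw <- table(apiclus1$stype[!is.na(apiclus1$api99)])
raw

# The second column of the svyby result holds the counts
all(counts[[2]] == as.vector(raw))
```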

Parallel processing with `multicore=TRUE` is useful only for fairly large problems and on computers with sufficient memory. The `multicore` package is incompatible with some GUIs, although the Mac Aqua GUI appears to be safe.

## See Also

`svytable` and `ftable.svystat` for contingency tables; `ftable.svyby` for pretty-printing of `svyby` results.

## Examples

```
library(survey)
data(api)
dclus1 <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svyby(~api99, ~stype, dclus1, svymean)
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5, ci=TRUE, vartype="ci")
## without ci=TRUE svyquantile does not compute standard errors
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5, keep.var=FALSE)
svyby(~api99, list(school.type=apiclus1$stype), dclus1, svymean)
svyby(~api99+api00, ~stype, dclus1, svymean, deff=TRUE, vartype="ci")
svyby(~api99+api00, ~stype+sch.wide, dclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, dclus1, unwtd.count, keep.var=FALSE)

## the same analyses with a replicate-weight design
rclus1 <- as.svrepdesign(dclus1)

svyby(~api99, ~stype, rclus1, svymean)
svyby(~api99, ~stype, rclus1, svyquantile, quantiles=0.5)
svyby(~api99, list(school.type=apiclus1$stype), rclus1, svymean, vartype="cv")
svyby(~enroll, ~stype, rclus1, svytotal, deff=TRUE)
svyby(~api99+api00, ~stype+sch.wide, rclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, rclus1, unwtd.count, keep.var=FALSE)

## comparing subgroups using covmat=TRUE
mns <- svyby(~api99, ~stype, rclus1, svymean, covmat=TRUE)
vcov(mns)
svycontrast(mns, c(E = 1, M = -1))

str(svyby(~api99, ~stype, rclus1, svymean, return.replicates=TRUE))

tots <- svyby(~enroll, ~stype, dclus1, svytotal, covmat=TRUE)
vcov(tots)
svycontrast(tots, quote(E/H))

## extractor functions
(a <- svyby(~enroll, ~stype, rclus1, svytotal, deff=TRUE, verbose=TRUE,
            vartype=c("se","cv","cvpct","var")))
deff(a)
SE(a)
cv(a)
coef(a)
confint(a, df=degf(rclus1))

## ratio estimates
svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio)

ratios <- svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio, covmat=TRUE)
vcov(ratios)

## empty groups
svyby(~api00, ~comp.imp+sch.wide, design=dclus1, svymean)
svyby(~api00, ~comp.imp+sch.wide, design=dclus1, svymean, drop.empty.groups=FALSE)
```
