Compute survey statistics on subsets of a survey defined by factors.
svyby(formula, by, design, ...)
# S3 method for default
svyby(formula, by, design, FUN, ..., deff=FALSE,keep.var = TRUE,
keep.names = TRUE,verbose=FALSE, vartype=c("se","ci","ci","cv","cvpct","var"),
drop.empty.groups=TRUE, covmat=FALSE, return.replicates=FALSE,
na.rm.by=FALSE, na.rm.all=FALSE, stringsAsFactors=TRUE,
multicore=getOption("survey.multicore"))
# S3 method for survey.design2
svyby(formula, by, design, FUN, ..., deff=FALSE,keep.var = TRUE,
keep.names = TRUE,verbose=FALSE, vartype=c("se","ci","ci","cv","cvpct","var"),
drop.empty.groups=TRUE, covmat=FALSE, influence=covmat,
na.rm.by=FALSE, na.rm.all=FALSE, stringsAsFactors=TRUE,
multicore=getOption("survey.multicore"))
# S3 method for svyby
SE(object,...)
# S3 method for svyby
deff(object,...)
# S3 method for svyby
coef(object,...)
# S3 method for svyby
confint(object, parm, level = 0.95,df =Inf,...)
unwtd.count(x, design, ...)
svybys(formula, bys, design, FUN, ...)
An object of class "svyby": a data frame showing the factors and the results of FUN.
For unwtd.count, the unweighted number of non-missing observations in the data matrix specified by x for the design.
A formula specifying the variables to pass to FUN (or a matrix, data frame, or vector)
A formula specifying factors that define subsets, or a list of factors.
A svydesign or svrepdesign object
A function taking a formula and survey design object as its first two arguments.
Other arguments to FUN. NOTE: if any of the names of these are partial matches to formula, by, or design, you must specify the formula, by, or design argument by name, not just by position.
Request a design effect from FUN
If FUN returns a svystat object, extract standard errors from it
Define row names based on the subsets
If TRUE, print a label for each subset as it is processed.
Report variability as one or more of standard error, confidence interval, coefficient of variation, percent coefficient of variation, or variance
If FALSE, report NA for empty groups; if TRUE, drop them from the output
If TRUE, omit groups defined by NA values of the by variables.
If TRUE, check for groups with no non-missing observations for variables defined by formula and treat these groups as empty. This makes little sense without na.rm=TRUE.
If TRUE, compute covariances between estimates for different subsets. Allows svycontrast to be used on output. Requires that FUN supports either return.replicates=TRUE or influence=TRUE
Only for replicate-weight designs. If TRUE, return all the replicates as the "replicates" attribute of the result
Return the influence functions of the result
Use multicore package to distribute subsets over multiple processors?
Convert any string variables in formula to factors before calling FUN, so that the factor levels will be the same in all groups (See Note below). Potentially slow.
a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered.
the confidence level required.
degrees of freedom for the t-distribution in the confidence interval; use degf(design) for the number of PSUs minus the number of strata
An object of class "svyby"
one-sided formula with each term specifying a grouping (rather than being combined to give a grouping)
The variance type "ci" asks for confidence intervals, which are produced by confint. In some cases additional options to FUN will be needed to produce confidence intervals, for example, svyquantile needs ci=TRUE or keep.var=FALSE.
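As a rough illustration of the paragraph above, the following sketch requests intervals through vartype="ci" and, separately, calls confint on a result that carries standard errors. It reuses the api data and the dclus1 design from the examples on this page; the object names by_ci, by_se, ci and by_q are made up for illustration.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Ask svyby to report confidence intervals directly
by_ci <- svyby(~api99, ~stype, dclus1, svymean, vartype = "ci")

# Keep the default standard errors and compute intervals afterwards
by_se <- svyby(~api99, ~stype, dclus1, svymean)
ci <- confint(by_se)

# svyquantile needs ci=TRUE before intervals can be reported
by_q <- svyby(~api99, ~stype, dclus1, svyquantile, quantiles = 0.5,
              ci = TRUE, vartype = "ci")
```

Calling confint separately also lets you pass df, for example df=degf(dclus1), instead of relying on the default of df=Inf.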
unwtd.count is designed to be passed to svyby to report the number of non-missing observations in each subset. Observations with exactly zero weight will also be counted as missing, since that's how subsets are implemented for some designs.
Parallel processing with multicore=TRUE is useful only for fairly large problems and on computers with sufficient memory. The multicore package is incompatible with some GUIs, although the Mac Aqua GUI appears to be safe.
The variant svybys creates a separate table for each term in bys rather than creating a joint table.
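The difference between the joint table and the separate tables can be sketched as follows, again using the api data and the dclus1 design from the examples on this page. The names joint and separate are illustrative only, and the sketch assumes that svybys returns one table per term of bys, as described above.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Joint table: one row per observed combination of comp.imp and sch.wide
joint <- svyby(~api00, ~comp.imp + sch.wide, design = dclus1, svymean)

# Separate tables: one for comp.imp and one for sch.wide
separate <- svybys(~api00, ~comp.imp + sch.wide, design = dclus1, svymean)
```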
svytable and ftable.svystat for contingency tables, ftable.svyby for pretty-printing of svyby output
data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
svyby(~api99, ~stype, dclus1, svymean)
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5,ci=TRUE,vartype="ci")
## without ci=TRUE svyquantile does not compute standard errors
svyby(~api99, ~stype, dclus1, svyquantile, quantiles=0.5, keep.var=FALSE)
svyby(~api99, list(school.type=apiclus1$stype), dclus1, svymean)
svyby(~api99+api00, ~stype, dclus1, svymean, deff=TRUE,vartype="ci")
svyby(~api99+api00, ~stype+sch.wide, dclus1, svymean, keep.var=FALSE)
## report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, dclus1, unwtd.count, keep.var=FALSE)
rclus1<-as.svrepdesign(dclus1)
svyby(~api99, ~stype, rclus1, svymean)
svyby(~api99, ~stype, rclus1, svyquantile, quantiles=0.5)
svyby(~api99, list(school.type=apiclus1$stype), rclus1, svymean, vartype="cv")
svyby(~enroll,~stype, rclus1,svytotal, deff=TRUE)
svyby(~api99+api00, ~stype+sch.wide, rclus1, svymean, keep.var=FALSE)
##report raw number of observations
svyby(~api99+api00, ~stype+sch.wide, rclus1, unwtd.count, keep.var=FALSE)
## comparing subgroups using covmat=TRUE
mns<-svyby(~api99, ~stype, rclus1, svymean,covmat=TRUE)
vcov(mns)
svycontrast(mns, c(E = 1, M = -1))
str(svyby(~api99, ~stype, rclus1, svymean,return.replicates=TRUE))
tots<-svyby(~enroll, ~stype, dclus1, svytotal,covmat=TRUE)
vcov(tots)
svycontrast(tots, quote(E/H))
## comparing subgroups uses the delta method unless replicates are present
meanlogs<-svyby(~log(enroll),~stype,svymean, design=rclus1,covmat=TRUE)
svycontrast(meanlogs, quote(exp(E-H)))
meanlogs<-svyby(~log(enroll),~stype,svymean, design=rclus1,covmat=TRUE,return.replicates=TRUE)
svycontrast(meanlogs, quote(exp(E-H)))
## extractor functions
(a<-svyby(~enroll, ~stype, rclus1, svytotal, deff=TRUE, verbose=TRUE,
vartype=c("se","cv","cvpct","var")))
deff(a)
SE(a)
cv(a)
coef(a)
confint(a, df=degf(rclus1))
## ratio estimates
svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio)
ratios<-svyby(~api.stu, by=~stype, denominator=~enroll, design=dclus1, svyratio,covmat=TRUE)
vcov(ratios)
## empty groups
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean)
svyby(~api00,~comp.imp+sch.wide,design=dclus1,svymean,drop.empty.groups=FALSE)
## Multiple tables
svybys(~api00,~comp.imp+sch.wide,design=dclus1,svymean)