# svyglm

##### Survey-weighted generalised linear models.

Fit a generalised linear model to data from a complex survey design, with inverse-probability weighting and design-based standard errors.

- Keywords
- regression, survey

##### Usage

```
# S3 method for survey.design
svyglm(formula, design, subset=NULL,
family=stats::gaussian(),start=NULL, rescale=TRUE, ...)
# S3 method for svyrep.design
svyglm(formula, design, subset=NULL,
family=stats::gaussian(),start=NULL, rescale=NULL, ..., rho=NULL,
return.replicates=FALSE, na.action,multicore=getOption("survey.multicore"))
# S3 method for svyglm
summary(object, correlation = FALSE, df.resid=NULL,
...)
# S3 method for svyglm
predict(object,newdata=NULL,total=NULL,
type=c("link","response","terms"),
se.fit=(type != "terms"),vcov=FALSE,...)
# S3 method for svrepglm
predict(object,newdata=NULL,total=NULL,
type=c("link","response","terms"),
se.fit=(type != "terms"),vcov=FALSE,
return.replicates=!is.null(object$replicates),...)
```

##### Arguments

- formula
Model formula

- design
Survey design from

`svydesign`

or`svrepdesign`

. Must contain all variables in the formula- subset
Expression to select a subpopulation

- family
`family`

object for`glm`

- start
Starting values for the coefficients (needed for some uncommon link/family combinations)

- rescale
Rescaling of weights, to improve numerical stability. The default rescales weights to sum to the sample size. Use

`FALSE`

to not rescale weights. For replicate-weight designs, use`TRUE`

to rescale weights to sum to 1, as was the case before version 3.34.- …
Other arguments passed to

`glm`

or`summary.glm`

- rho
For replicate BRR designs, to specify the parameter for Fay's variance method, giving weights of

`rho`

and`2-rho`

- return.replicates
Return the replicates as the

`replicates`

component of the result? (for`predict`

, only possible if they were computed in the`svyglm`

fit)- object
A

`svyglm`

object- correlation
Include the correlation matrix of parameters?

- na.action
Handling of NAs

- multicore
Use the

`multicore`

package to distribute replicates across processors?- df.resid
Optional denominator degrees of freedom for Wald tests

- newdata
new data frame for prediction

- total
population size when predicting population total

- type
linear predictor (

`link`

) or response- se.fit
if

`TRUE`

, return variances of predictions- vcov
if

`TRUE`

and`se=TRUE`

return full variance-covariance matrix of predictions

##### Details

For binomial and Poisson families use `family=quasibinomial()`

and `family=quasipoisson()`

to avoid a warning about non-integer
numbers of successes. The `quasi' versions of the family objects give
the same point estimates and standard errors and do not give the
warning.

If `df.resid`

is not specified the df for the null model is
computed by `degf`

and the residual df computed by
subtraction. This is recommended by Korn and Graubard and is correct
for PSU-level covariates but is potentially very conservative for
individual-level covariates. To get tests based on a Normal distribution
use `df.resid=Inf`

, and to use number of PSUs-number of strata,
specify `df.resid=degf(design)`

.

Parallel processing with `multicore=TRUE`

is helpful only for
fairly large data sets and on computers with sufficient memory. It may
be incompatible with GUIs, although the Mac Aqua GUI appears to be safe.

`predict`

gives fitted values and sampling variability for specific new
values of covariates. When `newdata`

are the population mean it
gives the regression estimator of the mean, and when `newdata`

are
the population totals and `total`

is specified it gives the
regression estimator of the population total. Regression estimators of
mean and total can also be obtained with `calibrate`

.

##### Value

`svyglm`

returns an object of class `svyglm`

. The
`predict`

method returns an object of class `svystat`

##### References

Lumley T, Scott A (2017) "Fitting Regression Models to Survey Data" Statistical Science 32: 265-278

##### See Also

`glm`

, which is used to do most of the work.

`regTermTest`

, for multiparameter tests

`calibrate`

, for an alternative way to specify regression
estimators of population totals or means

`svyttest`

for one-sample and two-sample t-tests.

##### Examples

```
# NOT RUN {
data(api)
dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
dclus2<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
rstrat<-as.svrepdesign(dstrat)
rclus2<-as.svrepdesign(dclus2)
summary(svyglm(api00~ell+meals+mobility, design=dstrat))
summary(svyglm(api00~ell+meals+mobility, design=dclus2))
summary(svyglm(api00~ell+meals+mobility, design=rstrat))
summary(svyglm(api00~ell+meals+mobility, design=rclus2))
## use quasibinomial, quasipoisson to avoid warning messages
summary(svyglm(sch.wide~ell+meals+mobility, design=dstrat,
family=quasibinomial()))
## Compare regression and ratio estimation of totals
api.ratio <- svyratio(~api.stu,~enroll, design=dstrat)
pop<-data.frame(enroll=sum(apipop$enroll, na.rm=TRUE))
npop <- nrow(apipop)
predict(api.ratio, pop$enroll)
## regression estimator is less efficient
api.reg <- svyglm(api.stu~enroll, design=dstrat)
predict(api.reg, newdata=pop, total=npop)
## same as calibration estimator
svytotal(~api.stu, calibrate(dstrat, ~enroll, pop=c(npop, pop$enroll)))
## svyglm can also reproduce the ratio estimator
api.reg2 <- svyglm(api.stu~enroll-1, design=dstrat,
family=quasi(link="identity",var="mu"))
predict(api.reg2, newdata=pop, total=npop)
## higher efficiency by modelling variance better
api.reg3 <- svyglm(api.stu~enroll-1, design=dstrat,
family=quasi(link="identity",var="mu^3"))
predict(api.reg3, newdata=pop, total=npop)
## true value
sum(apipop$api.stu)
# }
```

*Documentation reproduced from package survey, version 3.37, License: GPL-2 | GPL-3*