synthpop (version 1.2-1)

summary.fit.synds: Inference from synthetic data

Description

Combines the results of models fitted to each of the m synthetic data sets.

Usage

## S3 method for class 'fit.synds'
summary(object, population.inference = FALSE, msel = NULL, partly = FALSE, ...)

## S3 method for class 'summary.fit.synds'
print(x, ...)

Arguments

object
an object of class fit.synds created by fitting a model to a synthesised data set using the function glm.synds or lm.synds.
population.inference
a logical value indicating whether inference should be made to population quantities. If FALSE (default), inference is made to original data quantities.
msel
index or indices of the synthetic data copies for which summaries of fitted models are to be produced. If NULL (default), a summary of the combined estimates is produced.
partly
a logical value indicating whether data are partly synthesised.
...
additional parameters.
x
an object of class summary.fit.synds.

Value

An object of class summary.fit.synds which is a list with the following components:
call
the original call to glm.synds or lm.synds.
proper
a logical value indicating whether synthetic data were generated using proper synthesis.
population.inference
a logical value indicating whether inference to population coefficients or to coefficients of the actual (observed) data is made.
fitting.function
function used to fit the model.
m
the number of synthetic versions of the original (observed) data.
coefficients
a matrix of combined estimates. It contains point estimates of the coefficients (B.syn), their standard errors (se(B.syn)) and Z scores (Z.syn), for population or observed data quantities according to population.inference. For inference to original data quantities it additionally contains estimates of the actual standard errors based on synthetic data (se(Beta).syn) and standard errors of the Z scores (se(Z.syn)).
n
the number of cases in the original data.
k
the number of cases in the synthesised data.
analyses
a summary.glm or summary.lm object (matching the fitting function used), or a list of m such objects.
msel
index or indices of the synthetic data copies for which summaries of fitted models are produced. If NULL, a summary of the combined estimates is produced.

Details

The mean of the estimates from each of the m synthetic data sets yields asymptotically unbiased estimates of the coefficients if the observed data conform to the distribution used for synthesis. The standard errors are estimated differently depending on whether inference is made for the results that would be obtained from the observed data or for the parameters of the population from which the observed data are assumed to be sampled. The standard errors also differ according to whether the synthetic data were produced using simple or proper synthesis (for details see Raab et al. (submitted 2014)).
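
The point-estimate step described above can be illustrated with base R alone. The sketch below is not synthpop's implementation: it stands in for real synthesis with a crude assumption (each copy keeps x and redraws y from a linear model fitted to the observed data), and the names obs, fit_obs, coef_by_copy and b_syn are hypothetical. It only shows the averaging of coefficients over m copies, not the standard-error calculations, for which see Raab et al.

```r
# Illustrative only: base R, no synthpop internals.
set.seed(1)
n <- 200
obs <- data.frame(x = rnorm(n))
obs$y <- 1 + 2 * obs$x + rnorm(n)

# Model assumed both for synthesising y and for the analysis.
fit_obs <- lm(y ~ x, data = obs)
m <- 5

# Each stand-in "synthetic copy" keeps x and redraws y from the fitted model;
# the analysis model is then fitted to that copy.
coef_by_copy <- sapply(seq_len(m), function(i) {
  syn_i <- obs
  syn_i$y <- fitted(fit_obs) + rnorm(n, sd = sigma(fit_obs))
  coef(lm(y ~ x, data = syn_i))
})

# Combined point estimate: the mean over the m copies, coefficient by coefficient.
b_syn <- rowMeans(coef_by_copy)
b_syn
```

Here coef_by_copy has one column per copy and one row per coefficient, so rowMeans() gives one combined estimate per coefficient.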

References

Raab, G.M., Nowok, B. and Dibben, C. (submitted 2014). A simplified approach to synthetic data. http://arxiv.org/abs/1409.0217

See Also

summary, print

Examples

library(synthpop)  # provides SD2011, syn() and glm.synds()

ods <- SD2011[1:2000, c("sex", "age", "edu", "ls", "smoke")]
  
### simple synthesis
s1 <- syn(ods, m = 5)
f1 <- glm.synds(smoke ~ sex + age + edu + ls, data = s1, family = "binomial")
summary(f1)
summary(f1, population.inference = TRUE)
  
### proper synthesis
s2 <- syn(ods, m = 5, proper = TRUE)
f2 <- glm.synds(smoke ~ sex + age + edu + ls, data = s2, family = "binomial")
summary(f2)
summary(f2, population.inference = TRUE)
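
The msel argument and the components listed under Value are not exercised above. The following is a minimal sketch of how they might be used, assuming synthpop is installed and that the returned list carries the documented component names; the object names sm_all and sm_sel are hypothetical. It repeats the simple-synthesis setup so that it can be run on its own.

```r
library(synthpop)

ods <- SD2011[1:2000, c("sex", "age", "edu", "ls", "smoke")]
s1 <- syn(ods, m = 5)
f1 <- glm.synds(smoke ~ sex + age + edu + ls, data = s1, family = "binomial")

# Combined estimates over all m copies (msel = NULL, the default).
sm_all <- summary(f1)
sm_all$m             # number of synthetic copies
sm_all$coefficients  # matrix of combined estimates

# Separate summaries for the first two synthetic copies only.
sm_sel <- summary(f1, msel = 1:2)
print(sm_sel)
```

Because synthesis is random, the numbers printed will differ from run to run.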
