synthpop (version 1.3-1)

tab.utility: [EXPERIMENTAL] Tabular utility

Description

Produces tables from observed and synthesised data and calculates chi-square statistics to compare them.

Usage

tab.utility(object, data, vars = NULL, ngroups = 5, with.missings = TRUE, ...)

# S3 method for tab.utility
print(x, tables = FALSE, digits = 2, ...)

Arguments

object

an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn() and it includes object$m synthesised data set(s).

data

the original (observed) data set.

vars

a single string or a vector of strings with the names of variables to be used to form the table.

ngroups

if numerical (non-factor) variables are included, they will be classified into this number of groups to form tables. Repeated identical observations may produce a smaller number of groups in some cases.

with.missings

a logical value indicating whether the table is to include rows and columns for missing data categories.

...

additional parameters.

x

an object of class tab.utility.

tables

a logical value that determines if tables of observed, synthesised and Z differences are to be printed.

digits

an integer indicating the number of decimal places for printing tab.Zdiff.
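
As an illustration of the ngroups and with.missings arguments described above, the following is a minimal base R sketch. It assumes quantile-based grouping, which this page does not confirm as the method synthpop uses internally. The helper group_numeric() and the toy vectors are hypothetical and are not part of the package. The sketch shows how tied values can yield fewer groups than requested and how missing-data categories change the dimensions of a table.

```r
# Hypothetical helper, NOT part of synthpop: bin a numeric vector into at most
# `ngroups` groups. Quantile-based breaks are an assumption for illustration.
group_numeric <- function(x, ngroups = 5) {
  probs <- seq(0, 1, length.out = ngroups + 1)
  # unique() drops duplicated break points caused by repeated identical values
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE, names = FALSE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

# Toy data (made up for this sketch), each variable with one missing value
age <- c(21, 34, 34, 45, 52, 60, 67, 73, NA, 29)
sex <- factor(c("M", "F", "F", "M", "F", "M", "F", "M", "F", NA))

age_grp <- group_numeric(age, ngroups = 3)

# Heavily tied data: several quantiles coincide, so fewer groups are formed
tied <- c(rep(0, 8), 1, 2)
tied_grp <- group_numeric(tied, ngroups = 5)
nlevels(tied_grp)  # fewer than the 5 groups requested

# Tables with and without rows and columns for missing-data categories
tab_with_na    <- table(age_grp, sex, useNA = "ifany")
tab_without_na <- table(age_grp, sex, useNA = "no")
dim(tab_with_na)
dim(tab_without_na)
```

In this sketch the table that keeps missing-data categories has one extra row and one extra column, because both variables contain an NA.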

Value

An object of class tab.utility which is a list with the following components:

Chisq

a vector of length object$m with the values of the chi-square statistic.

df

a vector with corresponding degrees of freedom.

ratio

a vector with ratios of Chisq to df.

nempty

a vector of length object$m with number of empty cells not contributing to the chi-square statistic.

pval

a vector of length object$m with p-values for the chi-square test.

tab.obs

a table from the observed data.

tab.syn

a table or a list of object$m tables from the synthetic data.

tab.Zdiff

a table or a list of object$m tables of Z statistics for differences between observed and synthesised cells of the tables. Large absolute values indicate a large contribution to lack-of-fit.

Details

Forms tables of observed and synthesised values for the variables specified in vars. A chi-square statistic is calculated from the cells of the tables as the sum of (observed - synthesised)^2 / [(observed + synthesised)/2], ignoring cells where observed and synthesised are both zero.
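
The statistic described above can be sketched in base R as follows. This is an illustrative reimplementation of the stated formula, not the package's own code. The helper chisq_utility() and the toy tables are hypothetical, and the sketch assumes both tables have identical dimensions and category ordering.

```r
# Hypothetical helper, NOT part of synthpop: chi-square statistic from the
# formula in Details, skipping cells that are empty in both tables.
chisq_utility <- function(tab_obs, tab_syn) {
  stopifnot(identical(dim(tab_obs), dim(tab_syn)))
  obs <- as.vector(tab_obs)
  syn <- as.vector(tab_syn)
  keep <- !(obs == 0 & syn == 0)  # cells with observed = synthesised = 0 are ignored
  stat <- sum((obs[keep] - syn[keep])^2 / ((obs[keep] + syn[keep]) / 2))
  list(Chisq = stat, nempty = sum(!keep))
}

# Toy 2 x 2 tables of counts (made up for this sketch)
tab_obs <- matrix(c(10, 20, 0, 30), nrow = 2)
tab_syn <- matrix(c(12, 18, 0, 30), nrow = 2)

res <- chisq_utility(tab_obs, tab_syn)
res$Chisq   # 4/11 + 4/19 from the two cells that differ
res$nempty  # one cell is zero in both tables
```

Cells where the observed and synthesised counts agree contribute zero, so only the mismatched cells drive the statistic.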

See Also

utility.synds

Examples

# NOT RUN {
  ods <- SD2011[1:1000, c("sex", "age", "edu", "marital")]
  s1 <- syn(ods, m = 3)
  t1 <- tab.utility(s1, ods, vars = c("marital", "sex"))
  print(t1, tables = TRUE)
# }