SynthTools

The goal of SynthTools is to make it easier to measure the utility of partially synthetic and multiply imputed data sets. SynthTools includes functions that check whether original and derived data sets have comparable attributes, compute overall and variable-specific perturbation rates, and compute standard errors and confidence intervals for continuous and categorical variables.

Installation

You can install the released version of SynthTools from CRAN with:

install.packages("SynthTools")

Example

This is a basic example which shows you how to check the comparability of an observed data set and a data set derived from it. PPA is the observed data set and PPAps1 is the partially synthetic data set derived from PPA:

library(SynthTools)
dataComp(PPA, PPAps1)
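
For readers who do not have the package at hand, the kind of comparison this example performs can be approximated in base R. The sketch below is not the dataComp implementation. The helper name compare_structure and the toy data frames are hypothetical, and it assumes that "comparable" means the same dimensions, column names, column classes, and factor levels.

```r
# Hypothetical helper (not part of SynthTools): compare the basic
# structure of an observed data frame and a data frame derived from it.
compare_structure <- function(observed, derived) {
  # One class string per column, e.g. "numeric" or "factor"
  col_classes <- function(df) {
    vapply(df, function(x) paste(class(x), collapse = "/"), character(1))
  }
  # Factor levels are compared only for columns present in both data frames;
  # levels() returns NULL for non-factor columns, so those compare as equal.
  shared <- intersect(names(observed), names(derived))
  same_levels <- all(vapply(shared, function(v) {
    identical(levels(observed[[v]]), levels(derived[[v]]))
  }, logical(1)))

  list(
    same_dim     = identical(dim(observed), dim(derived)),
    same_names   = identical(names(observed), names(derived)),
    same_classes = identical(col_classes(observed), col_classes(derived)),
    same_levels  = same_levels
  )
}

# Toy data (invented for illustration; not the PPA data shipped with the package)
observed <- data.frame(
  age = c(34, 51, 27),
  sex = factor(c("F", "M", "F"), levels = c("F", "M"))
)
derived <- data.frame(
  age = c(36, 51, 25),
  sex = factor(c("F", "F", "M"), levels = c("F", "M"))
)

checks <- compare_structure(observed, derived)
print(checks)
```

Each element of the returned list is a single TRUE or FALSE, so all(unlist(checks)) gives a one-line pass/fail summary under these assumptions.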

Monthly Downloads

155

Version

1.0.1

License

GPL (>= 2)

Maintainer

Charlotte Looby

Last Published

March 11th, 2020

Functions in SynthTools (1.0.1)

oneCatCI

Confidence intervals and standard errors for one synthetic categorical variable derived from multiply imputed data sets.

twoCatCI

Confidence intervals and standard errors for the cross-tabulation of two categorical variables derived from multiply imputed data sets.

pertRates

Calculates perturbation rates for the overall data set and for specific variables.

PPAps3

Characteristics of 1000 People in Pennsylvania, partially synthetic (set 3).

ContCI

Multiple-imputation confidence intervals and standard errors for a specific imputed continuous variable.

PPA

Characteristics of 1000 People in Pennsylvania.

PPAps2

Characteristics of 1000 People in Pennsylvania, partially synthetic (set 2).

PPAm5

A list containing 5 partially synthetic data sets.

PPAps4

Characteristics of 1000 People in Pennsylvania, partially synthetic (set 4).

PPAps1

Characteristics of 1000 People in Pennsylvania, partially synthetic (set 1).

PPAps5

Characteristics of 1000 People in Pennsylvania, partially synthetic (set 5).

dataComp

Checks for equality in the features of two data sets.

logicCheck

Checks for logical consistency between two categorical variables in a synthesized data set.
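
To make the idea of a perturbation rate concrete, here is a small base R sketch. It assumes the rate is the share of values that differ between the observed and the synthetic data set, computed per variable and over all cells, with no missing values. The definition used by pertRates may differ, and the function name perturbation_rates and the toy data are hypothetical.

```r
# Hypothetical helper (not part of SynthTools): share of values that differ
# between an observed data frame and a synthetic version of it.
perturbation_rates <- function(observed, synthetic) {
  stopifnot(
    identical(dim(observed), dim(synthetic)),
    identical(names(observed), names(synthetic))
  )
  # Compare values as character strings so that factors and numbers are
  # handled the same way (assumes no missing values; note that very close
  # floating-point numbers can print, and therefore compare, as equal).
  per_variable <- vapply(names(observed), function(v) {
    mean(as.character(observed[[v]]) != as.character(synthetic[[v]]))
  }, numeric(1))

  list(
    per_variable = per_variable,
    # All columns have the same number of rows, so the overall cell-level
    # rate equals the mean of the per-variable rates.
    overall = mean(per_variable)
  )
}

# Toy data (invented for illustration)
observed <- data.frame(
  age = c(34, 51, 27, 45),
  sex = factor(c("F", "M", "F", "M"))
)
synthetic <- data.frame(
  age = c(34, 49, 27, 45),
  sex = factor(c("F", "M", "M", "F"))
)

rates <- perturbation_rates(observed, synthetic)
# age: 1 of 4 values differ; sex: 2 of 4 values differ; overall: 3 of 8 cells
print(rates)
```

Under these assumptions a rate of 0 means the synthetic column is identical to the observed one, and a rate of 1 means every value was changed.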