Note: a newer version (1.9-1.1) of this package is available.

synthpop (version 1.2-1)

Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

Description

A tool for producing synthetic versions of microdata containing confidential information, so that they are safe to release to users for exploratory analysis. The key objective of generating synthetic data is to replace sensitive original values with synthetic ones while causing minimal distortion of the statistical information contained in the data set. Variables, which can be categorical or continuous, are synthesised one by one using sequential modelling. Replacements are generated by drawing from conditional distributions fitted to the original data, using either parametric models or classification and regression tree (CART) models. Data are synthesised with the function syn(), which can run largely automatically with default settings or with methods defined by the user. Optional parameters can be used to influence the disclosure risk and the analytical quality of the synthesised data.
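As an illustration of the workflow described above, here is a minimal sketch of a call to syn(). It assumes the package is installed and uses the bundled SD2011 data set listed under Functions below. The choice of four variables, the seed value, and the explicit method = "cart" are assumptions made for this example rather than recommended settings.

```r
# Minimal sketch: synthesise a small subset of the SD2011 survey data.
library(synthpop)

# Keep a handful of variables so the example runs quickly
# (this subset is an arbitrary choice for illustration).
vars <- c("sex", "age", "edu", "marital")
ods  <- SD2011[, vars]

# Variables are synthesised one by one, each conditional on the
# variables synthesised before it. m = 1 requests a single synthetic
# copy; seed makes the run reproducible.
sds <- syn(ods, method = "cart", m = 1, seed = 2016)

# The synthetic data frame is stored in the $syn component.
head(sds$syn)

# Compare univariate distributions of synthesised and observed data.
compare(sds, ods)
```

The returned object has class "synds"; summary(sds) gives an overview of the synthetic data, and write.syn() exports it to an external file.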

Install

install.packages('synthpop')

Monthly Downloads

1,404

Version

1.2-1

License

GPL-2 | GPL-3

Maintainer

Beata Nowok

Last Published

March 15th, 2016

Functions in synthpop (1.2-1)

syn.bag
Synthesis with bagging

read.obs
Importing original data sets from external files

sdc
Tools for statistical disclosure control (sdc)

syn.lognorm, syn.sqrtnorm, syn.cubertnorm
Synthesis by linear regression after transformation of a dependent variable

glm.synds, lm.synds
Fitting (generalized) linear models to synthetic data

summary.synds
Synthetic data object summaries

syn.rf
Synthesis with random forest

syn.normrank
Synthesis by normal linear regression preserving the marginal distribution

syn
Generating synthetic data sets

syn.polyreg
Synthesis by unordered polytomous regression

synthpop-package
Generating synthetic versions of sensitive microdata for statistical disclosure control

syn.norm
Synthesis by linear regression

syn.polr
Synthesis by ordered polytomous regression

syn.sample
Synthesis by simple random sampling

syn.pmm
Synthesis by predictive mean matching

write.syn
Exporting synthetic data sets to external files

utility.synds
Distributional comparison of synthesised and observed data

compare.fit.synds
Compare model estimates based on synthesised and observed data

summary.fit.synds
Inference from synthetic data

replicated.uniques
Replications in synthetic data

syn.logreg
Synthesis by logistic regression

syn.passive
Passive synthesis

syn.ctree, syn.cart
Synthesis with classification and regression trees (CART)

compare
Comparison of synthesised and observed data

syn.survctree
Synthesis of survival time by classification and regression trees (CART)

SD2011
Social Diagnosis 2011 - Objective and Subjective Quality of Life in Poland

compare.synds
Compare univariate distributions of synthesised and observed data