synthpop (version 1.5-1)

syn.normrank: Synthesis by normal linear regression preserving the marginal distribution

Description

Generates univariate synthetic data by linear regression while preserving the marginal distribution of the original variable. The regression is fitted to Normal deviates of the ranks of the original values. The synthesised Normal deviates are transformed back to ranks, and these synthesised ranks are used to assign synthetic values from the original values.

Usage

syn.normrank(y, x, xp, smoothing, proper = FALSE, ...)

Arguments

y

an original data vector of length n.

x

a matrix (n x p) of original covariates.

xp

a matrix (k x p) of synthesised covariates.

smoothing

smoothing method. See details.

proper

a logical value specifying whether proper synthesis should be conducted. See details.

...

additional parameters.

Value

A vector of length k with synthetic values of y.

Details

First, synthetic Normal deviates of the ranks of the values in y are generated using the spread around the linear regression line fitted to the Normal deviates of the ranks given x. The synthetic Normal deviates are then transformed back into synthetic ranks, which are used to assign values from y. For proper synthesis, the regression coefficients are first drawn from a Normal distribution with mean and variance taken from the fitted model. Gaussian kernel smoothing can be applied by setting the smoothing parameter to "density". This is recommended as a way to reduce disclosure risk.
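
The steps described above can be illustrated with a minimal base R sketch. It is not the synthpop source code: the function name normrank_sketch, the rank scaling by n + 1, the clamping of ranks to 1..n, and the simulated data are all assumptions made for illustration. The proper branch draws only the coefficients, as described above, and smoothing is omitted.

```r
# Illustrative sketch only; normrank_sketch is a hypothetical helper,
# not the function shipped in synthpop.
normrank_sketch <- function(y, x, xp, proper = FALSE) {
  n <- length(y)

  # Normal deviates of the ranks of y. Dividing by n + 1 (an assumption)
  # keeps the probabilities strictly inside (0, 1).
  z <- qnorm(rank(y) / (n + 1))

  # Least-squares fit of z on x, with an intercept column added.
  X  <- cbind(1, as.matrix(x))
  Xp <- cbind(1, as.matrix(xp))
  fit   <- lm.fit(X, z)
  beta  <- fit$coefficients
  sigma <- sqrt(sum(fit$residuals^2) / (n - ncol(X)))

  if (proper) {
    # Simplified proper synthesis: draw the coefficients from a Normal
    # distribution centred on the estimates with their estimated covariance.
    V    <- sigma^2 * chol2inv(chol(crossprod(X)))
    beta <- beta + drop(t(chol(V)) %*% rnorm(ncol(X)))
  }

  # Synthetic Normal deviates: regression line at xp plus Normal noise.
  z_syn <- drop(Xp %*% beta) + rnorm(nrow(Xp), sd = sigma)

  # Transform back to ranks, clamp to 1..n, and look up the original values.
  r_syn <- round(pnorm(z_syn) * (n + 1))
  r_syn <- pmin(pmax(r_syn, 1), n)
  sort(y)[r_syn]
}

# Simulated data with a skewed outcome (purely illustrative).
set.seed(1)
x  <- matrix(rnorm(200), ncol = 1)
y  <- exp(1 + 0.8 * x[, 1] + rnorm(200, sd = 0.5))
xp <- matrix(rnorm(150), ncol = 1)

y_syn        <- normrank_sketch(y, x, xp)
y_syn_proper <- normrank_sketch(y, x, xp, proper = TRUE)
```

Because every synthetic value is looked up in sort(y), the synthetic vector contains only values observed in y. This is how the marginal distribution is preserved, and it also suggests why smoothing may be useful for reducing disclosure risk.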

See Also

syn, syn.norm, syn.lognorm