syn.normrank: Synthesis by normal linear regression preserving
the marginal distribution
Description
Generates univariate synthetic data using linear regression while
preserving the marginal distribution of the original variable. The
regression is fitted to Normal deviates of the ranks of the original
values. Synthetic values are then assigned from the original values
according to synthesised ranks, which are obtained by transforming the
synthesised Normal deviates back to the rank scale.
Usage
syn.normrank(y, x, xp, smoothing, proper = FALSE, ...)
Arguments
y
an original data vector of length n.
x
a matrix (n x p) of original covariates.
xp
a matrix (k x p) of synthesised covariates.
smoothing
smoothing method. See details.
proper
a logical value specifying whether proper synthesis
should be conducted. See details.
...
additional parameters.
Value
A vector of length k with synthetic values of y.
Details
First, synthetic Normal deviates of the ranks of the values in y are
generated using the spread around the fitted linear regression line of
the Normal deviates of ranks given x.
The synthetic Normal deviates are then transformed back to synthetic
ranks, which are used to assign values from y.
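The two steps above can be illustrated with a minimal base R sketch. It is not the package source: the function name normrank_sketch, the rank scaling by n + 1, and the clamping of ranks to 1..n are assumptions made for illustration.

```r
# Illustrative sketch only; normrank_sketch is a hypothetical name,
# not a synthpop function.
normrank_sketch <- function(y, x, xp) {
  n <- length(y)
  # Normal deviates of the ranks of y (assumed scaling: rank / (n + 1))
  z <- qnorm(rank(y) / (n + 1))
  # Linear regression of the deviates on the covariates, with intercept
  X  <- cbind(1, as.matrix(x))
  Xp <- cbind(1, as.matrix(xp))
  fit   <- lm.fit(X, z)
  sigma <- sqrt(sum(fit$residuals^2) / (n - ncol(X)))
  # Synthetic deviates: fitted line plus Normal noise around it
  z_syn <- drop(Xp %*% fit$coefficients) + rnorm(nrow(Xp), sd = sigma)
  # Back-transform to ranks and keep them within 1..n
  r_syn <- round(pnorm(z_syn) * (n + 1))
  r_syn <- pmin(pmax(r_syn, 1), n)
  # Assign original values by synthetic rank
  sort(y)[r_syn]
}

set.seed(1)
x  <- matrix(rnorm(200), ncol = 2)   # n = 100 original records
y  <- exp(x[, 1] + rnorm(100))       # skewed original variable
xp <- matrix(rnorm(100), ncol = 2)   # k = 50 synthesised covariate rows
y_syn <- normrank_sketch(y, x, xp)
```

Because every synthetic value is taken from sort(y), the result can only contain values observed in y, which is how the marginal distribution is preserved.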
For proper synthesis, the regression coefficients are first drawn from
a normal distribution with mean and variance taken from the fitted
model.
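As a rough illustration of such a draw, the sketch below samples coefficients from a multivariate normal distribution centred on the least-squares estimates. The helper draw_beta is hypothetical, and the package's own routine may differ in detail (for example in how the residual variance is handled).

```r
# Hypothetical helper, not a synthpop function.
draw_beta <- function(z, X) {
  fit   <- lm.fit(X, z)
  sigma <- sqrt(sum(fit$residuals^2) / (nrow(X) - ncol(X)))
  # Estimated covariance matrix of the coefficient estimates
  V <- sigma^2 * solve(crossprod(X))
  # Multivariate normal draw using the Cholesky factor of V
  drop(fit$coefficients + t(chol(V)) %*% rnorm(ncol(X)))
}

set.seed(2)
X <- cbind(1, rnorm(100))                 # intercept plus one covariate
y <- rexp(100)
z <- qnorm(rank(y) / (length(y) + 1))     # Normal deviates of ranks
beta_star <- draw_beta(z, X)
```

Each call returns a different coefficient vector, so repeated syntheses reflect uncertainty in the fitted line rather than reusing one fixed set of estimates.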
Gaussian kernel smoothing can be applied by setting the smoothing
parameter to "density". It is recommended as a tool to decrease the
disclosure risk.
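The idea behind this smoothing can be sketched in base R as follows. This is an assumption-laden illustration, not the package's rule: it perturbs each synthetic value with Gaussian noise whose standard deviation is the default kernel bandwidth from bw.nrd0(), and the data are simulated.

```r
set.seed(3)
# Simulated stand-in for unsmoothed synthetic values resampled from originals
y_orig <- rlnorm(50, meanlog = 7, sdlog = 0.5)
y_syn  <- sample(y_orig, 200, replace = TRUE)

# Gaussian kernel bandwidth (assumed choice: the default rule of thumb)
bw <- bw.nrd0(y_syn)

# Smoothed values: each synthetic value jittered by Normal(0, bw) noise
y_smooth <- rnorm(length(y_syn), mean = y_syn, sd = bw)
```

After smoothing, the synthetic values no longer coincide exactly with original values, which is why this step lowers the disclosure risk.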