
synthesizer (version 0.5.0)

synthesize: Create synthetic version of a dataset

Description

Create n values or records based on the empirical (multivariate) distribution of x. For data frames, the synthetic variables can be decorrelated from the original variables by lowering the value of the rankcor parameter.
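To make "based on the empirical distribution" concrete, here is a minimal base-R sketch of one common way to draw new values from the empirical distribution of a numeric vector: evaluate the sample quantile function at uniform random numbers. The helper name draw_empirical is hypothetical, and this illustrates the general technique only; it is not taken from the synthesizer source and may differ from what the package does internally.

```r
# Sketch only: inverse-transform sampling from an empirical distribution.
# draw_empirical() is a hypothetical helper, NOT part of the synthesizer package.
draw_empirical <- function(x, n = length(x)) {
  x <- x[!is.na(x)]   # drop missing values before estimating quantiles
  u <- runif(n)       # uniform draws on [0, 1]
  # type = 7 interpolates linearly between order statistics,
  # so every draw lies within the observed range of x
  quantile(x, probs = u, type = 7, names = FALSE)
}

set.seed(1)
y <- draw_empirical(cars$speed, n = 10)
length(y)
range(y)
```

Because the draws come from interpolated sample quantiles, they never fall outside the range of the observed data.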

Usage

synthesize(x, na.rm = FALSE, n = NROW(x), rankcor = 1)

Value

A data object of the same type and structure as x.

Arguments

x

[vector|data.frame] data to synthesize.

na.rm

[logical] Remove missing values before creating a synthesizer. Set to TRUE to avoid synthesizing missing values.

n

[integer] Number of values or records to synthesize.

rankcor

[numeric] in [0, 1]. Either a single rank correlation value that is applied to all variables, or a named vector of the form c(variable1=utility1,...). Variables not explicitly mentioned will have rankcor=1. See also the note below. Ignored unless x is an object of class data.frame.
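The arguments above can be combined as in the following sketch. It uses only the signature shown under Usage and assumes the synthesizer package is installed. The input vector is made up for illustration, and the Spearman check at the end is just one possible way to inspect how closely a synthetic column tracks its original; it is not a procedure prescribed by the package.

```r
library(synthesizer)

# na.rm = TRUE: missing values are removed before the synthesizer is built,
# so the result should contain no NA (illustrative input vector)
x  <- c(2.1, NA, 3.5, 4.0, NA, 5.2)
sx <- synthesize(x, na.rm = TRUE, n = 20)
length(sx)
anyNA(sx)

# Named rankcor: only Species is decorrelated; all other variables keep rankcor = 1
s <- synthesize(iris, rankcor = c(Species = 0.5))
nrow(s)

# One way to inspect the link between an original and a synthetic column
s_full <- synthesize(iris)
cor(iris$Sepal.Length, s_full$Sepal.Length, method = "spearman")
```

Passing n and rankcor by name avoids accidentally filling na.rm, which is the second positional argument.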

See Also

Other synthesis: make_synthesizer()

Examples

# n must be passed by name: the second positional argument is na.rm
synthesize(cars$speed, n = 10)
synthesize(cars)
synthesize(cars, n = 25)

s1 <- synthesize(iris, rankcor=1)
s2 <- synthesize(iris, rankcor=0.5)
s3 <- synthesize(iris, rankcor=c("Species"=0.5))

oldpar <- par(mfrow=c(2,2), pch=16, las=1)
plot(Sepal.Length ~ Sepal.Width, data=iris, col=iris$Species, main="Iris")
plot(Sepal.Length ~ Sepal.Width, data=s1, col=s1$Species, main="Synthetic Iris")
plot(Sepal.Length ~ Sepal.Width, data=s2, col=s2$Species, main="Low utility Iris")
plot(Sepal.Length ~ Sepal.Width, data=s3, col=s3$Species, main="Low utility Species")
par(oldpar)

