generate.sample4: Sample4 generator of synthetic data
Description
Generator of synthetic, multivariate normally distributed data.
Data sets with 5 clusters (classes) are randomly generated.
For each class, n 6000-dimensional examples are generated.
Each class (n examples) has 1000 non-noisy and 5000 noisy features. The distributions underlying classes 1 and 2,
and classes 1 and 3, overlap substantially, while classes 4 and 5 are separated.
The first class (first n examples) has its non-noisy variables centered at 0.
The second class (second n examples) has its non-noisy variables centered at 1.
The third class (third n examples) has its non-noisy variables centered at -1.
The fourth class (fourth n examples) has its non-noisy variables centered at 5.
The fifth class (fifth n examples) has its non-noisy variables centered at -5.
For all classes, the diagonal elements of the covariance matrix are equal to sigma for the first 1000 variables and to
2*sigma for the last 5000 variables.
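The generation scheme described above can be sketched in a few lines of R. This is only an illustrative re-implementation, not the package's source: the function name sketch.sample4 and the parameters n.signal and n.noise are hypothetical. The sketch also assumes that the variables are independent, that sigma is used as a standard deviation (as stated under Arguments), and that the noisy variables are centered at 0, which the description does not specify.

```r
# Hypothetical sketch of the scheme described above; not the package's code.
sketch.sample4 <- function(n = 2, sigma = 1,
                           n.signal = 1000, n.noise = 5000) {
  # Centers of the non-noisy variables for classes 1 to 5
  centers <- c(0, 1, -1, 5, -5)
  blocks <- lapply(centers, function(mu) {
    # Non-noisy variables: mean mu, standard deviation sigma
    signal <- matrix(rnorm(n.signal * n, mean = mu, sd = sigma),
                     nrow = n.signal)
    # Noisy variables: ASSUMED mean 0, standard deviation 2 * sigma
    noise <- matrix(rnorm(n.noise * n, mean = 0, sd = 2 * sigma),
                    nrow = n.noise)
    # Variables in rows, the n examples of this class in columns
    rbind(signal, noise)
  })
  # Classes are placed side by side, in order 1 to 5
  do.call(cbind, blocks)
}

set.seed(1)
M <- sketch.sample4(n = 3, sigma = 1)
dim(M)  # 6000 rows (variables), 15 columns (examples)
```

Under these assumptions, classes 1, 2 and 3 overlap because their centers differ by only one or two standard deviations, whereas the centers of classes 4 and 5 lie several standard deviations away from all the others.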
Usage
generate.sample4(n = 2, sigma = 1)
Value
a real-valued data matrix with 6000 rows (variables) and n*5 columns (examples)
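Because the columns are ordered by class (the first n columns belong to class 1, the next n to class 2, and so on), a vector of class labels can be rebuilt from n alone. The snippet below is a minimal sketch of that idea; the matrix M is a hypothetical stand-in with only a few rows, since only the column order matters here.

```r
n <- 2
# Hypothetical placeholder for the generated matrix (not real output):
# variables in rows, n * 5 examples in columns.
M <- matrix(0, nrow = 4, ncol = 5 * n)

# Class label of each column: n ones, then n twos, ..., then n fives
labels <- rep(1:5, each = n)
stopifnot(length(labels) == ncol(M))

# Select the examples (columns) of the fourth class
class4 <- M[, labels == 4, drop = FALSE]
```

Keeping drop = FALSE preserves the matrix shape even when n = 1.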
Arguments
n
number of examples for each class
sigma
standard deviation of the first 1000 variables; the remaining 5000 variables have standard deviation 2*sigma