This function runs clusterlab, a simulator for Gaussian clusters. The default method positions cluster centers on the perimeter of a circle, generates Gaussian clusters around them, and then projects the 2D coordinates into a high-dimensional feature space. This method allows control over the spacing, variance, and size of the clusters. A simple random cluster simulator is also included; with it, the spacing of the clusters cannot be controlled precisely, but the other parameters can.
clusterlab(centers = 1, r = 8, sdvec = NULL, alphas = NULL,
  centralcluster = FALSE, numbervec = NULL, features = 500,
  seed = NULL, rings = NULL, ringalphas = NULL, ringthetas = NULL,
  outliers = NULL, outlierdist = NULL, mode = c("circle", "random"),
  pcafontsize = 18, showplots = TRUE)
centers (numerical value): the number of clusters to simulate (N)
r (numerical value): the radius, in units, of the circle on which the clusters are generated
sdvec (numerical vector): the standard deviation of each cluster; N values are required
alphas (numerical vector): how many units to push each cluster away from its initial placement; N values are required
centralcluster (logical flag): whether to place a cluster in the middle of the rest
numbervec (numerical vector): the number of samples in each cluster; N values are required
features (numerical value): the number of features for the data
seed (numerical value): fixes the random seed so that results can be repeated, for example seed = 123
rings (numerical value): the number of concentric rings to generate (the previous settings apply to all ring clusters)
ringalphas (numerical vector): how far to push each ring out; its length must equal the number of rings
ringthetas (numerical vector): the angle to rotate each ring by; its length must equal the number of rings
outliers (numerical value): the number of outliers to create
outlierdist (numerical value): the distance to move the outliers by
mode (character string): whether to use the standard method ("circle") or simple random placement ("random")
pcafontsize (numerical value): the font size of the PCA plot
showplots (logical flag): whether to display the plots; set to FALSE to suppress them
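Several arguments must have one value per cluster (N values) or one value per ring. The length rules can be checked before calling the simulator with a small helper like the one sketched below. The helper name check_cluster_args is hypothetical and is not part of clusterlab; it assumes that a NULL argument means "use the default" and therefore skips it.

```r
# Hypothetical helper (not part of clusterlab): reports length mismatches
# between per-cluster / per-ring vectors and the counts they must match.
check_cluster_args <- function(centers, sdvec = NULL, alphas = NULL,
                               numbervec = NULL, rings = NULL,
                               ringalphas = NULL, ringthetas = NULL) {
  problems <- character(0)

  # Vectors that need one value per cluster (N = centers).
  per_cluster <- list(sdvec = sdvec, alphas = alphas, numbervec = numbervec)
  for (nm in names(per_cluster)) {
    v <- per_cluster[[nm]]
    # Assumption: NULL means "not supplied", so it is not checked.
    if (!is.null(v) && length(v) != centers) {
      problems <- c(problems,
                    sprintf("%s needs %d values, got %d",
                            nm, as.integer(centers), length(v)))
    }
  }

  # Vectors that need one value per ring, checked only when rings is set.
  if (!is.null(rings)) {
    per_ring <- list(ringalphas = ringalphas, ringthetas = ringthetas)
    for (nm in names(per_ring)) {
      v <- per_ring[[nm]]
      if (!is.null(v) && length(v) != rings) {
        problems <- c(problems,
                      sprintf("%s needs %d values, got %d",
                              nm, as.integer(rings), length(v)))
      }
    }
  }

  problems  # empty character vector when every supplied length matches
}

# A numbervec that is one value short for four clusters.
bad <- check_cluster_args(centers = 4, sdvec = rep(2.5, 4),
                          alphas = rep(1, 4), numbervec = rep(50, 3))
print(bad)
```

An empty result means every supplied vector has the expected length; otherwise each string names the offending argument.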
A list, containing: 1) the synthetic data 2) cluster membership matrix
# NOT RUN {
synthetic <- clusterlab(centers=4,r=8,sdvec=c(2.5,2.5,2.5,2.5),
                        alphas=c(1,1,1,1),centralcluster=FALSE,
                        numbervec=c(50,50,50,50)) # for a four cluster solution
# }
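To make the default circle method from the description more concrete, the following base R sketch places centers on a circle, draws Gaussian points around them, and maps the 2D coordinates into a higher-dimensional space. It is an illustration only, not the clusterlab source: the function name simulate_circle_clusters is hypothetical, alphas is assumed to scale each center's distance from the origin (so 1 leaves it unchanged), and a random linear map is assumed as a stand-in for the projection step.

```r
# Illustrative sketch only; this is not how clusterlab is implemented.
simulate_circle_clusters <- function(centers = 4, r = 8,
                                     sdvec = rep(2.5, centers),
                                     alphas = rep(1, centers),
                                     numbervec = rep(50, centers),
                                     features = 500, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # fix the seed for repeatable output
  stopifnot(length(sdvec) == centers,
            length(alphas) == centers,
            length(numbervec) == centers)

  # Evenly spaced angles, one per cluster center, on a circle of radius r.
  theta <- 2 * pi * (seq_len(centers) - 1) / centers
  # Assumption: alpha multiplies the radius for its cluster.
  cx <- r * alphas * cos(theta)
  cy <- r * alphas * sin(theta)

  # Draw isotropic 2D Gaussian points around each center.
  xy <- do.call(rbind, lapply(seq_len(centers), function(k) {
    cbind(rnorm(numbervec[k], mean = cx[k], sd = sdvec[k]),
          rnorm(numbervec[k], mean = cy[k], sd = sdvec[k]))
  }))
  cluster <- rep(seq_len(centers), times = numbervec)

  # Assumption: a random 2 x features matrix stands in for the projection.
  proj <- matrix(rnorm(2 * features), nrow = 2, ncol = features)
  highdim <- xy %*% proj  # rows are samples, columns are features

  list(data = highdim, xy = xy, cluster = cluster)
}

sim <- simulate_circle_clusters(centers = 4, seed = 123)
dim(sim$data)       # 200 samples by 500 features
table(sim$cluster)  # 50 samples in each of the 4 clusters
```

Calling the sketch twice with the same seed returns identical data, which mirrors the purpose of the seed argument described above.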