RSKC (version 2.4.2)

Clest: An implementation of Clest with robust sparse K-means. CER is used as a similarity measure.

Description

The function Clest performs Clest (Dudoit and Fridlyand, 2002) with CER as the measure of agreement between two partitions (in each test set). The following clustering algorithms can be used: K-means, trimmed K-means, sparse K-means and robust sparse K-means.
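CER (classification error rate) is the proportion of pairs of cases on which two partitions disagree about whether the pair belongs to the same cluster, i.e. one minus the Rand index. The following is a minimal base R sketch of that definition; `cer_sketch` is a hypothetical helper written for illustration, not the package's own implementation.

```r
# Hypothetical helper: classification error rate between two partitions.
# a, b: vectors of cluster labels for the same cases, in the same order.
cer_sketch <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  same_a <- outer(a, a, "==")   # TRUE where a pair shares a cluster in a
  same_b <- outer(b, b, "==")   # TRUE where a pair shares a cluster in b
  upper <- upper.tri(same_a)    # count each unordered pair once
  mean(same_a[upper] != same_b[upper])
}

# Label names do not matter, only the grouping does.
cer_identical <- cer_sketch(c(1, 1, 2, 2), c(2, 2, 1, 1))
cer_different <- cer_sketch(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

A CER of 0 means the two partitions group the cases identically; larger values mean more disagreement.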

Usage

Clest(d, maxK, alpha, B = 15, B0 = 5, nstart = 1000,
L1 = 6, beta = 0.1, pca = TRUE, silent=FALSE)

Arguments

d
A numerical data matrix (N by p) where N is the number of cases and p is the number of features. The cases are clustered.
maxK
The maximum number of clusters to consider.
alpha
See RSKC.
B
The number of times that the observed dataset d is randomly partitioned into a learning set and a test set. Note that each generated reference dataset is partitioned into a learning set and a test set only once, to reduce the computational cost.
B0
The number of times that the reference dataset is generated.
nstart
The number of random initial sets of cluster centers at Step(a) of robust sparse K-means clustering.
L1
See RSKC.
beta
0 <= beta
pca
Logical. If TRUE, reference datasets are generated from a PCA reference distribution. If FALSE, they are generated from a simple reference distribution.
silent
Logical. If TRUE, progress (the current iteration number) is not printed.
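The two reference distributions selected by pca can be illustrated as follows. This sketch assumes they follow the usual construction in the resampling literature: the simple reference draws each feature uniformly over its observed range, and the PCA reference does the same in principal-component coordinates before rotating back. Whether RSKC generates its reference data exactly this way is an assumption, and `ref_simple` and `ref_pca` are hypothetical names.

```r
# Hypothetical helpers, not part of RSKC.

# Simple reference: each feature drawn independently, uniform over its range.
ref_simple <- function(d) {
  apply(d, 2, function(x) runif(length(x), min(x), max(x)))
}

# PCA reference: uniform over a box aligned with the principal axes of d.
ref_pca <- function(d) {
  centers <- colMeans(d)
  dc <- sweep(d, 2, centers)           # center each column
  V <- svd(dc)$v                       # principal axes
  scores <- dc %*% V                   # data in principal-component coordinates
  z <- apply(scores, 2, function(x) runif(length(x), min(x), max(x)))
  sweep(z %*% t(V), 2, centers, "+")   # rotate back and restore column means
}

set.seed(1)
d <- matrix(rnorm(60 * 5), 60, 5)
r_simple <- ref_simple(d)
r_pca <- ref_pca(d)
```

Both helpers return a matrix with the same dimensions as d, so a reference dataset can be passed through the same learning/test procedure as the observed data.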

Value

References

Yumi Kondo (2011), Robustification of the sparse K-means clustering algorithm, MSc. Thesis, University of British Columbia http://hdl.handle.net/2429/37093

S. Dudoit and J. Fridlyand (2002), A prediction-based resampling method for estimating the number of clusters in a dataset, Genome Biology, 3(7).

Examples

## Not run: 
# # little simulation function 
# sim <-
# function(mu,f){
#    D<-matrix(rnorm(60*f),60,f)
#    D[1:20,1:50]<-D[1:20,1:50]+mu
#    D[21:40,1:50]<-D[21:40,1:50]-mu  
#    return(D)
#    }
#  
#  set.seed(1)
#  d<-sim(1.5,100); # non contaminated dataset with noise variables
#  
# # Clest with robust sparse K-means
# rsk<-Clest(d,5,alpha=1/20,B=3,B0=10, beta = 0.05, nstart=100,pca=TRUE,L1=3,silent=TRUE);
# # Clest with K-means
# k<-Clest(d,5,alpha=0,B=3,B0=10, beta = 0.05, nstart=100,pca=TRUE,L1=NULL,silent=TRUE);
# ## End(Not run)
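To make the resampling step concrete, one learning/test split for a fixed number of clusters might look like the sketch below. It uses base R `kmeans` as a stand-in for robust sparse K-means and a nearest-center rule as the classifier; both choices, the 40/20 split, and the toy data are assumptions made for illustration and do not reproduce the internals of Clest.

```r
set.seed(1)
# Toy data: three groups of 20 cases that differ in the first 10 of 20 features.
d <- matrix(rnorm(60 * 20), 60, 20)
d[1:20, 1:10]  <- d[1:20, 1:10] + 3
d[21:40, 1:10] <- d[21:40, 1:10] - 3

K <- 3
learn_idx <- sample(nrow(d), size = 40)   # split ratio is an arbitrary choice
learn <- d[learn_idx, , drop = FALSE]
test  <- d[-learn_idx, , drop = FALSE]

# 1. Cluster the learning set.
fit_learn <- kmeans(learn, centers = K, nstart = 20)

# 2. Predict test labels: assign each test case to the nearest learning-set center.
dist_to_centers <- sapply(seq_len(K), function(k) {
  rowSums(sweep(test, 2, fit_learn$centers[k, ])^2)
})
predicted <- max.col(-dist_to_centers)

# 3. Cluster the test set on its own.
fit_test <- kmeans(test, centers = K, nstart = 20)

# 4. Compare the two partitions of the test set with CER
#    (proportion of pairs on which the partitions disagree).
same_pred <- outer(predicted, predicted, "==")
same_test <- outer(fit_test$cluster, fit_test$cluster, "==")
upper <- upper.tri(same_pred)
cer <- mean(same_pred[upper] != same_test[upper])
```

In the procedure described by Dudoit and Fridlyand, this comparison is repeated over B random splits for each candidate number of clusters, and the resulting agreement is judged against the values obtained from the B0 reference datasets.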
