
Clest
Performs Clest (Dudoit and Fridlyand, 2002) with the classification error rate (CER) as the measure of agreement between two partitions (in each testing set).
The following clustering algorithms can be used: K-means, trimmed K-means, sparse K-means, and robust sparse K-means.
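As a rough illustration of the agreement measure, CER is commonly defined as the proportion of observation pairs on which two partitions disagree (together in one, apart in the other). The base-R sketch below follows that definition; pair_cer is a hypothetical helper written for this page, not a function documented here.

```r
# Hypothetical helper (an illustrative assumption, not the package's code):
# proportion of observation pairs on which two partitions disagree.
pair_cer <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  same_a <- outer(a, a, "==")   # TRUE where a pair shares a cluster in a
  same_b <- outer(b, b, "==")   # TRUE where a pair shares a cluster in b
  upper <- upper.tri(same_a)    # count each unordered pair once
  mean(same_a[upper] != same_b[upper])
}

p1 <- c(1, 1, 2, 2, 3, 3)
p2 <- c(2, 2, 1, 1, 3, 3)  # same grouping as p1, different label names
p3 <- c(1, 1, 1, 2, 2, 2)
pair_cer(p1, p2)  # 0: relabelling the clusters does not change the result
pair_cer(p1, p3)  # 1/3: 5 of the 15 pairs disagree
```

A value of 0 means the two partitions are identical up to relabelling; larger values mean weaker agreement.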
Clest(d, maxK, alpha, B = 15, B0 = 5, nstart = 1000,
L1 = 6, beta = 0.1, pca = TRUE, silent=FALSE)
d: a data matrix (N by p), where N is the number of cases and p is the number of features. The cases are clustered.
See RSKC.
B: the number of times the observed dataset d is randomly partitioned into a learning set and a testing set.
Note that each generated reference dataset is partitioned into a learning set and a testing set only once, to reduce the computational cost.
See RSKC.
beta
pca: if TRUE, then the reference datasets are generated from a PCA reference distribution.
If FALSE, then the reference datasets are generated from a simple reference distribution.
silent: if TRUE, then the iteration progress is not printed.
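To make the two reference distributions selected by pca more concrete, the sketch below shows one common construction, borrowed from the gap-statistic literature: uniform draws over the range of each feature (simple), and uniform draws over the ranges of the principal-component scores, rotated back to the original features (PCA). Whether Clest generates its reference datasets in exactly this way is an assumption; ref_simple and ref_pca are hypothetical helper names.

```r
# Hypothetical helpers; names and details are illustrative assumptions.

# "Simple" reference: each feature is drawn uniformly over its observed range.
ref_simple <- function(d) {
  apply(d, 2, function(x) runif(length(x), min(x), max(x)))
}

# PCA reference: draw uniformly over the ranges of the principal-component
# scores, then rotate back to the original feature space.
ref_pca <- function(d) {
  centers <- colMeans(d)
  dc <- sweep(d, 2, centers)           # column-centre the data
  v <- svd(dc)$v                       # right singular vectors (loadings)
  scores <- dc %*% v                   # data in principal-component coordinates
  z <- apply(scores, 2, function(x) runif(length(x), min(x), max(x)))
  sweep(z %*% t(v), 2, centers, "+")   # back-transform and restore the means
}

set.seed(1)
d <- matrix(rnorm(60 * 100), 60, 100)
dim(ref_simple(d))  # 60 100
dim(ref_pca(d))     # 60 100
```

Both helpers return a matrix with the same dimensions as d, so either could stand in for the observed data when a null (single-cluster) comparison is needed.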
S. Dudoit and J. Fridlyand. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biology, 3(7), 2002.
## Not run:
# # Small simulation function: 60 cases and f features; the first 50 features
# # separate three groups of 20 cases (shifted by +mu, by -mu, and not shifted)
# sim <- function(mu, f) {
#   D <- matrix(rnorm(60 * f), 60, f)
#   D[1:20, 1:50] <- D[1:20, 1:50] + mu
#   D[21:40, 1:50] <- D[21:40, 1:50] - mu
#   return(D)
# }
#
# set.seed(1)
# d <- sim(1.5, 100)  # uncontaminated dataset with noise variables
#
# # Clest with robust sparse K-means
# rsk <- Clest(d, 5, alpha = 1/20, B = 3, B0 = 10, beta = 0.05, nstart = 100, pca = TRUE, L1 = 3, silent = TRUE)
# # Clest with K-means
# k <- Clest(d, 5, alpha = 0, B = 3, B0 = 10, beta = 0.05, nstart = 100, pca = TRUE, L1 = NULL, silent = TRUE)
## End(Not run)
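The random partition of d into a learning set and a testing set, described in the arguments, could be sketched in base R as follows. The helper name split_learning_test and the 2/3 learning fraction are assumptions made for illustration; the page does not state the split proportion that Clest uses.

```r
# Hypothetical helper; the default 2/3 learning fraction is an assumption.
split_learning_test <- function(d, frac = 2 / 3) {
  n <- nrow(d)
  learn_idx <- sample(n, size = round(frac * n))  # random cases for the learning set
  list(learning = d[learn_idx, , drop = FALSE],
       testing  = d[-learn_idx, , drop = FALSE])   # remaining cases
}

set.seed(1)
d <- matrix(rnorm(60 * 100), 60, 100)
parts <- split_learning_test(d)
nrow(parts$learning)  # 40
nrow(parts$testing)   # 20
```

Repeating such a split several times and comparing partitions on each testing set is the kind of resampling that the B argument controls.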