Standardises cluster validity statistics as produced by
`clustatsum`

relative to results that were achieved by
random clusterings on the same data by
`randomclustersim`

. The aim is to make differences between
values comparable between indexes, see Hennig (2019), Akhanli and
Hennig (2020).

This is mainly for use within `clusterbenchstats`

.

```
cgrestandard(clusum,clusim,G,percentage=FALSE,
useallmethods=FALSE,
useallg=FALSE, othernc=list())
```

clusum

object of class "valstat", see `clusterbenchstats`

.

clusim

list; output object of `randomclustersim`

,
see there.

G

vector of integers. Numbers of clusters to consider.

percentage

logical. If `FALSE`

, standardisation is done to
mean zero and standard deviation 1 using the random clusterings. If
`TRUE`

, the output is the percentage of simulated values below
the result (more precisely, this number plus one divided by the
total plus one).

useallmethods

logical. If `FALSE`

, only random clustering
results from `clusim`

are used for standardisation. If
`TRUE`

, also clustering results from other methods as given in
`clusum`

are used.

useallg

logical. If `TRUE`

, standardisation uses results
from all numbers of clusters in `G`

. If `FALSE`

,
standardisation of results for a specific number of cluster only
uses results from that number of clusters.

othernc

list of integer vectors of length 2. This allows the
incorporation of methods that bring forth other numbers of clusters
than those in `G`

, for example because a method may have
automatically estimated a number of clusters. The first number is
the number of the clustering method (the order is determined by
argument `clustermethod`

in
`clusterbenchstats`

), the second number is the
number of clusters. Results specified here are only standardised in
`useallg=TRUE`

.

List of class `"valstat"`

, see
`valstat.object`

, with standardised results as
explained above.

`cgrestandard`

will add a statistic named `dmode`

to the
input set of validation statistics, which is defined as
`0.75*dindex+0.25*highdgap`

, aggregating these two closely
related statistics, see `clustatsum`

.

Hennig, C. (2019) Cluster validation by measurement of clustering
characteristics relevant to the user. In C. H. Skiadas (ed.)
*Data Analysis and Applications 1: Clustering and Regression,
Modeling-estimating, Forecasting and Data Mining, Volume 2*, Wiley,
New York 1-24,
https://arxiv.org/abs/1703.09282

Akhanli, S. and Hennig, C. (2020) Calibrating and aggregating cluster
validity indexes for context-adapted comparison of clusterings.
*Statistics and Computing*, 30, 1523-1544,
https://link.springer.com/article/10.1007/s11222-020-09958-2, https://arxiv.org/abs/2002.01822

`valstat.object`

, `clusterbenchstats`

, `stupidkcentroids`

, `stupidknn`

, \codestupidkfn, `stupidkaven`

, codeclustatsum

# NOT RUN { set.seed(20000) options(digits=3) face <- rFace(10,dMoNo=2,dNoEy=0,p=2) dif <- dist(face) clusum <- list() clusum[[2]] <- list() cl12 <- kmeansCBI(face,2) cl13 <- kmeansCBI(face,3) cl22 <- claraCBI(face,2) cl23 <- claraCBI(face,2) ccl12 <- clustatsum(dif,cl12$partition) ccl13 <- clustatsum(dif,cl13$partition) ccl22 <- clustatsum(dif,cl22$partition) ccl23 <- clustatsum(dif,cl23$partition) clusum[[1]] <- list() clusum[[1]][[2]] <- ccl12 clusum[[1]][[3]] <- ccl13 clusum[[2]][[2]] <- ccl22 clusum[[2]][[3]] <- ccl23 clusum$maxG <- 3 clusum$minG <- 2 clusum$method <- c("kmeansCBI","claraCBI") clusum$name <- c("kmeansCBI","claraCBI") clusim <- randomclustersim(dist(face),G=2:3,nnruns=1,kmruns=1, fnruns=1,avenruns=1,monitor=FALSE) cgr <- cgrestandard(clusum,clusim,2:3) cgr2 <- cgrestandard(clusum,clusim,2:3,useallg=TRUE) cgr3 <- cgrestandard(clusum,clusim,2:3,percentage=TRUE) print(str(cgr)) print(str(cgr2)) print(cgr3[[1]][[2]]) # }