index.G2

Calculates G2 internal cluster quality index

Calculates G2 internal cluster quality index - Baker & Hubert adaptation of Goodman & Kruskal's Gamma statistic

Keywords
cluster
Usage
index.G2(d,cl)
Arguments
d

'dist' object

cl

A vector of integers indicating the cluster to which each object is allocated

Details

See the file $R_HOME\library\clusterSim\pdf\indexG2_details.pdf for further details.
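
For orientation, the Goodman-Kruskal Gamma statistic as adapted to clustering by Baker & Hubert is usually written as below. The symbols s(+) and s(-) follow the common presentation in the cited literature (e.g. Milligan & Cooper, 1985) and are an assumption here; the package's details PDF may use different notation.

```latex
G2 = \frac{s(+) - s(-)}{s(+) + s(-)}
```

Here s(+) is the number of concordant comparisons (a within-cluster distance smaller than a between-cluster distance) and s(-) is the number of discordant comparisons (a within-cluster distance larger than a between-cluster distance), taken over all pairs of one within-cluster and one between-cluster distance. The index lies between -1 and 1, with larger values indicating a better partition.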

Value

The calculated G2 index (a single numeric value)

References

Everitt, B.S., Landau, E., Leese, M. (2001), Cluster analysis, Arnold, London, p. 104. ISBN 9780340761199.

Gatnar, E., Walesiak, M. (Eds.) (2004), Metody statystycznej analizy wielowymiarowej w badaniach marketingowych [Multivariate statistical analysis methods in marketing research], Wydawnictwo AE, Wroclaw, p. 339. Available at: http://keii.ue.wroc.pl/pracownicy/mw/2004_Gatnar_Walesiak_Metody_SAW_w_badaniach_marketingowych.pdf.

Gordon, A.D. (1999), Classification, Chapman & Hall/CRC, London, p. 62. ISBN 9781584880134.

Hubert, L. (1974), Approximate evaluation techniques for the single-link and complete-link hierarchical clustering procedures, "Journal of the American Statistical Association", vol. 69, no. 347, 698-704. Available at: http://dx.doi.org/10.1080/01621459.1974.10480191.

Milligan, G.W., Cooper, M.C. (1985), An examination of procedures for determining the number of clusters in a data set, "Psychometrika", vol. 50, no. 2, 159-179. Available at: https://dx.doi.org/10.1007/BF02294245.

See Also

index.G1, index.G3, index.S, index.H, index.KL, index.Gap, index.DB

Aliases
  • index.G2
Examples
# NOT RUN {
# Example 1
library(clusterSim)
data(data_ratio)
d <- dist.GDM(data_ratio)
# named 'cl' rather than 'c' so that base::c is not masked
cl <- pam(d, 5, diss = TRUE)
icq <- index.G2(d, cl$clustering)
print(icq)

# Example 2
library(clusterSim)
data(data_ordinal)
d <- dist.GDM(data_ordinal, method="GDM2")
# nc - number of clusters
min_nc <- 2
max_nc <- 6
# res: column 1 = number of clusters, column 2 = G2 value
res <- array(0, c(max_nc - min_nc + 1, 2))
res[, 1] <- min_nc:max_nc
clusters <- NULL
for (nc in min_nc:max_nc)
{
  cl2 <- pam(d, nc, diss = TRUE)
  res[nc - min_nc + 1, 2] <- index.G2(d, cl2$clustering)
  # keep each partition so the best one can be printed later
  clusters <- rbind(clusters, cl2$clustering)
}
print(paste("max G2 for", (min_nc:max_nc)[which.max(res[, 2])], "clusters=", max(res[, 2])))
print("clustering for max G2")
print(clusters[which.max(res[, 2]), ])
write.table(res, file = "G2_res.csv", sep = ";", dec = ",", row.names = TRUE, col.names = FALSE)
plot(res, type = "p", pch = 0, xlab = "Number of clusters", ylab = "G2", xaxt = "n")
axis(1, c(min_nc:max_nc))
# }
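
# Example 3 (illustrative sketch, not part of the package)
The following base-R sketch shows one way the concordant/discordant counting behind G2 could be done by hand on a tiny data set. The helper name g2_naive is hypothetical and is not a clusterSim function. Tied comparisons are simply ignored here, which is an assumption and may differ from how index.G2 treats ties. The outer() call builds every within/between comparison at once, so this is only practical for small data sets.

```r
# Naive G2 sketch: compare every within-cluster distance
# with every between-cluster distance.
# g2_naive is a hypothetical helper, not part of clusterSim.
g2_naive <- function(d, cl) {
  m <- as.matrix(d)
  # each unordered pair of objects once (upper triangle)
  idx <- which(upper.tri(m), arr.ind = TRUE)
  dist_vals <- m[idx]
  # TRUE when both objects of a pair share a cluster
  same <- cl[idx[, 1]] == cl[idx[, 2]]
  within <- dist_vals[same]
  between <- dist_vals[!same]
  # rows: within-cluster distances, columns: between-cluster distances
  cmp <- outer(within, between, "-")
  s_plus <- sum(cmp < 0)   # concordant: within < between
  s_minus <- sum(cmp > 0)  # discordant: within > between
  # assumption: ties (cmp == 0) are left out of both counts
  (s_plus - s_minus) / (s_plus + s_minus)
}

x <- c(0, 1, 10, 11)
# two well separated groups: every comparison is concordant, G2 = 1
print(g2_naive(dist(x), c(1, 1, 2, 2)))
# groups that mix near and far points: G2 = -0.5
print(g2_naive(dist(x), c(1, 2, 1, 2)))
```

With clusterSim installed, the same toy partitions could be passed to index.G2(dist(x), cl) to compare against the package's own result.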
Documentation reproduced from package clusterSim, version 0.47-3, License: GPL (>= 2)
