Classification of a dataset using the kmeans or hclust algorithm and representation of the clusters on a map.
clustermap() classifies the sites using the variables named in
names.var and draws a bar plot of the resulting clusters.
The classification methods come from
hclust() (hierarchical cluster analysis) and
kmeans() (k-means clustering); the number of classes is set with clustnum.
clustermap(sp.obj, names.var, clustnum, method=c("kmeans","hclust"),
  type=NULL, centers=NULL, scale=FALSE, names.arg="",
  names.attr=names(sp.obj), criteria=NULL, carte=NULL, identify=FALSE,
  cex.lab=0.8, pch=16, col="lightblue3", xlab="Cluster", ylab="Number",
  axes=FALSE, lablong="", lablat="")
- sp.obj: an object of a class extending Spatial-class
- names.var: a character vector of attribute names, or the column numbers in the attribute table
- clustnum: integer, the number of clusters
- method: one of two methods, "kmeans" (the default) or "hclust"
- type: if method="hclust", type="complete" by default (the possible values are listed in help(hclust), e.g. "ward", "single"). If method="kmeans", type="Hartigan-Wong" by default (the possible values are listed in help(kmeans), e.g. "Forgy")
- centers: if method="kmeans", the user can supply a matrix of initial cluster centers
- scale: if scale=TRUE, the variables are scaled before classification
- names.arg: a character vector giving the names of the clusters
- names.attr: names to use in the panel (if different from the names of the variables in sp.obj)
- criteria: a logical vector with one element per spatial unit; the preselected sites can be marked with a cross from the tcltk window
- carte: a matrix with 2 columns used to draw spatial polygonal contours: the x and y coordinates of the vertices of the polygon
- identify: if not FALSE, identify the plotted objects (currently only working for point plots). The labels used for identification are the row names of the attribute table, row.names(as.data.frame(sp.obj))
- cex.lab: character size of the labels
- pch: a vector of plotting symbols whose length must equal the number of groups; otherwise the same symbol is used for all sites
- col: a vector of colors whose length must equal the number of groups; otherwise the same color is used for all sites and all bars
- xlab: a title for the x-axis of the graphic
- ylab: a title for the y-axis of the graphic
- axes: a boolean, TRUE to draw axes on the map
- lablong: name of the x-axis that will be printed on the map
- lablat: name of the y-axis that will be printed on the map
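As a rough illustration of what the method, type, centers and scale arguments refer to, the sketch below runs the underlying base R functions directly. It is not the source of clustermap(): the mapping of type onto the algorithm argument of kmeans() and the method argument of hclust() is an assumption drawn from the descriptions above, and the built-in USArrests data frame is only a stand-in for the attribute table of a spatial object.

```r
# Stand-in data (assumption): a built-in numeric data frame,
# used here in place of the attribute table of sp.obj.
x <- USArrests
clustnum <- 3

# scale=TRUE presumably amounts to standardizing each column first.
x.scaled <- scale(x)

# method="kmeans" with type="Hartigan-Wong" (the default algorithm of kmeans)
set.seed(1)  # kmeans draws random starting centers when given only a count
km <- kmeans(x.scaled, centers = clustnum, algorithm = "Hartigan-Wong")

# A matrix of initial centers (one row per cluster), as `centers` allows
init <- x.scaled[1:clustnum, , drop = FALSE]
km.init <- kmeans(x.scaled, centers = init)

# method="hclust" with type="complete" (the default linkage of hclust)
hc <- hclust(dist(x.scaled), method = "complete")
hc.class <- cutree(hc, k = clustnum)  # cut the tree into clustnum groups
```

Note that recent versions of R name the Ward linkages "ward.D" and "ward.D2" in hclust(), so the current list in help(hclust) is the one to rely on.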
The two windows are interactive: the sites belonging to a bar selected on the bar plot are shown in red on the map, and the values of the sites selected on the map with `points' or `polygon' are shown in red on the bar plot. With the 'hclust' method, the dendrogram is also drawn. Optionally, the classification method can be chosen.
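The link between the two windows comes down to simple indexing on the vector of cluster numbers. The following is a minimal, non-interactive sketch of that idea in base R; the variable names and the USArrests stand-in data are hypothetical and are not taken from the GeoXp code.

```r
# Stand-in data (assumption): a built-in data set instead of a Spatial object.
x <- scale(USArrests)
cluster <- cutree(hclust(dist(x), method = "complete"), k = 3)

# Heights of the bars: number of sites in each cluster
bar.heights <- table(cluster)

# Selecting a bar: the sites to highlight are those in that cluster
picked.bar <- 2
sites.in.bar <- which(cluster == picked.bar)

# Selecting sites: count, per cluster, how many of the picked sites fall in it
picked.sites <- c(1, 5, 10)
bars.hit <- table(factor(cluster[picked.sites], levels = 1:3))
```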
When the user clicks the "save results" button, a list is created as a global variable. It contains:
- obs, an integer vector giving the numbers of the spatial units selected just before leaving the Tk window,
- vectclass, an integer vector giving the cluster number assigned to each spatial unit.
clustermap() calls hclust() and kmeans() with many of their arguments left at the default values.
To change these arguments, the user should call these functions directly first and
then use the function
barmap() to visualize the calculated clusters.
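A minimal sketch of the first half of that workflow, assuming base R only: the clusters are computed with non-default arguments and stored as factors, ready to be attached to an attribute table. The USArrests data and the object name classes are stand-ins chosen for illustration; the call to barmap() is left out because its signature is not given on this page.

```r
# Stand-in data (assumption): a built-in data set instead of a Spatial object.
x <- scale(USArrests)

# k-means with non-default arguments: several random starts, more iterations
set.seed(42)
km <- kmeans(x, centers = 4, nstart = 25, iter.max = 50)

# Hierarchical clustering with a non-default distance and linkage
hc <- hclust(dist(x, method = "manhattan"), method = "average")

# Cluster memberships as factors, one row per unit
classes <- data.frame(
  km = factor(km$cluster),
  hc = factor(cutree(hc, k = 4)),
  row.names = rownames(x)
)
```

With a spatial object, a column such as classes$km could be added to the attribute table and then displayed with barmap().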
Thibault Laurent, Anne Ruiz-Gazen, Christine Thomas-Agnan (2012), GeoXp: An R Package for Exploratory Spatial Data Analysis. Journal of Statistical Software, 47(2), 1-23.
Murtagh, F (1985). Multidimensional Clustering Algorithms.
Hartigan, J. A. and Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics 28, 100-108.
Roger S. Bivand, Edzer J. Pebesma, Virgilio Gomez-Rubio (2009), Applied Spatial Data Analysis with R, Springer.
# data columbus
require("maptools")
example(columbus)

# a basic example using the kmeans method
clustermap(columbus, c("HOVAL","INC","CRIME","OPEN","PLUMB","DISCBD"), 3,
  criteria=(columbus@data$CP==1), identify=TRUE, cex.lab=0.7)

# example using the hclust method
clustermap(columbus, c(7:12), 3, method="hclust",
  criteria=(columbus@data$CP==1), col=colors()[20:22], identify=TRUE,
  cex.lab=0.7, names.arg=c("Group 1","Group 2","Group 3"), xlab="Cluster")