geocmeans

An R package to perform Spatial Fuzzy C-means.

The website of the package is available here

Breaking news

Here we are! We are moving from maptools, sp, rgeos, raster and rgdal to sf, terra and tmap. All functions and documentation were modified accordingly. If you spot an error or a bug, please open an issue on GitHub.

Installation

The stable version of geocmeans is available on CRAN. You can install it with the command below.

install.packages("geocmeans")

You can install a development version of the geocmeans package using the command below.

remotes::install_github(repo = "JeremyGelb/geocmeans", build_vignettes = TRUE, force = TRUE)

Authors

Jeremy Gelb, Laboratoire d’Équité Environnemental INRS (CANADA), Email: jeremy.gelb@ucs.inrs.ca

Contributors

Philippe Apparicio, Laboratoire d’Équité Environnemental INRS (CANADA), Email: philippe.apparicio@ucs.inrs.ca

About the package

Provides functions to apply the Spatial Fuzzy C-Means algorithm and to visualize and interpret its results. This method is well suited when the user wants to analyze data with a fuzzy clustering algorithm while accounting for the spatial dimension of the dataset. In addition, indexes for measuring spatial consistency and classification quality are proposed. The algorithms were first developed for brain imagery, as described in the articles of Cai et al. (2007) and Zhao et al. (2013). Gelb and Apparicio proposed applying the method to build a socio-residential and environmental taxonomy in Lyon (France). The methods can be applied to dataframes or to rasters.

Fuzzy classification algorithms

Four fuzzy classification algorithms are proposed:

  • FCM: Fuzzy C-Means, with the function CMeans
  • GFCM: Generalized Fuzzy C-Means, with the function GCMeans
  • SFCM: Spatial Fuzzy C-Means, with the function SFCMeans
  • SGFCM: Spatial Generalized Fuzzy C-Means, with the function SGFCMeans

Each function returns a membership matrix, the data used for the classification (scaled if required) and the centers of the clusters.
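To make the notion of a membership matrix concrete, here is a minimal base R sketch of the standard fuzzy c-means membership update for fixed centers. It is not the package's implementation: the function name fcm_membership and its arguments are hypothetical and only illustrate what each row of the matrix contains.

```r
# Standard fuzzy c-means membership update (base R only, illustrative).
# data:    numeric matrix, one row per observation
# centers: numeric matrix, one row per cluster, same columns as data
# m:       fuzziness exponent, must be > 1
fcm_membership <- function(data, centers, m = 2) {
  # Squared Euclidean distance from every observation to every center
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(data, 2, centers[j, ])^2)
  })
  # Guard against division by zero when an observation sits on a center
  d2 <- pmax(d2, .Machine$double.eps)
  # Membership is proportional to d2^(-1 / (m - 1)); rows are normalized to 1
  w <- d2^(-1 / (m - 1))
  w / rowSums(w)
}

# Toy data: two observations near each of two hypothetical centers
data <- rbind(c(0, 0), c(1, 0), c(5, 5), c(4, 5))
centers <- rbind(c(0, 0), c(5, 5))
U <- fcm_membership(data, centers)
print(round(U, 3))
```

Each row of U sums to 1, and observations closer to a center receive a higher membership value for the corresponding cluster.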

For each algorithm, it is possible to calculate a “robust version” and to add a noise group (used to catch outliers). See the parameters robust and noise_cluster in the documentation for more details.

Parameter selections

The available algorithms require several parameters to be set by the user. The function selectParameters is a useful tool to compare the results of different combinations of parameters. A multicore version, selectParameters.mc, which uses a plan from the package future, is also available to speed up the computation.

Classification quality

Many indices of classification quality can be calculated with the function calcqualityIndexes:

  • Silhouette.index: the silhouette index (fclust::SIL.F)
  • Partition.entropy: the partition entropy index (fclust::PE)
  • Partition.coeff: the partition coefficient (fclust::PC)
  • Modified.partition.coeff: the modified partition coefficient (fclust::MPC)
  • XieBeni.index: the Xie and Beni index (fclust::XB)
  • FukuyamaSugeno.index: the Fukuyama and Sugeno index (geocmeans::calcFukuyamaSugeno)
  • DavidBoudlin.index: the Davies-Bouldin index (geocmeans::calcDaviesBouldin)
  • CalinskiHarabasz.index: the Calinski-Harabasz index (geocmeans::calcCalinskiHarabasz)
  • GD43.index and GD53.index: two versions of the generalized Dunn index (geocmeans::calcGD43 and calcGD53)
  • Negentropy.index: the Negentropy Increment index (geocmeans::calcNegentropyI)
  • Explained.inertia: the percentage of total inertia explained by the solution
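As an illustration of what some of these indices measure, the sketch below computes the partition coefficient, the partition entropy and the modified partition coefficient from a membership matrix using their textbook definitions in base R. The function names are hypothetical, the natural logarithm is an assumption, and the values returned by the package or by fclust may differ in detail.

```r
# Textbook definitions computed from a membership matrix U
# (rows = observations, columns = clusters, each row sums to 1).

# Partition coefficient: mean of the squared memberships, between 1/k and 1
partition_coefficient <- function(U) sum(U^2) / nrow(U)

# Partition entropy (natural log assumed); 0 for a crisp partition
partition_entropy <- function(U) {
  terms <- ifelse(U > 0, U * log(U), 0)  # treat 0 * log(0) as 0
  -sum(terms) / nrow(U)
}

# Modified partition coefficient: rescales PC to the range [0, 1]
modified_partition_coefficient <- function(U) {
  k <- ncol(U)
  1 - (k / (k - 1)) * (1 - partition_coefficient(U))
}

crisp <- diag(3)                 # every observation fully in one cluster
fuzzy <- matrix(1 / 3, 3, 3)     # every observation equally split

pc_crisp <- partition_coefficient(crisp)
pc_fuzzy <- partition_coefficient(fuzzy)
pe_crisp <- partition_entropy(crisp)
pe_fuzzy <- partition_entropy(fuzzy)
mpc_fuzzy <- modified_partition_coefficient(fuzzy)
print(c(pc_crisp, pc_fuzzy, pe_crisp, pe_fuzzy, mpc_fuzzy))
```

A crisp partition gives the best value of each index, while a completely undecided partition gives the worst, which is why these indices are commonly used to compare candidate values of k and m.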

Classification consistency

To assess the stability of the obtained clusters, a function for bootstrap validation is proposed: boot_group_validation. The results can be used to verify if the obtained clusters are stable and how much their centres vary.

Reproducibility

Clustering methods like CMeans depend on the initial centers selected. In geocmeans, they are selected randomly, so two runs of the same function can yield different results. To facilitate reproducibility, the main functions of the package (CMeans, GCMeans, SFCMeans, SGFCMeans, selectParameters, selectParameters.mc) have a seed parameter. It can be set by the user to ensure that the results of the functions are exactly the same.
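The effect of a seed on random initialization can be illustrated in base R. The helper pick_initial_centers below is hypothetical and only mimics the idea of drawing initial centers at random; it does not show how the package handles its seed parameter internally.

```r
# Draw k rows of the data as initial centers; a seed makes the draw repeatable.
pick_initial_centers <- function(data, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  data[sample(nrow(data), k), , drop = FALSE]
}

data <- matrix(seq_len(40), ncol = 2)

# Same seed: the two runs start from exactly the same centers
run_a <- pick_initial_centers(data, k = 3, seed = 123)
run_b <- pick_initial_centers(data, k = 3, seed = 123)
same_start <- identical(run_a, run_b)
print(same_start)
```

Without a seed, the two draws would generally differ, and an iterative algorithm started from them could converge to different solutions.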

Interpretation

Several functions are also available to facilitate the interpretation of the classification:

  • summary statistics for each cluster: summarizeClusters (also accessible with the generic function summary)
  • spider charts: spiderPlots
  • violin plots: violinPlots
  • maps of the membership matrix: mapClusters (supports polygons, points and polylines)

There is also a shiny app that can be used to go deeper in the result interpretation. It requires the packages shiny, leaflet, bslib, plotly, shinyWidgets, car.

Spatial diagnostic

Several spatial indices can be calculated to better understand the spatial structure of the obtained clusters, such as the global or local Moran's I calculated on the membership values, or the join count test on the most likely group for each observation. ELSA and Fuzzy ELSA statistics can also be calculated to identify areas with high or low multidimensional spatial autocorrelation in the membership values. See functions spConsistency, calcELSA, calcFuzzyELSA and spatialDiag.
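For readers unfamiliar with Moran's I, the following base R sketch applies the usual global formula to one column of a membership matrix. The function morans_i, the binary weight matrix and the toy layout (20 observations along a line) are assumptions made for illustration; the package relies on its own functions and spatial weights.

```r
# Global Moran's I for a numeric vector x and a spatial weight matrix W
morans_i <- function(x, W) {
  z <- x - mean(x)                      # deviations from the mean
  n <- length(x)
  (n / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Toy layout: 20 observations on a line, each linked to its direct neighbours
n <- 20
W <- matrix(0, n, n)
idx <- seq_len(n - 1)
W[cbind(idx, idx + 1)] <- 1
W[cbind(idx + 1, idx)] <- 1

# Membership values for one cluster: high on the left half, low on the right
membership <- rep(c(1, 0), each = 10)
I_value <- morans_i(membership, W)
print(I_value)
```

A value close to 1 indicates that neighbouring observations have similar membership values for that cluster, which is the situation a spatially coherent classification should produce.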

We proposed an index to quantify the spatial inconsistency of a classification (Gelb and Apparicio). If close observations tend to belong to the same group, then the value of the index is close to 0. If the index is close to 1, then group membership is randomly distributed in space. A value higher than 1 can occur in the case of negative spatial autocorrelation. The index is described in the vignette adjustinconsistency. The function spatialDiag performs a complete spatial diagnostic of the membership matrix resulting from a classification.
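The behaviour described above (near 0 for spatially coherent groups, near 1 for random ones) can be reproduced with a simplified ratio: the weighted sum of squared differences between the memberships of neighbours, divided by the mean of the same quantity after random permutations of the observations. This is only a sketch under that assumption, with hypothetical function names; the exact definition used by spConsistency is given in the vignette.

```r
# Weighted sum of squared differences between neighbours' membership rows
weighted_sq_diff <- function(U, W) {
  D2 <- as.matrix(dist(U))^2   # squared Euclidean distance between rows of U
  sum(W * D2)
}

# Ratio of the observed value to the mean value under random permutations
inconsistency_ratio <- function(U, W, nrep = 99, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  observed <- weighted_sq_diff(U, W)
  permuted <- replicate(
    nrep,
    weighted_sq_diff(U[sample(nrow(U)), , drop = FALSE], W)
  )
  observed / mean(permuted)
}

# Toy layout: 20 observations on a line, linked to their direct neighbours
n <- 20
W <- matrix(0, n, n)
idx <- seq_len(n - 1)
W[cbind(idx, idx + 1)] <- 1
W[cbind(idx + 1, idx)] <- 1

# Two perfectly contiguous groups: left half in group 1, right half in group 2
U <- cbind(rep(c(1, 0), each = 10), rep(c(0, 1), each = 10))
ratio <- inconsistency_ratio(U, W, nrep = 99, seed = 42)
print(ratio)
```

With two contiguous groups, only one pair of neighbours disagrees, so the ratio is well below 1; shuffling the rows of U before computing it would bring the ratio close to 1.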

Examples

Detailed examples are given in the vignette introduction:

vignette("introduction","geocmeans")

Testing

If you would like to install and run the unit tests interactively, include INSTALL_opts = "--install-tests" in the installation code.

remotes::install_github(repo = "JeremyGelb/geocmeans", build_vignettes = TRUE, force = TRUE, INSTALL_opts = "--install-tests")
testthat::test_package("geocmeans", reporter = "stop")

Contribute

To contribute to geocmeans, please follow these guidelines.

Please note that the geocmeans project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

geocmeans version 0.3.4 is licensed under the GPL-2 license.

Monthly Downloads

261

Version

0.3.4

License

GPL-2

Maintainer

Jeremy Gelb

Last Published

September 12th, 2023

Functions in geocmeans (0.3.4)

  • belongsSFCM: membership matrix calculator for SFCM algorithm
  • adjustSpatialWeights: Semantic adjusted spatial weights
  • adj_spconsist_arr_window_globstd: Adjusted spatial inconsistency index for rasters
  • belongsFCM: membership matrix calculator for FCM algorithm
  • boot_group_validation.mc: Check that the obtained groups are stable by bootstrap (multicore)
  • barPlots: Bar plots
  • belongsSGFCM: membership matrix calculator for SGFCM algorithm
  • belongsGFCM: membership matrix calculator for GFCM algorithm
  • boot_worker: Worker function for cluster bootstrapping
  • boot_group_validation: Check the robustness of a classification by Bootstrap
  • calcEuclideanDistance: Calculate the Euclidean distance
  • calcEuclideanDistance2: euclidean distance between rows of a matrix and a vector
  • calcEuclideanDistance3: euclidean distance between rows of a matrix and a vector (arma mode)
  • calcCalinskiHarabasz: Calinski-Harabasz index
  • calcCentroids: Calculate the centroids
  • calcFGCMBelongMatrix: Calculate the generalized membership matrix
  • calcGD53: Generalized Dunn’s index (53)
  • calcGD43: Generalized Dunn’s index (43)
  • calcDaviesBouldin: Davies-Bouldin index
  • calcQualIdx: calculate the quality index required
  • calcNegentropyI: Negentropy Increment index
  • calcRobustSigmas: Calculate sigmas for the robust version of the c-means algorithm
  • calcLaggedData: Lagged Data
  • calcWdataRaster: Calculate lagged values for a raster dataset
  • calcUncertaintyIndex: Diversity index
  • calc_jaccard_idx: Jaccard similarity coefficient
  • calc_jaccard_mat: Jaccard similarity coefficient between columns of two matrices
  • calc_local_moran_raster: Local Moran I for raster
  • calc_moran_raster: Global Moran I for raster
  • calcBelongMatrix: Calculate the membership matrix
  • calcFuzzyELSA: calculate ELSA statistic for a fuzzy partition
  • calcFGCMBelongMatrixNoisy: Calculate the generalized membership matrix with a noise cluster
  • calcSFCMBelongMatrix: Calculate the membership matrix (spatial version)
  • calcBelongMatrixNoisy: Calculate the membership matrix with a noise cluster
  • calcqualityIndexes: Quality indexes
  • cat_to_belongings: Convert categories to membership matrix
  • div_matrices_bycol: element wise division of two matrices by column
  • centersSFCM: center matrix calculator for SFCM algorithm
  • calcSFGCMBelongMatrix: Calculate the generalized membership matrix (spatial version)
  • calcFuzzyElsa_raster: Local Fuzzy ELSA statistic for raster
  • calcSFGCMBelongMatrixNoisy: Calculate the generalized membership matrix (spatial version) with a noise cluster
  • centersSGFCM: center matrix calculator for SGFCM algorithm
  • calcELSA: calculate ELSA statistic for a hard partition
  • calcSFCMBelongMatrixNoisy: Calculate the membership matrix (spatial version) with a noise cluster
  • check_matdist: Check validity of a dissimilarity matrix
  • elsa_raster: calculate ELSA spatial statistic for raster dataset
  • geocmeans_env: geocmeans general environment
  • geocmeans: A package implementing methods for spatially constrained c-means algorithm
  • calc_raster_spinconsistency: calculate spatial inconsistency for raster
  • power_mat: power of a matrix
  • pow_matrices_bycol: element wise power of a matrix by column
  • local_moranI_matrix_window: Local Moran I calculated on a matrix with a given window
  • check_raters_dims: Check dimensions of a list of rasters
  • main_worker: Main worker function
  • is.FCMres: is method for FCMres
  • plot.FCMres: Plot method for FCMres object
  • summarizeClusters: Descriptive statistics by group
  • summary.FCMres: Summary method for FCMres
  • predict.FCMres: Predict method for FCMres object
  • focal_euclidean_arr_window: focal euclidean distance on a matrix with a given window for a cube
  • kppCenters: kpp centers selection
  • spiderPlots: Spider chart
  • elsa_fuzzy_vector: Local Fuzzy ELSA statistic for vector
  • test_inferior_mat: create a logical matrix with inferior comparison
  • focal_adj_mean_arr_window: focal mean weighted by inverse of euclidean distance on a cube
  • calcFukuyamaSugeno: Fukuyama and Sugeno index
  • max_mat: maximum in a matrix
  • focal_euclidean: focal euclidean distance on a list of matrices
  • mapThis: Mapping the clusters
  • uncertaintyMap: Uncertainty map
  • calcexplainedInertia: Explained inertia index
  • prod_matrices_bycol: element wise product of two matrices by column
  • circular_window: Circular window
  • standardizer: Standardizing helper
  • rowmins_mat: minimum of each row of a matrix
  • sqrt_matrix_bycol: element wise square root of a matrix by column
  • calcSWFCCentroids: Calculate the centroids of SFCM
  • check_window: Check the shape of a window
  • groups_matching: Match the groups obtained from two classifications
  • undecidedUnits: Undecided observations
  • select_parameters.mc: Select parameters for clustering algorithm (multicore)
  • spConsistency: Spatial consistency index
  • input_raster_data: Raster data preparation
  • elsa_vector: calculate ELSA spatial statistic for vector dataset
  • calcSilhouetteIdx: Fuzzy Silhouette index
  • vecmin: minimum of a vector
  • spatialDiag: Spatial diagnostic
  • sp_clust_explorer: Classification result explorer
  • centersFCM: center matrix calculator for FCM algorithm
  • centersGFCM: center matrix calculator for GFCM algorithm
  • print.FCMres: print method for FCMres
  • eval_parameters: Worker function
  • mapClusters: Mapping the clusters
  • vector_out_prod: create a matrix by multiplying a vector by its elements one by one as rows
  • moranI_matrix_window: Moran I calculated on a matrix with a given window
  • mapRasters: Mapping the clusters (rasters)
  • evaluateMatrices: Matrix evaluation
  • focal_euclidean_mat_window: focal euclidean distance on a matrix with a given window
  • sanity_check: Parameter checking function
  • predict_membership: Predict matrix membership for new observations
  • violinPlots: Violin plots
  • select_parameters: Select parameters for a clustering algorithm
  • sub_matrices_bycol: subtraction of two matrices by column
  • output_raster_data: Raster result transformation
  • FCMres: Instantiate a FCMres object
  • GCMeans: Generalized C-means
  • SGFCMeans: SGFCMeans
  • Elsa_categorical_matrix_window: Elsa statistic calculated on a matrix with a given window
  • add_matrices_bycol: sum of two matrices by column
  • Elsa_fuzzy_matrix_window: Fuzzy Elsa statistic calculated on a matrix with a given window
  • LyonIris: social and environmental indicators for the Iris of the metropolitan region of Lyon (France)
  • Arcachon: SpatRaster of the bay of Arcachon
  • CMeans: C-means
  • SFCMeans: SFCMeans