varclust (version 0.9.4)

integration: Computes integration and acontamination of the clustering

Description

Integration and acontamination are measures of the quality of a clustering with respect to a true partition. Let \(X = (x_1, \ldots, x_p)\) be the data set, \(A\) be a partition into clusters \(A_1, \ldots, A_n\) (the true partition) and \(B\) be a partition into clusters \(B_1, \ldots, B_m\). Then for cluster \(A_j\) the integration is equal to: $$Int(A_j) = \frac{\max_{k = 1, \ldots, m} \# \{ i \in \{ 1, \ldots, p \}: x_i \in A_j \wedge x_i \in B_k \} }{\# A_j}$$ The \(B_k\) for which the value is maximized is called the integrating cluster of \(A_j\). The integration of the whole clustering is \(Int(A,B) = \frac{1}{n} \sum_{j=1}^n Int(A_j)\). The acontamination is defined by: $$Acont(A_j) = \frac{ \# \{ i \in \{ 1, \ldots, p \}: x_i \in A_j \wedge x_i \in B_k \} }{\# B_k}$$ where \(B_k\) is the integrating cluster of \(A_j\). The acontamination of the whole clustering is \(Acont(A,B) = \frac{1}{n} \sum_{j=1}^n Acont(A_j)\).
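The two definitions can be followed step by step in a few lines of base R. The sketch below is an illustration only and is not the package's implementation: the function name `integration_sketch` is hypothetical, it returns a named vector rather than the array returned by `integration`, and ties between candidate integrating clusters are resolved by taking the first maximum, which is an assumption.

```r
# Hypothetical helper, not part of varclust: computes both measures from
# two label vectors by following the definitions in the Description.
integration_sketch <- function(group, true_group) {
  stopifnot(length(group) == length(true_group))
  # Contingency table: rows are true clusters A_j, columns are clusters B_k.
  counts <- table(true_group, group)
  # Column index of the integrating cluster for every A_j
  # (assumption: the first maximum wins when there is a tie).
  best <- apply(counts, 1, which.max)
  # Size of the overlap between each A_j and its integrating cluster.
  overlap <- counts[cbind(seq_len(nrow(counts)), best)]
  int_j <- overlap / rowSums(counts)          # Int(A_j)
  acont_j <- overlap / colSums(counts)[best]  # Acont(A_j)
  c(integration = mean(int_j), acontamination = mean(acont_j))
}

# Toy example: one element of the first true cluster is misassigned.
true_labels <- c(1, 1, 1, 1, 2, 2, 2, 2)
found_labels <- c(1, 1, 1, 2, 2, 2, 2, 2)
integration_sketch(found_labels, true_labels)
```

In the toy example the first true cluster keeps 3 of its 4 elements in its integrating cluster, so its integration is 3/4, and the second keeps all 4 but shares its integrating cluster with one foreign element, so its acontamination is 4/5. The averages are therefore 0.875 for integration and 0.9 for acontamination.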

Usage

integration(group, true_group)

Arguments

group

A vector of cluster labels, the first partition (the clustering being evaluated, \(B\) in the Description).

true_group

A vector of cluster labels, the second (reference) partition (the true partition, \(A\) in the Description).

Value

An array containing values of integration and acontamination.

References

M. Sołtys. Metody analizy skupień. Master’s thesis, Wrocław University of Technology, 2010.

Examples

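# NOT RUN {
library(varclust)
# Simulate data with n = 20 observations and K = 2 clusters of 50 variables each.
sim.data <- data.simulation(n = 20, SNR = 1, K = 2, numb.vars = 50, max.dim = 2)
# True labels: the first 50 variables belong to cluster 1, the next 50 to cluster 2.
true_segmentation <- rep(1:2, each = 50)
# Fit MLCC with 2 clusters on a single core.
mlcc.fit <- mlcc.reps(sim.data$X, numb.clusters = 2, max.dim = 2, numb.cores = 1)
# Compare the estimated segmentation with the true one.
integration(mlcc.fit$segmentation, true_segmentation)
# }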