correlation_clustering: Hierarchical Correlation Clustering

Description

Performs hierarchical correlation clustering: builds clusters from the data, measures the distance from a target to each cluster using the chosen distance metric and optional weights, and returns the clusters sorted by that distance.

Usage

correlation_clustering(
  data,
  target = NULL,
  weight = c(),
  distance_method = "euclidean",
  normalize = TRUE,
  labels = NULL,
  learn = FALSE,
  waiting = TRUE
)

Value

An R object containing:

dendrogram - A hierarchical clustering dendrogram
sortedValues - A data frame with the sorted cluster values
distances - A data frame with the sorted distances

Arguments

data

A data frame containing the main data

target

A data frame, numeric vector or matrix to use as correlation target. Default is NULL.

weight

A numeric vector of weights. Default is an empty vector.

distance_method

A string specifying the distance metric to use. Options are:

"euclidean" - Euclidean distance
"manhattan" - Manhattan distance
"canberra" - Canberra distance
"chebyshev" - Chebyshev distance

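As a rough illustration of what the four metrics compute, here is a minimal base R sketch using the textbook, unweighted definitions. The helper `sketch_distance` is hypothetical and is not part of the package; how `correlation_clustering` applies weights internally is not shown here.

```r
# Hypothetical helper (not the package's implementation): textbook distance
# between two numeric vectors of equal length, without weights.
sketch_distance <- function(x, y, method = "euclidean") {
  d <- abs(x - y)
  switch(
    method,
    euclidean = sqrt(sum(d^2)),
    manhattan = sum(d),
    # Terms where both values are zero (0/0) are counted as 0 here.
    canberra  = sum(ifelse(d == 0, 0, d / (abs(x) + abs(y)))),
    chebyshev = max(d),
    stop("unknown method: ", method)
  )
}

x <- c(1, 2, 3)
y <- c(4, 6, 3)
sketch_distance(x, y, "euclidean")  # 5
sketch_distance(x, y, "manhattan")  # 7
sketch_distance(x, y, "canberra")   # 1.1
sketch_distance(x, y, "chebyshev")  # 4
```

Base R's `dist()` offers the same metrics, although it names the Chebyshev distance "maximum".
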
normalize

A boolean parameter indicating whether to normalize weights. Default is TRUE.
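
One plausible reading of weight normalization is rescaling the weights so that they sum to 1. This is an assumption, not something confirmed here, but it matches the example vectors used further down, where `weight2` equals `weight1` divided by its sum:

```r
# Assumed behaviour: normalizing rescales the weights to sum to 1.
weight1 <- c(1, 6, 3)
normalized <- weight1 / sum(weight1)
normalized       # 0.1 0.6 0.3
sum(normalized)  # 1
```

Under that assumption, passing `weight1` with `normalize = TRUE` and passing `c(0.1, 0.6, 0.3)` with `normalize = FALSE` would describe the same weighting.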

labels

A character vector of labels for the graphical output. Default is NULL.

learn

A boolean indicating whether to show detailed algorithm explanations. Default is FALSE.

waiting

A boolean controlling pauses between explanations. Default is TRUE.

Author

Original authors:

Roberto Alcantara roberto.alcantara@edu.uah.es
Juan Jose Cuadrado jjcg@uah.es
Universidad de Alcala de Henares

Details

This function executes the complete hierarchical correlation method in the following steps:

Transforms the data into useful objects
Creates the clusters
Calculates the distance from the target to every cluster using the specified distance metric
Orders the distances in ascending order
Orders the clusters according to their distance from the previous step
Shows the sorted clusters and the distances used
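
The steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: each row of the data is treated as its own cluster, the distance is unweighted Euclidean, and all variable names are chosen for this example.

```r
# Step 1: put the data and the target into convenient objects.
data <- matrix(c(1, 2, 1, 4, 5, 1, 8, 2, 9, 6, 3, 5, 8, 5, 4), ncol = 3)
dataFrame <- data.frame(data)
target <- c(1, 2, 3)

# Step 2 (assumption): every row is one cluster.
clusters <- as.matrix(dataFrame)

# Step 3: Euclidean distance from the target to each cluster.
distances <- apply(clusters, 1, function(row) sqrt(sum((row - target)^2)))

# Step 4: order the distances in ascending order.
ord <- order(distances)  # 1 3 5 2 4

# Step 5: order the clusters by those distances.
sorted_clusters <- dataFrame[ord, ]

# Step 6: show the sorted clusters next to the distances used.
print(cbind(sorted_clusters, distance = distances[ord]))
```

The first row of the data, (1, 1, 3), lies at distance 1 from the target and is therefore listed first.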

Examples

data <- matrix(c(1,2,1,4,5,1,8,2,9,6,3,5,8,5,4), ncol=3)
dataFrame <- data.frame(data)
target1 <- c(1,2,3)
target2 <- dataFrame[1,]
weight1 <- c(1,6,3)
weight2 <- c(0.1,0.6,0.3)

# Basic usage
correlation_clustering(dataFrame, target1)

# With weights
correlation_clustering(dataFrame, target1, weight1)

# Without weight normalization
correlation_clustering(dataFrame, target1, weight1, normalize = FALSE)

# Using Canberra distance with weights
correlation_clustering(dataFrame, target1, weight2, distance_method = "canberra", normalize = FALSE)

# With detailed explanations
correlation_clustering(dataFrame, target1, learn = TRUE)