UAHDataScienceUC (version 1.0.1)

correlation_clustering: Hierarchical Correlation Clustering

Description

Performs hierarchical correlation clustering: the data are grouped into clusters, and the clusters are then ranked by their distance to a target. Optional weights and a choice of distance metric control how that distance is computed.

Usage

correlation_clustering(
  data,
  target = NULL,
  weight = c(),
  distance_method = "euclidean",
  normalize = TRUE,
  labels = NULL,
  learn = FALSE,
  waiting = FALSE
)

Value

An R object containing:

  • dendrogram - A hierarchical clustering dendrogram

  • sortedValues - A data frame with the sorted cluster values

  • distances - A data frame with the sorted distances

Arguments

data

A data frame containing the main data

target

A data frame, numeric vector, or matrix to use as the correlation target. Default is NULL.

weight

A numeric vector of weights. Default is an empty vector.

distance_method

A string specifying the distance metric to use. Options are:

  • "euclidean" - Euclidean distance

  • "manhattan" - Manhattan distance

  • "canberra" - Canberra distance

  • "chebyshev" - Chebyshev distance
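To make the four options concrete, here is a minimal sketch of each metric written in base R. These hand-written helpers (euclidean, manhattan, canberra, chebyshev) are illustrative assumptions and are not taken from the package, whose internal implementation may differ. Note that base R's dist() calls the Chebyshev distance "maximum".

```r
# Illustrative helpers only; not the package's internal code.
euclidean <- function(x, y) sqrt(sum((x - y)^2))   # straight-line distance
manhattan <- function(x, y) sum(abs(x - y))        # sum of absolute differences
canberra  <- function(x, y) {
  denom <- abs(x) + abs(y)
  terms <- abs(x - y) / denom
  sum(terms[denom != 0])                           # skip 0/0 terms
}
chebyshev <- function(x, y) max(abs(x - y))        # largest single difference

x <- c(1, 2, 3)
y <- c(4, 0, 3)

euclidean(x, y)   # sqrt(13)
manhattan(x, y)   # 5
canberra(x, y)    # 3/5 + 2/2 + 0/6 = 1.6
chebyshev(x, y)   # 3
```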

normalize

A boolean parameter indicating whether to normalize weights. Default is TRUE.
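The page does not say how weights are normalized. A common convention, assumed here, is to rescale them so that they sum to 1. The helper normalize_weights below is hypothetical and only illustrates that assumption.

```r
# Assumption: normalization divides each weight by the sum of all weights.
# normalize_weights is a hypothetical helper, not part of the package.
normalize_weights <- function(w) w / sum(w)

weight1 <- c(1, 6, 3)
normalize_weights(weight1)   # 0.1 0.6 0.3
```

Under this assumption, weight1 from the Examples section normalizes to the same values as weight2, which would explain why weight2 is passed with normalize = FALSE there.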

labels

A string vector for graphical solution labeling. Default is NULL.

learn

A boolean indicating whether to show detailed algorithm explanations. Default is FALSE.

waiting

A boolean controlling pauses between explanations. Default is FALSE.

Author

Original authors:

Details

This function executes the complete hierarchical correlation method in the following steps:

  1. Transforms the data into useful objects

  2. Creates the clusters

  3. Calculates the distance from the target to every cluster using the specified distance metric

  4. Orders the distances in ascending order

  5. Orders the clusters according to their distance from the previous step

  6. Shows the sorted clusters and the distances used
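Steps 3 to 5 above can be sketched in base R as follows. This is a simplified illustration, not the package's code: it assumes each row of the data is its own cluster and uses an unweighted Euclidean distance. The variable names (distances, ord, sorted) are chosen for this example only.

```r
# Simplified illustration: each row is treated as one cluster (an assumption).
data <- matrix(c(1,2,1,4,5,1,8,2,9,6,3,5,8,5,4), ncol = 3)
dataFrame <- data.frame(data)
target <- c(1, 2, 3)

# Step 3: Euclidean distance from the target to every row
distances <- apply(as.matrix(dataFrame), 1,
                   function(row) sqrt(sum((row - target)^2)))

# Step 4: indices that put the distances in ascending order
ord <- order(distances)   # 1 3 5 2 4

# Step 5: reorder the rows to match, keeping the distance alongside
sorted <- dataFrame[ord, ]
sorted$distance <- distances[ord]
sorted
```

The first row of sorted is the row closest to the target, and the last row is the farthest.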

Examples

data <- matrix(c(1,2,1,4,5,1,8,2,9,6,3,5,8,5,4), ncol=3)
dataFrame <- data.frame(data)
target1 <- c(1,2,3)
target2 <- dataFrame[1,]
weight1 <- c(1,6,3)
weight2 <- c(0.1,0.6,0.3)

# Basic usage
correlation_clustering(dataFrame, target1)

# With weights
correlation_clustering(dataFrame, target1, weight1)

# Without weight normalization
correlation_clustering(dataFrame, target1, weight1, normalize = FALSE)

# Using Canberra distance with weights
correlation_clustering(dataFrame, target1, weight2, distance_method = "canberra", normalize = FALSE)

# With detailed explanations
correlation_clustering(dataFrame, target1, learn = TRUE)
