Clustering (version 1.7.3)

execute_package_parallel: Evaluate clustering algorithms.

Description

Method that evaluates clustering algorithms on datasets read from a file directory or supplied as a data frame.

Usage

execute_package_parallel(
  directory_files,
  df,
  algorithms_execute,
  measures_execute,
  cluster_min,
  cluster_max,
  metrics_execute,
  attributes,
  number_algorithms,
  numberClusters,
  numberDataSets,
  is_metric_external,
  is_metric_internal,
  name_dataframe
)

Arguments

directory_files

string with the path to the directory where the datasets are located.

df

data matrix or data frame, or dissimilarity matrix, depending on the value of the argument.

algorithms_execute

character vector with the algorithms to be executed. The algorithms implemented are: fuzzy_cm, fuzzy_gg, fuzzy_gk, hclust, apclusterK, agnes, clara, daisy, diana, fanny, mona, pam, gmm, kmeans_arma, kmeans_rcpp, mini_kmeans, pvclust.
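
As an illustration, the following base R sketch checks a requested vector against the names listed above before it is passed as algorithms_execute. The helper check_algorithms is hypothetical and not part of the package, and treating number_algorithms as the length of the vector is an assumption.

```r
# Algorithm names copied from the argument description above.
supported_algorithms <- c(
  "fuzzy_cm", "fuzzy_gg", "fuzzy_gk", "hclust", "apclusterK", "agnes",
  "clara", "daisy", "diana", "fanny", "mona", "pam", "gmm",
  "kmeans_arma", "kmeans_rcpp", "mini_kmeans", "pvclust"
)

# Hypothetical helper (not part of the package): stop on unknown names.
check_algorithms <- function(algorithms_execute) {
  unknown <- setdiff(algorithms_execute, supported_algorithms)
  if (length(unknown) > 0) {
    stop("Unsupported algorithms: ", paste(unknown, collapse = ", "))
  }
  invisible(algorithms_execute)
}

requested <- c("pam", "kmeans_rcpp", "hclust")
check_algorithms(requested)

# Assumption: number_algorithms is simply the number of requested algorithms.
number_algorithms <- length(requested)
```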

measures_execute

character array with the dissimilarity measures to be executed. Which measures are available depends on the algorithm; examples include Euclidean and Manhattan.

cluster_min

minimum number of clusters.

cluster_max

maximum number of clusters. cluster_max must be greater than or equal to cluster_min.
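
The constraint on cluster_min and cluster_max can be checked up front with base R. This is a minimal sketch with example values; that every integer in the range is evaluated is an assumption, not something this page states.

```r
# Example values chosen for illustration only.
cluster_min <- 3
cluster_max <- 5

# Documented constraint: cluster_max must be >= cluster_min.
stopifnot(
  is.numeric(cluster_min), is.numeric(cluster_max),
  cluster_min >= 1, cluster_max >= cluster_min
)

# Assumption: each cluster count in the closed range is tried.
cluster_counts <- seq(cluster_min, cluster_max)
print(cluster_counts)
```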

metrics_execute

character array defining the metrics to be executed. The nine metrics implemented are: entropy, variation_information, precision, recall, f_measure, fowlkes_mallows_index, connectivity, dunn, silhouette.

attributes

accepts Boolean values. If TRUE, the result shows the attribute that behaves best; otherwise, it shows the value of the executed metric.

number_algorithms

It's a numeric field with the number of algorithms.

numberClusters

It's a numeric field with the difference between clusters.

numberDataSets

It's a numeric field with the number of datasets.

is_metric_external

boolean field to indicate whether to run external metrics.

is_metric_internal

boolean field to indicate whether to run internal metrics.
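
One way to derive the two flags from metrics_execute is sketched below in base R. The grouping follows the usual meaning of each index (external metrics compare against known labels, internal metrics use only the data) and is an assumption; the package may set is_metric_external and is_metric_internal differently.

```r
# Assumed grouping of the nine documented metrics.
external_metrics <- c("entropy", "variation_information", "precision",
                      "recall", "f_measure", "fowlkes_mallows_index")
internal_metrics <- c("connectivity", "dunn", "silhouette")

# Example request mixing both kinds of metric.
metrics_execute <- c("precision", "recall", "silhouette")

# A flag is TRUE when at least one requested metric belongs to that group.
is_metric_external <- any(metrics_execute %in% external_metrics)
is_metric_internal <- any(metrics_execute %in% internal_metrics)
```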

name_dataframe

name of the data frame, used when df is supplied.

Value

returns a list with the result matrix obtained by evaluating the data with the indicated algorithms, metrics and number of clusters.