Note: a newer version (1.7.7) of this package is available.

Clustering (version 1.7.6)

Techniques for Evaluating Clustering

Description

This package runs clustering algorithms from several different packages and compares their results, to determine which algorithm behaves best for the data provided.
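
The idea of running several algorithms on the same data and ranking them by a validation metric can be sketched independently of this package. The following is only a minimal illustration, assuming the `cluster` package (a recommended package shipped with R) for `pam()` and `silhouette()`; the data set (`iris`), the choice of three clusters, and the helper name `mean_silhouette` are assumptions made for the example and are not part of Clustering's own API.

```r
library(cluster)  # provides pam() and silhouette()

x <- scale(iris[, 1:4])  # numeric features only, standardised
d <- dist(x)             # Euclidean distance matrix
k <- 3                   # number of clusters (assumed for this sketch)

set.seed(42)             # kmeans() uses random starting centres
candidates <- list(
  kmeans         = kmeans(x, centers = k, nstart = 10)$cluster,
  hclust_average = cutree(hclust(d, method = "average"), k = k),
  pam            = pam(x, k = k, cluster.only = TRUE)
)

# Hypothetical helper: average silhouette width of one labelling.
mean_silhouette <- function(labels, d) {
  mean(silhouette(labels, d)[, "sil_width"])
}

# One internal-validation score per algorithm, best first.
scores <- sapply(candidates, mean_silhouette, d = d)
sort(scores, decreasing = TRUE)
```

A higher mean silhouette width suggests tighter, better separated clusters, so the first entry of the sorted vector is the algorithm that behaves best on this data under this one metric.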

Install

install.packages('Clustering')

Monthly Downloads

730

Version

1.7.6

License

GPL (>= 2)

Maintainer

Luis Perez Martos

Last Published

April 19th, 2022

Functions in Clustering (1.7.6)

algorithm_apcluster

apcluster package algorithms
best_ranked_internal_metrics

Best rated internal metrics.
algorithms

Method that returns the list of used algorithms
apclusterK_euclidean

Method that runs the apclusterK algorithm using the Euclidean metric to make an external or internal validation of the cluster.
calculate_validation_internal_by_metrics

Method that calculates which algorithm behaves best for the datasets provided.
calculate_validation_external_by_metrics

Method that calculates which algorithm behaves best for the datasets provided.
algorithm_pvclust

pvclust package algorithms
dataframe_by_metrics_evaluation

Method to filter only the external measurement columns
convert_numeric_matrix

Method that converts a matrix into numerical format.
algorithms_package

Method that returns all the algorithms executed by the package
convert_table

Method in charge of creating a table from an array with the values of the variable used as a sample and another with the classification of the values.
calculate_best_validation_internal_by_metrics

Method that calculates which algorithm and which metric behave best for the datasets provided.
agnes_manhattan_method

Method that runs the agnes algorithm using the Manhattan metric to make an external or internal validation of the cluster.
calculate_connectivity

Method to calculate the Connectivity
external_validation

Method that applies different external metrics to a data frame or matrix, for example precision and recall.
clara_euclidean_method

Method that runs the clara algorithm using the Euclidean metric to make an external or internal validation of the cluster.
fanny_euclidean_method

Method that runs the fanny algorithm using the Euclidean metric to make an external or internal validation of the cluster.
calculate_best_external_variables_by_metrics

Method that calculates the best rated external metrics.
bolts

Data from an experiment on the effects of machine adjustments on the time to count bolts.
diana_euclidean_method

Method that runs the diana algorithm using the Euclidean metric to make an external or internal validation of the cluster.
clara_manhattan_method

Method that runs the clara algorithm using the Manhattan metric to make an external or internal validation of the cluster.
dunn_metric

Method to calculate the Dunn index.
algorithm_advclust

Advclust package algorithms
information_external

Method that returns an array with the external information of the cluster
entropy_formula

Method for calculating entropy.
entropy_metric

Method to calculate the entropy.
agnes_euclidean_method

Method that runs the agnes algorithm using the Euclidean metric to make an external or internal validation of the cluster.
fowlkes_mallows_index_metric

Method to calculate the Fowlkes-Mallows index.
apclusterK_manhattan

Method that runs the apclusterK algorithm using the Manhattan metric to make an external or internal validation of the cluster.
apclusterK_minkowski

Method that runs the apclusterK algorithm using the Minkowski metric to make an external or internal validation of the cluster.
fmeasure_metric

Method to calculate the F-measure.
aggExCluster_euclidean

Method that runs the aggExCluster algorithm using the Euclidean metric to make an external or internal validation of the cluster.
daisy_gower_method

Method that runs the daisy algorithm using the Gower metric to make an external or internal validation of the cluster.
detect_definition_attribute

Method in charge of detecting the limit of a dataset header.
daisy_manhattan_method

Method that runs the daisy algorithm using the Manhattan metric to make an external or internal validation of the cluster.
algorithm_clusterr

ClusterR package algorithms
algorithm_cluster

cluster package algorithms
appClustering

Clustering GUI.
calculate_best_validation_external_by_metrics

Method that calculates which algorithm and which metric behave best for the datasets provided.
clustering

Clustering algorithm.
export_file_internal

Export result of internal metrics in latex.
evaluate_validation_internal_by_metrics

Evaluate internal validations by algorithm.
connectivity_metric

Method to calculate the connectivity.
measure_amap

Metrics of the amap algorithm
measure_advclust

Metrics of the advclust algorithm
calculate_best_internal_variables_by_metrics

Method that calculates the best rated internal metrics.
execute_datasets

Evaluation clustering algorithm.
information_internal

Method that returns an array with the internal information of the cluster
internal_validation

Method that applies different internal metrics to a data frame or matrix, for example Dunn and connectivity.
is_External_Metrics

Method that checks for external metrics
measure_pvclust

Metrics of the pvclust algorithm
measure_package

Method that returns all the measures executed by the package
pvpick_method

Method that runs the pvpick algorithm using an external or internal validation of the cluster.
metrics_calculate

Method in charge of verifying the implemented metrics
evaluate_best_validation_internal_by_metrics

Evaluates algorithms by measures of dissimilarity based on a metric.
silhouette_metric

Method to calculate the silhouette.
pvclust_euclidean_method

Method that runs the pvclust algorithm using the Euclidean metric to make an external or internal validation of the cluster.
sort.clustering

Returns the clustering result sorted by a set of metrics.
fanny_manhattan_method

Method that runs the fanny algorithm using the Manhattan metric to make an external or internal validation of the cluster.
gmm_manhattan_method

Method that runs the gmm algorithm using the Manhattan metric to make an external or internal validation of the cluster.
basketball

This data set contains a series of statistics (5 attributes) about 96 basketball players.
extension_file

Method that returns the extension of a file.
fill_cluster_vector

Method that fills a vector.
evaluate_validation_external_by_metrics

Evaluate external validations by algorithm.
hclust_euclidean

Method that runs the hcluster algorithm using the Euclidean metric to make an external or internal validation of the cluster.
metrics_external

Method that returns the list of used external metrics
gmm_euclidean_method

Method that runs the gmm algorithm using the Euclidean metric to make an external or internal validation of the cluster.
precision_metric

Method to calculate the precision.
fuzzy_gk_method

Method that runs the fuzzy_gk algorithm using the Euclidean metric to make an external or internal validation of the cluster.
kmeans_rcpp_method

Method that runs the kmeans_rcpp algorithm using the Euclidean metric to make an external or internal validation of the cluster.
fuzzy_cm_method

Method that runs the fuzzy_cm algorithm using the Euclidean metric to make an external or internal validation of the cluster.
is_Internal_Metrics

Method that checks for internal metrics
fuzzy_gg_method

Method that runs the fuzzy_gg algorithm using the Euclidean metric to make an external or internal validation of the cluster.
metrics_internal

Method that returns the list of used internal metrics
plot_clustering

Graphic representation of the evaluation measures.
path_dataset

Method that returns a list of the files that exist in a directory.
row_name_df_internal

Method in charge of obtaining those metrics that are internal from those indicated.
max_value_metric

Method that returns the maximum value of a metric.
calculate_dunn

Method to calculate the Dunn index.
row_name_df_external

Method in charge of obtaining those metrics that are external from those indicated.
metrics_validate

Method that returns the list of used metrics
convert_toOrdinal

Method to convert columns to ordinal.
calculate_result

Method that returns the value or variable depending on where it is in the calculated metrics.
kmeans_arma_method

Method that runs the kmeans_arma algorithm using the Euclidean metric to make an external or internal validation of the cluster.
measure_cluster

Metrics of the cluster algorithm
execute_package_parallel

Evaluation clustering algorithm.
evaluate_all_column_dataset

Method in charge of calculating the average for all datasets using all the algorithms defined in the application.
number_columnas_external

Method that returns how many external metrics there are in the array of metrics used in the calculation
evaluate_best_validation_external_by_metrics

Evaluates algorithms by measures of dissimilarity based on a metric.
daisy_euclidean_method

Method that runs the daisy algorithm using the Euclidean metric to make an external or internal validation of the cluster.
measure_clusterr

Metrics of the ClusterR algorithm
show_result_external_algorithm_by_metric

Method that returns a table with the algorithm and the metric indicated as parameters.
refactorName

Method for refactoring the distance measurement name.
show_result_internal_algorithm_by_metric

Method that returns a table with the algorithm and the metric indicated as parameters.
export_file_external

Export result of external metrics in latex.
show_result_internal_algorithm_group_by_clustering

Method in charge of obtaining a table with the results of the algorithms grouped by clusters, calculating the maximum value of each internal metric.
resultClustering

Method for filtering clustering results.
initializeExternalValidation

Method that returns a list of external validation values initialized to zero.
number_columnas_internal

Method that returns how many internal metrics there are in the array of metrics used in the calculation
initializeInternalValidation

Method that returns a list of internal validation values initialized to zero.
pvclust_correlation_method

Method that runs the pvclust algorithm using the Correlation metric to make an external or internal validation of the cluster.
recall_metric

Method to calculate the recall.
read_file

Method that converts a dataset into a matrix
transform_dataset_internal

Method for filtering internal columns of a dataset.
show_result_external_algorithm_group_by_clustering

Method in charge of obtaining a table with the results of the algorithms grouped by clusters, calculating the maximum value of each external metric.
transform_dataset

Method for filtering external columns of a dataset.
mini_kmeans_method

Method that runs the mini_kmeans algorithm using the Euclidean metric to make an external or internal validation of the cluster.
measure_calculate

Method that returns all the measures executed by the package from the indicated algorithms
measure_apcluster

Metrics of the apcluster algorithm
result_external_algorithm_by_metric

External results by algorithm.
result_internal_algorithm_by_metric

Internal results by algorithm
pam_manhattan_method

Method that runs the pam algorithm using the Manhattan metric to make an external or internal validation of the cluster.
pam_euclidean_method

Method that runs the pam algorithm using the Euclidean metric to make an external or internal validation of the cluster.
mona_method

Method that runs the mona algorithm using external or internal validation of the cluster.
stulong

The study was performed at the 2nd Department of Medicine, 1st Faculty of Medicine of Charles University and Charles University Hospital. The data were transferred to electronic form by the European Centre of Medical Informatics, Statistics and Epidemiology of Charles University and Academy of Sciences.
number_variables_dataset

Method that returns the number of variables in a dataset directory
[.clustering

Filter metrics in a clustering object returning a new clustering object.
specify_decimal

Method that formats a number with four digits.
packages

Method that returns the list of used packages
variation_information_metric

Method to calculate the variation information.
stock

The data provided are daily stock prices from January 1988 through October 1991, for ten aerospace companies.
weather

One of the best-known test data sets in machine learning. This data set describes several situations where the weather is suitable or not to play sports, depending on the current outlook, temperature, humidity and wind.
algorithm_amap

amap package algorithms
best_ranked_external_metrics

Best rated external metrics.
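
Several entries above refer to external metrics such as precision, recall and the Fowlkes-Mallows index, which compare a clustering against known class labels. As a rough illustration of what such metrics measure, the base R sketch below uses the pair-counting formulation, where a "true positive" is a pair of observations placed together both in the reference labelling and in the clustering. This is one common definition and is assumed here for the example; the package's own functions may compute these metrics differently. The function names `pair_counts` and `external_scores` are hypothetical and do not belong to the package.

```r
# Count pairs of observations that are grouped together in the reference
# labels (truth), in the clustering (pred), or in both.
pair_counts <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  tab <- table(truth, pred)                   # contingency table
  choose2 <- function(n) n * (n - 1) / 2      # number of pairs among n items
  tp         <- sum(choose2(tab))             # together in both labellings
  same_pred  <- sum(choose2(colSums(tab)))    # together in the clustering
  same_truth <- sum(choose2(rowSums(tab)))    # together in the reference
  list(tp = tp, fp = same_pred - tp, fn = same_truth - tp)
}

# Pair-counting precision, recall and Fowlkes-Mallows index.
external_scores <- function(truth, pred) {
  pc <- pair_counts(truth, pred)
  precision <- pc$tp / (pc$tp + pc$fp)
  recall    <- pc$tp / (pc$tp + pc$fn)
  c(precision = precision,
    recall = recall,
    fowlkes_mallows = sqrt(precision * recall))
}

# Toy example: six observations, two reference classes, two clusters.
truth <- c("a", "a", "a", "b", "b", "b")
pred  <- c(1, 1, 2, 2, 2, 2)
external_scores(truth, pred)
```

Under this definition the Fowlkes-Mallows index is the geometric mean of pair-counting precision and recall, so it is high only when the clustering neither merges distinct classes nor splits a class apart.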