sprint (version 1.0.7)

ppam: Parallel Partitioning Around Medoids

Description

Parallel implementation of the Partitioning Around Medoids (PAM) algorithm, based on the serial pam() function from the cluster package.

Usage

ppam(x, k, medoids = NULL, is_dist = inherits(x, "dist"), cluster.only = FALSE, do.swap = TRUE, trace.lev = 0)

Arguments

x
input data, either a 2D array or an ff object
k
positive integer specifying the number of clusters
medoids
vector of the 'k' initial medoids, or NULL to let the algorithm select the initial medoids
is_dist
boolean, whether the input data is a distance (dissimilarity) matrix or a symmetric matrix
cluster.only
boolean, whether only the clustering is computed and returned
do.swap
boolean, whether the swap phase of the algorithm should be run
trace.lev
non-negative integer specifying the level of detail returned for diagnostics (the default, 0, is the lowest level)
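
The sketch below previews what the 'medoids', 'cluster.only' and 'do.swap' arguments do. It uses the serial pam() function from the cluster package as a stand-in, on the assumption that these arguments behave the same way in ppam(); it is not a demonstration of the parallel function itself, and the toy data are made up for illustration.

```r
# Serial stand-in: cluster::pam() shares k, medoids, cluster.only and
# do.swap with ppam(). This is NOT the parallel function.
library(cluster)

set.seed(1)
# Toy data (hypothetical): two well-separated groups of 5 points each.
x <- rbind(matrix(rnorm(10, mean = 0, sd = 0.1), ncol = 2),
           matrix(rnorm(10, mean = 5, sd = 0.1), ncol = 2))
d <- dist(x)  # a "dist" object holding pairwise Euclidean distances

# Full result object; the algorithm selects the initial medoids.
fit <- pam(d, k = 2)

# User-supplied initial medoids (observation indices) with the swap phase
# disabled, so the supplied medoids are kept as the final medoids.
fit_fixed <- pam(d, k = 2, medoids = c(1, 6), do.swap = FALSE)

# Return only the vector of cluster assignments.
labels <- pam(d, k = 2, cluster.only = TRUE)
```

Supplying 'medoids' skips the automatic selection of starting points, and 'cluster.only = TRUE' is useful when only the assignment vector is needed.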

Details

The interface and parameters of the parallel function ppam() are similar, but not identical, to those of the serial function pam(). ppam() requires a distance matrix as input. Although ppam() does not include an option to calculate the distance matrix itself, one can easily be computed with the SPRINT pcor() function by setting its 'distance' parameter to TRUE.
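
The workflow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested recipe: the toy matrix 'expr' is hypothetical, the serial "1 - correlation" distance is only an assumed analogue of what pcor() returns with 'distance = TRUE', and the orientation pcor() expects (rows or columns as the items to compare) should be checked in its own help page. The SPRINT calls are attempted only when the package is installed, and would normally be run under mpiexec.

```r
# Toy input (hypothetical): 6 items described by 8 measurements each.
set.seed(42)
expr <- matrix(rnorm(6 * 8), nrow = 6)

# Serial analogue of the distance step, for inspection only.
# ASSUMPTION: the distance is 1 - Pearson correlation between rows;
# pcor(distance = TRUE) may define or orient it differently.
serial_dist <- 1 - cor(t(expr))

# Parallel path, attempted only if SPRINT is available.
if (requireNamespace("sprint", quietly = TRUE)) {
  library(sprint)
  # Distance matrix computed in parallel (orientation is an assumption).
  dmat <- pcor(expr, distance = TRUE)
  # Cluster into k = 2 groups; all other arguments keep their defaults.
  fit <- ppam(dmat, k = 2)
  # Shut down the SPRINT worker processes.
  pterminate()
}
```

Whichever way the distance matrix is produced, it should be square and symmetric with a zero diagonal before it is passed to ppam().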

N.B. Please see the SPRINT User Guide for how to run the code in parallel using the mpiexec command.

See Also

pam, ff, pcor, SPRINT