
An implementation of Gaussian PD-Clustering (GPDC), an extension of PD-clustering adjusted for cluster size that uses a dissimilarity measure based on the Gaussian density.
GPDC(data = NULL, k = 2, ini = "kmedoids", nr = 5, iter = 100)
An object of class FPDclustering, which is a list with the following components:
A vector of integers indicating the cluster membership for each unit
A matrix of cluster means
A list of K elements, with the variance-covariance matrix per cluster
A matrix of probability of each point belonging to each cluster
The value of the Joint distance function
The number of iterations
The data set
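The components listed above can be inspected directly on a fitted object. The following is a minimal sketch, assuming the function is provided by a package named FPDclustering and that the ais data set ships with it (neither is stated on this page). Only the label component is accessed by name, because it is the only name confirmed by the example code; the remaining names are read from the object with names() rather than assumed.

```r
# Assumption: GPDC and the ais data set come from the FPDclustering package.
library(FPDclustering)

data(ais)
dataSEL <- ais[, c(10, 3, 5, 8)]

res <- GPDC(dataSEL, k = 2, ini = "kmedoids")

class(res)               # class of the returned object
names(res)               # component names as stored in the list
str(res, max.level = 1)  # one-line overview of each component
table(res$label)         # cluster sizes from the membership vector
```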
data: A matrix or data frame such that rows correspond to observations and columns correspond to variables.
k: A numerical parameter giving the number of clusters.
ini: A parameter that selects the center starts. Available options are random ("random"), k-medoids ("kmedoids", the default), and PDC ("PDclust").
nr: Number of random starts when ini is set to "random".
iter: Maximum number of iterations.
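As an illustration of how the arguments combine, the call below switches to random starts, where nr takes effect. It is a sketch under the same assumptions as the page's own example (the FPDclustering package name and the availability of the ais data set are assumed); the seed and the values chosen for nr and iter are arbitrary, not recommendations.

```r
# Assumption: GPDC and the ais data set come from the FPDclustering package.
library(FPDclustering)

data(ais)
dataSEL <- ais[, c(10, 3, 5, 8)]

# Random starts depend on the random number generator, so fix a seed
# to make the run repeatable (the seed value is arbitrary).
set.seed(123)

# nr is only used when ini = "random"; iter caps the number of iterations.
res_random <- GPDC(data = dataSEL, k = 2, ini = "random", nr = 10, iter = 200)

table(res_random$label)
```

With ini = "kmedoids" or ini = "PDclust", nr has no effect according to the argument description above.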
Cristina Tortora and Francesco Palumbo
Tortora C., McNicholas P.D., and Palumbo F. A probabilistic distance clustering algorithm using Gaussian and Student-t multivariate density distributions. SN Computer Science, 1:65, 2020.
Rainey C., Tortora C., and Palumbo F. A parametric version of probabilistic distance clustering. In: Greselin F., Deldossi L., Bagnato L., Vichi M. (eds) Statistical Learning of Complex Data. CLADAG 2017. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham, 33-43, 2019. doi.org/10.1007/978-3-030-21140-0_4
PDC, PDQ
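# Load the package (assumed to be FPDclustering) and the data
library(FPDclustering)
data(ais)
# Select four variables by column position
dataSEL <- ais[, c(10, 3, 5, 8)]
# Clustering with two clusters and k-medoids starts
res <- GPDC(dataSEL, k = 2, ini = "kmedoids")
# Results: cross-tabulate cluster labels against sex, then plot and summarize
table(res$label, ais$sex)
plot(res)
summary(res)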