parallelpam (version 1.0)

ClassifAsDataFrame: ClassifAsDataFrame

Description

Returns the classification produced by ApplyPAM as an R data frame.

Usage

ClassifAsDataFrame(L, fdist)

Value

Df: data frame with columns PointName, NNPointName and NNDistance. See Details for a description of each column.

Arguments

L

The list returned by ApplyPAM, with fields L$med (the numbers of the medoids) and L$clasif (the classification of each point).

fdist

The binary file containing the symmetric matrix of dissimilarities between points (usually generated by a call to CalcAndWriteDissimilarityMatrix or to CalcAndWriteDissimilarityMatrixDouble).

Details

The data frame has three columns: PointName (the name of each point), NNPointName (the name of the point at the center of the cluster to which PointName belongs) and NNDistance (the distance between the points PointName and NNPointName). Medoids are identified by the fact that PointName and NNPointName are equal or, equivalently, that NNDistance is 0.
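
As a minimal sketch of how a data frame with this layout might be inspected, the snippet below uses base R only. The data frame is a hand-made stand-in with invented point names and distances, not real output of ClassifAsDataFrame. It shows the medoid test described above (PointName equal to NNPointName) and two simple per-cluster summaries.

```r
# Hypothetical stand-in for the output of ClassifAsDataFrame.
# All names and distances below are invented for illustration.
df <- data.frame(
  PointName   = c("rn1", "rn2", "rn3", "rn4", "rn5", "rn6"),
  NNPointName = c("rn2", "rn2", "rn2", "rn5", "rn5", "rn5"),
  NNDistance  = c(0.7, 0.0, 1.2, 0.4, 0.0, 0.9),
  stringsAsFactors = FALSE
)

# Medoids: rows in which the point is its own cluster center
medoids <- df$PointName[df$PointName == df$NNPointName]

# Number of points assigned to each medoid (the medoid itself included)
cluster_sizes <- table(df$NNPointName)

# Mean distance from the points of each cluster to their medoid
mean_dist <- tapply(df$NNDistance, df$NNPointName, mean)

print(medoids)
print(cluster_sizes)
print(mean_dist)
```

Comparing the two name columns avoids relying on an exact floating-point comparison of NNDistance with 0, although the two tests are described above as equivalent.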

Examples

library(parallelpam)
# Synthetic problem: 10 random seed points with coordinates in [0, 20];
# each seed yields 10 noisy copies, made by adding random values in [-0.1, 0.1]
M<-matrix(0,100,500)
rownames(M)<-paste0("rn",c(1:100))
for (i in (1:10))
{
 p<-20*runif(500)
 Rf <- matrix(0.2*(runif(5000)-0.5),nrow=10)
 for (k in (1:10))
 {
  M[10*(i-1)+k,] <- p+Rf[k,]
 }
}
JWriteBin(M,"pamtest.bin",dtype="float",dmtype="full")
CalcAndWriteDissimilarityMatrix("pamtest.bin","pamDL2.bin",distype="L2",restype="float",nthreads=0)
L <- ApplyPAM("pamDL2.bin",10,init_method="BUILD")
df <- ClassifAsDataFrame(L,"pamDL2.bin")
df
# Identification of medoids:
which(df[,3]==0)
# Verify that they are the same as in L$med (possibly in a different order)
L$med
file.remove("pamtest.bin")
file.remove("pamDL2.bin")