
clustMixType (version 0.1-16)

predict.kproto: k prototypes clustering

Description

Predicts k prototypes cluster memberships and distances for new data.

Usage

## S3 method for class 'kproto'
predict(object, newdata, ...)

Arguments

object
Object resulting from a call of kproto.
newdata
New data frame (of the same structure as the training data) for which cluster memberships are to be predicted.
...
Currently not used.

Value

A kmeans-like object of class kproto with components:
cluster
Vector of cluster memberships.
dists
Matrix with distances of observations to all cluster prototypes.

Details

Like k-means, the k-prototypes algorithm iteratively recomputes cluster prototypes and reassigns observations to clusters. Each observation is assigned to its nearest prototype using the distance $d(x,y) = d_{euclid}(x,y) + \lambda d_{simple\,matching}(x,y)$, where $\lambda$ weights the categorical part against the numeric part. Cluster prototypes are computed as cluster means for numeric variables and modes for factors (cf. Huang, 1998).
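The assignment rule above can be sketched in base R for a single new observation. This is only an illustration, not the package's implementation: `mixed_dist` is a hypothetical helper, the prototypes and `lambda` are made-up values, and the numeric part is assumed to be the squared Euclidean distance as in Huang (1998).

```r
# Hypothetical helper (not part of clustMixType): mixed-type distance
# between one observation and one prototype.
mixed_dist <- function(x_num, x_cat, p_num, p_cat, lambda) {
  # numeric part: squared Euclidean distance (assumption, following Huang, 1998)
  d_num <- sum((x_num - p_num)^2)
  # categorical part: simple matching = number of mismatching categories
  d_cat <- sum(x_cat != p_cat)
  d_num + lambda * d_cat
}

# two illustrative prototypes: means for numerics, modes for factors
protos_num <- rbind(c(-1.5, -1.5), c(1.5, 1.5))
protos_cat <- rbind(c("A", "A"), c("B", "B"))

# one new observation and an assumed weight lambda
obs_num <- c(1.0, 2.0)
obs_cat <- c("B", "A")
lambda  <- 1

# distance of the observation to every prototype
dists <- vapply(
  seq_len(nrow(protos_num)),
  function(k) mixed_dist(obs_num, obs_cat, protos_num[k, ], protos_cat[k, ], lambda),
  numeric(1)
)

# predicted membership is the prototype with the smallest distance
cluster <- which.min(dists)
```

Here `dists` plays the role of one row of the `dists` matrix described under Value, and `cluster` the corresponding entry of the `cluster` vector.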

Examples

# generate toy data with factors and numerics

n   <- 100
prb <- 0.9
muk <- 1.5 
clusid <- rep(1:4, each = n)

x1 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x1 <- c(x1, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x1 <- as.factor(x1)

x2 <- sample(c("A","B"), 2*n, replace = TRUE, prob = c(prb, 1-prb))
x2 <- c(x2, sample(c("A","B"), 2*n, replace = TRUE, prob = c(1-prb, prb)))
x2 <- as.factor(x2)

x3 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))
x4 <- c(rnorm(n, mean = -muk), rnorm(n, mean = muk), rnorm(n, mean = -muk), rnorm(n, mean = muk))

x <- data.frame(x1,x2,x3,x4)

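# apply k prototypes (kproto is provided by the clustMixType package)
library(clustMixType)
kpres <- kproto(x, 4)

# predict cluster memberships and distances for the (here: same) data
predicted.clusters <- predict(kpres, x)
head(predicted.clusters$cluster)  # vector of cluster memberships
head(predicted.clusters$dists)    # distances to all cluster prototypes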

