bmrm (version 4.1)

mmc: Convenient wrapper function to solve the max-margin clustering problem on a dataset

Description

Solve the max-margin clustering (MMC) problem from multiple random starting points to avoid being trapped by local minima. Each random starting point is determined by randomly assigning N0 samples to each cluster and solving the resulting multi-class SVM problem.
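The seeding step described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name random_seed_labels is hypothetical and not part of bmrm, and the way mmc() draws its samples internally may differ.

```r
# Hypothetical helper (not part of bmrm): pick N0 random samples per cluster,
# label them 1..k, and leave every other sample unlabeled (NA).
random_seed_labels <- function(n, k = 2L, N0 = 2L, seed = 1L) {
  stopifnot(k * N0 <= n)
  set.seed(seed)                          # one seed = one starting point
  idx <- sample.int(n, k * N0)            # k * N0 distinct row indices
  y <- rep(NA_integer_, n)
  y[idx] <- rep(seq_len(k), each = N0)    # N0 samples for each cluster label
  y
}

# Example: 3 clusters, 2 seed samples each, on the 150 rows of iris
y0 <- random_seed_labels(nrow(iris), k = 3L, N0 = 2L, seed = 5L)
table(y0, useNA = "ifany")
```

The labeled rows would then form the small classification dataset on which the multi-class SVM is trained; all remaining rows are left for the clustering step to assign.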

Usage

mmc(x, k = 2L, N0 = 2L, LAMBDA = 1, seeds = 1:50,
  nrbmArgsSvm = list(maxCP = 10L, MAX_ITER = 100L),
  nrbmArgsMmc = list(maxCP = 20L, MAX_ITER = 300L),
  mc.cores = getOption("mc.cores", 1L), ...)

Arguments

x

numeric matrix representing the dataset (one sample per row)

k

an integer specifying the number of clusters to find

N0

number of instances to randomly assign per cluster when determining a random starting point. The classification dataset they define is used to train a multi-class SVM whose solution is used as the starting point of the current MMC iteration.

LAMBDA

the complexity parameter for nrbm()

seeds

the random seeds to use

nrbmArgsSvm

arguments to nrbm() when solving the multi-class SVM problem

nrbmArgsMmc

arguments to nrbm() when solving the max-margin clustering problem

mc.cores

number of cores to use when running the random iterations in parallel

...

additional arguments are passed to mmcLoss()
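How seeds and mc.cores interact can be illustrated with a generic multi-start pattern. The sketch below does not call bmrm: the objective f and the helper run_once are hypothetical stand-ins, and it assumes the run with the smallest objective value is the one retained, which this page does not state explicitly.

```r
library(parallel)

# Toy non-convex objective with several local minima (stand-in for the MMC loss)
f <- function(w) sin(3 * w) + 0.1 * w^2

# One run: draw a seed-dependent starting point, then optimize locally
run_once <- function(seed) {
  set.seed(seed)
  w0 <- runif(1, -5, 5)
  fit <- optim(w0, f, method = "BFGS")
  list(seed = seed, w = fit$par, value = fit$value)
}

seeds <- 1:10
# One independent run per seed; mc.cores > 1 is not supported on Windows
runs <- mclapply(seeds, run_once, mc.cores = getOption("mc.cores", 1L))

# Keep the run that reached the lowest objective value
values <- vapply(runs, function(r) r$value, numeric(1))
best <- runs[[which.min(values)]]
best$seed
```

Because each run depends only on its seed, the runs are independent and can be distributed across cores without changing the result.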

Value

the MMC model matrix

Examples

# NOT RUN {
   # -- Prepare a 2D dataset to cluster with an intercept
   x <- cbind(intercept=100,scale(data.matrix(iris[c(1,3)]),center=TRUE,scale=FALSE))

   # -- Find max-margin clusters
   y <- mmc(x,k=3,LAMBDA=0.001,minClusterSize=10,seeds=5)
   table(y,iris$Species)
   
   # -- Plot the dataset and the MMC decision boundaries
   gx <- seq(min(x[,2]),max(x[,2]),length=100)
   gy <- seq(min(x[,3]),max(x[,3]),length=100)
   Y <- outer(gx,gy,function(a,b){predict(y,cbind(100,a,b))})
   # Axis labels: columns 2 and 3 of x are the plotted features (column 1 is the intercept)
   image(gx,gy,Y,asp=1,main="MMC clustering",xlab=colnames(x)[2],ylab=colnames(x)[3])
   points(x[,-1],pch=19+y)
# }
