Kernel mean shift clustering for 2- to 6-dimensional data.

```
kms(x, y, H, max.iter=400, tol.iter, tol.clust, min.clust.size, merge=TRUE,
keep.path=FALSE, verbose=FALSE)
```# S3 method for kms
summary(object, ...)

x

matrix of data values

y

matrix of candidate data values for which the mean shift will
estimate their cluster labels. If missing, `y=x`

.

H

bandwidth matrix/scalar bandwidth. If missing,
`Hpi(x,deriv,order=1)`

is called by default.

max.iter

maximum number of iterations. Default is 400.

tol.iter

distance under which two successive iterations are
considered convergent. Default is 0.001*min marginal IQR of `x`

.

tol.clust

distance under which two cluster modes are considered
to form one cluster. Default is 0.01*max marginal IQR of `x`

.

min.clust.size

minimum cluster size (cardinality). Default is `0.01*nrow(y)`

.

merge

flag to merge clusters which are smaller than
`min.clust.size`

. Default is TRUE.

keep.path

flag to store the density gradient ascent paths. Default is FALSE.

verbose

flag to print out progress information. Default is FALSE.

object

object of class `kms`

...

other parameters

A kernel mean shift clusters set is an object of class `kms`

which is a list with fields:

data points - same as input

matrix of final iterates starting from `y`

bandwidth matrix

vector of cluster labels

number of clusters

frequency table of cluster labels

matrix of cluster modes

variable names

tuning parameter values - same as input

list of density gradient ascent paths where `path[[i]]`

is
the path of `y[i,]`

(only if `keep.path=TRUE`

)

Mean shift clustering belongs to the class of modal or density-based clustering methods. The mean shift recurrence of the candidate point \({\bold x}\) is \({\bold x}_{j+1} = {\bold x}_j + \bold{{\rm H}} {\sf D} \hat{f}({\bold x}_j)/\hat{f}({\bold x}_j)\) where \(j\geq 0\) and \({\bold x}_0 = {\bold x}\). The sequence \(\{{\bold x}_0, {\bold x}_1, \dots \}\) follows the density gradient ascent paths to converge to a local mode of the density estimate \(\hat{f}\). Hence \({\bold x}\) is iterated until it converges to its local mode, and this determines its cluster label.

The mean shift recurrence is terminated if successive iterations are
less than `tol.iter`

or the maximum number of iterations
`max.iter`

is reached. Final iterates which are less than
`tol.clust`

distance apart are considered to form a single
cluster. If `merge=TRUE`

then the clusters whose cardinality is less
than `min.clust.size`

are iteratively merged with their nearest cluster.

If the bandwidth `H`

is missing, then
the default bandwidth is the plug-in selector for the density gradient
`Hpi(x,deriv.order=1)`

. Any bandwidth that is suitable for the
density gradient is also suitable for the mean shift.

Chacon, J.E. & Duong, T. (2013) Data-driven density estimation, with
applications to nonparametric clustering and bump hunting. *Electronic
Journal of Statistics *, **7**, 499-532.

Comaniciu, D. & Meer, P. (2002). Mean shift: a robust approach
toward feature space analysis. * IEEE Transactions on Pattern
Analysis and Machine Intelligence*, **24**, 603-619.

```
# NOT RUN {
library(MASS)
data(crabs)
kms.crabs <- kms(x=crabs[,c("FL", "CW")])
plot(kms.crabs$x, col=kms.crabs$label)
summary(kms.crabs)
# }
```

Run the code above in your browser using DataCamp Workspace