shipunov (version 1.13)

SM.dist: Simple Match distance

Description

Calculates the simple match distance among the rows of a data set.

Usage

SM.dist(data, zeroes=TRUE, cut=FALSE)

Arguments

data

Matrix (or data frame) with variables that should be used in the computation of the distance between rows.

zeroes

If FALSE (not the default), zeroes are ignored. For binary data, the result will then be close to the asymmetric binary distance ('dist(..., method="binary")').

cut

If TRUE (not the default), an attempt is made to discretize all numeric columns, with the number of breaks defaulting to that of hist(); zeroes are preserved.
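
The kind of discretization that 'cut=TRUE' describes can be pictured with base R alone. This is a minimal sketch, assuming the breaks are taken from the hist() defaults; it is not the code SM.dist() runs, and it does not reproduce the special handling of zeroes. The names 'breaks' and 'codes' are illustrative only.

```r
# Sketch only: bin one numeric column using the default hist() breaks
x <- iris$Sepal.Length

# hist() chooses the breaks; plot=FALSE suppresses drawing
breaks <- hist(x, plot = FALSE)$breaks

# cut() assigns each value to a bin; as.integer() turns bins into codes
codes <- as.integer(cut(x, breaks = breaks, include.lowest = TRUE))

table(codes)
```

After this step every value is a small integer code, so two rows "match" in that column when they fall into the same bin.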

Value

Distance object with distances among rows of 'data'

Details

If the argument is a data frame, SM.dist() internally converts it into a matrix. Character values, if present, are converted column-wise to factors and then to integers.
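
The character-to-integer conversion can be illustrated with base R. This is a sketch under stated assumptions, not the internal code of SM.dist(): 'df', 'to_int' and 'm' are hypothetical names, and the toy data frame is invented for the illustration.

```r
# Toy data frame with one character and one numeric column
df <- data.frame(colour = c("red", "blue", "red"),
                 size = c(1, 2, 2),
                 stringsAsFactors = FALSE)

# Character columns become factors, then integer codes;
# other columns are passed through unchanged
to_int <- function(col) {
  if (is.character(col)) as.integer(factor(col)) else col
}

# sapply() returns a numeric matrix with one column per variable
m <- sapply(df, to_int)
m
```

Factor levels are sorted, so "blue" is coded as 1 and "red" as 2. The codes only mark which values are equal; their order carries no meaning for a matching distance.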

SM.dist() ignores NAs when computing the distance values, and treats zeroes the same way if 'zeroes=FALSE'.
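
To make the role of NAs concrete, here is a minimal sketch of a pairwise simple matching distance in base R. It assumes the usual definition (the share of mismatching positions among positions where both values are present) and covers only the default 'zeroes=TRUE' case. 'sm_pair' is a hypothetical helper, not a function of the package, and its results are not guaranteed to equal the output of SM.dist().

```r
# Hypothetical helper: simple matching distance between two vectors,
# skipping every position where either value is NA
sm_pair <- function(x, y) {
  ok <- !is.na(x) & !is.na(y)      # positions usable for comparison
  if (!any(ok)) return(NA_real_)   # nothing to compare
  mean(x[ok] != y[ok])             # proportion of mismatches
}

mm <- rbind(c(1, 0, 0), c(1, NA, 1), c(1, 1, 0))

d12 <- sm_pair(mm[1, ], mm[2, ])   # second position is skipped (NA)
d13 <- sm_pair(mm[1, ], mm[3, ])   # all three positions are compared
c(d12, d13)
```

Rows 1 and 2 share two usable positions and differ in one of them; rows 1 and 3 share three and differ in one.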

See Also

dist

Examples

# NOT RUN {
(mm <- rbind(c(1, 0, 0), c(1, NA, 1), c(1, 1, 0)))
SM.dist(mm)
SM.dist(mm, zeroes=FALSE)
dist(mm, method="binary")

ii <- cluster::pam(SM.dist(sapply(iris[, -5], round)), k=3)
Misclass(ii$clustering, iris$Species, best=TRUE)

i2 <- cluster::pam(SM.dist(iris), k=3) # SM.dist() "consumes" all types of data
Misclass(i2$clustering, iris$Species, best=TRUE)

# }
