Matrix Distance/Similarity Computation
These functions compute and return the auto-distance/similarity matrix between either rows or columns of a matrix/data frame, or a list, as well as the cross-distance matrix between two matrices/data frames/lists.
dist(x, y = NULL, method = NULL, ..., diag = FALSE, upper = FALSE, by_rows = TRUE, auto_convert = TRUE)
simil(x, y = NULL, method = NULL, ..., diag = FALSE, upper = FALSE, by_rows = TRUE, auto_convert = TRUE)
as.dist(x, FUN = NULL)
as.simil(x, FUN = NULL)
Missing values are allowed but are excluded from all computations involving the rows within which they occur. If some columns are excluded in calculating a Euclidean, Manhattan, Canberra or Minkowski distance, the sum is scaled up proportionally to the number of columns used.
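The scaling rule for missing values can be checked by hand. The sketch below uses base R's stats::dist as the reference, since its documentation describes the same rule; it does not call the functions documented on this page, and the matrix and variable names are made up for illustration.

```r
## Two rows, four columns; the third column is unusable because of the NA.
x <- rbind(a = c(1, 2, NA, 4),
           b = c(2, 4, 6, 8))

## Columns in which both rows are observed
ok <- !is.na(x["a", ]) & !is.na(x["b", ])

## Scale factor: total number of columns / number of columns used
scale <- ncol(x) / sum(ok)

## Euclidean: the sum of squares is scaled before taking the square root
manual_euclid <- sqrt(sum((x["a", ok] - x["b", ok])^2) * scale)

## Manhattan: the sum of absolute differences is scaled directly
manual_manhattan <- sum(abs(x["a", ok] - x["b", ok])) * scale

## Reference values from base R
ref_euclid <- as.numeric(stats::dist(x))
ref_manhattan <- as.numeric(stats::dist(x, method = "manhattan"))

all.equal(manual_euclid, ref_euclid)
all.equal(manual_manhattan, ref_manhattan)
```

Here three of the four columns are used, so each sum is multiplied by 4/3.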
Distance measures can be used with simil, and similarity measures with dist. In these cases, the result is transformed accordingly using the specified conversion functions (default: $\mathrm{pr\_simil2dist}(s) = 1 - |s|$ and $\mathrm{pr\_dist2simil}(d) = 1 / (1 + d)$).
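A minimal sketch of such a conversion pair, written as local stand-in functions rather than the package's own. The names to_dist and to_simil are hypothetical, and the mappings 1 - |s| and 1 / (1 + d) are assumed here for illustration.

```r
## Hypothetical stand-ins for a similarity/distance conversion pair.
## Assumed mappings: s -> 1 - |s| and d -> 1 / (1 + d).
to_dist  <- function(s) 1 - abs(s)
to_simil <- function(d) 1 / (1 + d)

s <- c(1, 0.5, -0.5)   # example similarities
d <- c(0, 1, 3)        # example distances

to_dist(s)    # a similarity of 1 maps to a distance of 0
to_simil(d)   # a distance of 0 maps to a similarity of 1
```

Note that the two mappings are not inverses of each other, so converting back and forth does not in general recover the original values.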
Objects of class simil and dist can be converted into one another using as.dist and as.simil, respectively.
Distance and similarity objects can conveniently be subsetted (see examples).
Auto distances/similarities are returned as an object of class dist/simil, and cross-distances/similarities as an object of class crossdist/crosssimil.
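The difference between an auto-distance and a cross-distance result can be illustrated in base R. This is a sketch only: it uses stats::dist and a hand-written helper (euclid, a hypothetical name) instead of the functions documented here, and the data are made up.

```r
## Rows of x and y are the observations.
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
y <- matrix(c(0, 0,
              0, 1), ncol = 2, byrow = TRUE)

## Hypothetical helper: Euclidean distance between two numeric vectors
euclid <- function(a, b) sqrt(sum((a - b)^2))

## Auto-distances: all pairs of rows within x.
## Only the lower triangle is stored, i.e. n * (n - 1) / 2 values.
auto <- stats::dist(x)

## Cross-distances: every row of x against every row of y.
## The result is a full nrow(x) by nrow(y) matrix.
cross <- outer(seq_len(nrow(x)), seq_len(nrow(y)),
               Vectorize(function(i, j) euclid(x[i, ], y[j, ])))

length(auto)
dim(cross)
```

An auto-distance matrix is symmetric with a zero diagonal, which is why a compact representation suffices; a cross-distance matrix has neither property in general.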
See dist in package stats for compatibility information.
### the examples assume that the proxy package is attached
library(proxy)

### show available proximities
summary(pr_DB)

### binary data
x <- matrix(sample(c(FALSE, TRUE), 8, replace = TRUE), ncol = 2)
dist(x, method = "Jaccard")

### for real-valued data
dist(x, method = "eJaccard")

### for positive real-valued data
dist(x, method = "fJaccard")

### cross distances
dist(x, x, method = "Jaccard")

### this is the same but less efficient
as.matrix(stats::dist(x, method = "binary"))

### numeric data
x <- matrix(rnorm(16), ncol = 4)

## test inheritance of names
rownames(x) <- LETTERS[1:4]
colnames(x) <- letters[1:4]
dist(x)
dist(x, x)

## custom distance function
f <- function(x, y) sum(x * y)
dist(x, f)

## working with lists
z <- unlist(apply(x, 1, list), recursive = FALSE)
(d <- dist(z))
dist(z, z)

## subsetting
d[[1:2]]
subset(d, c(1, 3, 4))

## row and column indexes
row.dist(d)
col.dist(d)