# dist

##### Matrix Distance/Similarity Computation

These functions compute and return the auto-distance/similarity matrix between either rows or columns of a matrix/data frame, or a list, as well as the cross-distance matrix between two matrices/data frames/lists.

- Keywords
- cluster

##### Usage

```
dist(x, y = NULL, method = NULL, ..., diag = FALSE, upper = FALSE,
     by_rows = TRUE, auto_convert = TRUE)
simil(x, y = NULL, method = NULL, ..., diag = FALSE, upper = FALSE,
      by_rows = TRUE, auto_convert = TRUE)

pr_dist2simil(x)
pr_simil2dist(x)

as.dist(x, FUN = NULL)
as.simil(x, FUN = NULL)
```

##### Details

The interface is fashioned after `stats::dist`, but it can also compute cross-distances, and it allows user extensions by means of a registry of all proximity measures (see `pr_DB`).
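To make the notion of a cross-distance concrete, here is a minimal base R sketch. The helper `cross_dist` and the function `euclid` are hypothetical names introduced only for illustration; they are not part of the package and say nothing about how it is implemented internally.

```r
# Hypothetical helper (not part of the package): cross-distance between the
# rows of two numeric matrices, using a user-supplied function f(a, b).
cross_dist <- function(x, y, f) {
  out <- matrix(NA_real_, nrow(x), nrow(y),
                dimnames = list(rownames(x), rownames(y)))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      out[i, j] <- f(x[i, ], y[j, ])
    }
  }
  out
}

# Plain Euclidean distance between two numeric vectors.
euclid <- function(a, b) sqrt(sum((a - b)^2))

x <- rbind(A = c(0, 0), B = c(3, 4))
y <- rbind(P = c(0, 0), Q = c(6, 8), R = c(3, 0))

# One row per row of x, one column per row of y.
cd <- cross_dist(x, y, euclid)
cd
```

Unlike an auto-distance matrix, the result is in general neither square nor symmetric, which is why every cell is stored.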

Missing values are allowed but are excluded from all computations involving the rows within which they occur. If some columns are excluded in calculating a Euclidean, Manhattan, Canberra or Minkowski distance, the sum is scaled up proportionally to the number of columns used.
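The scaling rule can be checked by hand. The sketch below uses base R's `stats::dist`, whose documentation states the same rule, as a stand-in; that the function documented here produces identical numbers is an assumption.

```r
# Two rows, four columns; the third value of the first row is missing.
x <- rbind(c(1, 2, NA, 4),
           c(2, 4, 6, 8))

# Manual computation: use only the columns observed in both rows, then
# scale the sum of squares by (total columns / columns used) = 4 / 3.
ok <- !is.na(x[1, ]) & !is.na(x[2, ])
manual <- sqrt(sum((x[1, ok] - x[2, ok])^2) * ncol(x) / sum(ok))

# Base R's stats::dist applies the same scaling rule.
builtin <- as.numeric(stats::dist(x, method = "euclidean"))

all.equal(manual, builtin)
```

Here the three usable columns give a sum of squares of 21, which is scaled to 28 before the square root is taken.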

Distance measures can be used with `simil`, and similarity measures with `dist`. In these cases, the result is transformed accordingly using the specified conversion functions (default: $\mathrm{pr\_simil2dist}(s) = 1 - s$ and $\mathrm{pr\_dist2simil}(d) = 1 / (1 + d)$). Objects of class `simil` and `dist` can be converted into one another using `as.dist` and `as.simil`, respectively.
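As a rough illustration of such conversions, the following base R sketch defines two stand-in functions. The names `simil_to_dist` and `dist_to_simil` are hypothetical, and the formulas $1 - s$ and $1 / (1 + d)$ are assumed here for the defaults rather than taken from the package source.

```r
# Hypothetical stand-ins for the conversion functions; the formulas are
# assumptions, not copied from the package.
simil_to_dist <- function(s) 1 - s
dist_to_simil <- function(d) 1 / (1 + d)

# Similarities in [0, 1] map to distances in [0, 1].
s <- c(1, 0.75, 0.25, 0)
d <- simil_to_dist(s)
d

# Non-negative distances map to similarities in (0, 1].
s2 <- dist_to_simil(c(0, 1, 3))
s2
```

Under these assumed formulas the two maps are not inverses of one another, so converting a similarity to a distance and back does not in general return the original value.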

Distance and similarity objects can conveniently be subsetted (see examples).

##### Value

Auto distances/similarities are returned as an object of class `dist`/`simil` and cross-distances/similarities as an object of class `crossdist`/`crosssimil`.

##### References

Anderberg, M.R. (1973), *Cluster analysis for applications*, 359 pp., Academic Press, New York, NY, USA.

Cox, T.F. and Cox, M.A.A. (2001), *Multidimensional Scaling*, Chapman and Hall.

Sokal, R.R. and Sneath, P.H.A. (1963), *Principles of Numerical Taxonomy*, W. H. Freeman and Co., San Francisco.

##### See Also

`stats::dist` for compatibility information.

##### Examples

```
library(proxy)
### show available proximities
summary(pr_DB)
### binary data
x <- matrix(sample(c(FALSE, TRUE), 8, replace = TRUE), ncol = 2)
dist(x, method = "Jaccard")
### for real-valued data
dist(x, method = "eJaccard")
### for positive real-valued data
dist(x, method = "fJaccard")
### cross distances
dist(x, x, method = "Jaccard")
### this is the same but less efficient
as.matrix(stats::dist(x, method = "binary"))
### numeric data
x <- matrix(rnorm(16), ncol = 4)
## test inheritance of names
rownames(x) <- LETTERS[1:4]
colnames(x) <- letters[1:4]
dist(x)
dist(x, x)
## custom distance function
f <- function(x, y) sum(x * y)
dist(x, f)
## working with lists
z <- unlist(apply(x, 1, list), recursive = FALSE)
(d <- dist(z))
dist(z, z)
## subsetting
d[[1:2]]
subset(d, c(1,3,4))
## row and column indexes
row.dist(d)
col.dist(d)
```

*Documentation reproduced from package proxy, version 0.1, License: GNU GENERAL PUBLIC LICENSE Version 2*