cba (version 0.2-5)

dists: Matrix Distance Computation

Description

This function computes and returns the auto-distance matrix between the rows of a matrix, as well as the cross-distance matrix between the rows of two matrices.

Usage

dists(x, y = NULL, method = "minkowski", p = 2)

dapply(x, y = NULL, FUN, ...)
dapply.list(x, y = NULL, FUN, ...)

Arguments

x
a numeric matrix object.
y
NULL, or a numeric matrix object.
method
a mnemonic string referencing the distance measure.
p
Minkowski metric parameter.
FUN
a user supplied function.
...
further arguments to the user supplied function.

Value

Auto-distances are returned as an object of class dist and cross-distances as an object of class matrix.

Warning

This interface is deprecated. Use package proxy instead.

Details

The interface is fashioned after dist: you have to specify a method to use, i.e. a (not so) mnemonic name.

Methods that are also implemented in dist are: minkowski, maximum, canberra, and binary. See the documentation there. Note that for binary the arguments x (and y) must be logical.
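For orientation, the Minkowski distance with parameter p between two rows a and b is (sum_i |a_i - b_i|^p)^(1/p). The following is a minimal base R sketch of that formula, not the cba implementation: the helper name minkowski_sketch is hypothetical, and the cross-check uses dist from base R rather than dists.

```r
# Hypothetical helper (not part of cba): Minkowski distance between two vectors.
minkowski_sketch <- function(a, b, p = 2) {
  sum(abs(a - b)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

d2 <- minkowski_sketch(a, b, p = 2)  # p = 2 is the Euclidean special case
d1 <- minkowski_sketch(a, b, p = 1)  # p = 1 is the Manhattan special case

# Cross-check against base R's dist(), which the interface is fashioned after.
ref <- as.numeric(dist(rbind(a, b), method = "minkowski", p = 2))
isTRUE(all.equal(d2, ref))
```

With p = 2 this reduces to the Euclidean distance, which is why p = 2 is a natural default.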

For method canberra note that the distances are weighted by the ratio of the total number of columns to the number of columns not excluded from the sum. Also note that if one of the values is Inf and the other is finite, the pair gets included in the sum with value 1.
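The weighting described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper name canberra_sketch is hypothetical, terms are taken as |a_i - b_i| / (|a_i| + |b_i|), and pairs that evaluate to NA or NaN (missing values, or 0/0) are treated as excluded.

```r
# Hypothetical helper (not part of cba): Canberra distance with the
# "total columns / columns used" weighting described in the text.
canberra_sketch <- function(a, b) {
  term <- abs(a - b) / (abs(a) + abs(b))
  # One value infinite and the other finite: include the pair with value 1.
  one_inf <- xor(is.infinite(a), is.infinite(b)) & !is.na(a) & !is.na(b)
  term[one_inf] <- 1
  # Pairs that are still NA/NaN are excluded from the sum.
  used <- !is.na(term)
  # Scale the sum up by (total columns) / (columns not excluded).
  sum(term[used]) * length(a) / sum(used)
}

a <- c(1, 2, NA, 4)
b <- c(2, 2, 3, 8)
d_na <- canberra_sketch(a, b)    # 3 of 4 columns are used, so the sum is scaled by 4/3

d_inf <- canberra_sketch(c(Inf, 1), c(1, 1))  # the Inf/finite pair contributes 1
```

In the first call the usable terms are 1/3, 0, and 1/3, so the unweighted sum 2/3 is multiplied by 4/3.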

Additional methods implemented are: ebinary (for real-valued data), fbinary (for positive real-valued data), and angular.

Missing values are allowed but are excluded from all computations involving the rows within which they occur. However, rows (and columns) of NAs are not dropped as in dist.

For compatibility, the distance is zero instead of NA when two (near) zero vectors are involved in the computation of binary, ebinary, and angular. Note that this is inconsistent with the coding of NA by as.dummy.

Function dapply allows the user to apply arbitrary distance functions that take as arguments at least two vectors (i.e. rows of x, etc.) and return a scalar real value. Note that further arguments to FUN must be named if y = NULL.
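To make the idea concrete, here is a rough base R sketch of applying a user-supplied function to every pair of rows. It is not how dapply is implemented: the helper name pairwise_apply is hypothetical, and it always returns a full matrix, whereas dapply is documented to return a dist object for auto-distances.

```r
# Hypothetical helper (not part of cba): apply FUN to every pair of rows.
pairwise_apply <- function(x, y = NULL, FUN, ...) {
  if (is.null(y)) y <- x  # auto case: compare the rows of x with themselves
  out <- matrix(NA_real_, nrow(x), nrow(y),
                dimnames = list(rownames(x), rownames(y)))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      # FUN receives two row vectors and must return a single number.
      out[i, j] <- FUN(x[i, ], y[j, ], ...)
    }
  }
  out
}

x <- matrix(1:6, ncol = 2, dimnames = list(c("A", "B", "C"), NULL))
f <- function(a, b) sum(a * b)   # same custom function as in the Examples
m <- pairwise_apply(x, FUN = f)
m["A", "B"]   # row A is (1, 4), row B is (2, 5)
```

The double loop is the simplest correct form; it trades speed for clarity.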

Function dapply.list works as above except that argument x (and y) is a list.

See Also

dist for compatibility information.

Examples

### binary data
x <- matrix(sample(c(FALSE, TRUE), 8, replace = TRUE), ncol = 2)
dists(x, method="binary")
### for real-valued data
dists(x, method="ebinary")
### for positive real-valued data
dists(x, method="fbinary")
### cross distances
dists(x, x, method="binary")
### this is the same but less efficient
as.matrix(dists(x, method="binary"))
## test inheritance of names
rownames(x) <- LETTERS[1:4]
dists(x)
dists(x,x)
## custom distance function
f <- function(x, y) sum(x*y)
dapply(x, FUN=f)
dapply(x,x, FUN=f)
## working with lists
z <- unlist(apply(x, 1, list), recursive = FALSE)
dapply.list(z, FUN=f)
dapply.list(z, z, FUN=f)
