
fingerprint (version 2.1)

distance: Calculates the Distance Between Two Fingerprints

Description

Several distance metrics can be calculated for binary fingerprints. These metrics measure the similarity or dissimilarity between fingerprints and are therefore useful for clustering. The function currently supports four metrics:
  • Euclidean
  • Tanimoto
  • Dice
  • Modified Tanimoto
The default metric is the Tanimoto coefficient. For the last three (Tanimoto, Dice and Modified Tanimoto), the computed value is a similarity, so the distance is obtained by subtracting that value from 1.0.
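As an illustration of the similarity-to-distance conversion described above, the following base R sketch computes the Tanimoto and Dice values by hand from the positions of the set bits. It uses the standard textbook definitions of the two coefficients and does not call the package, so it is not necessarily how distance is implemented internally. The vectors bits_a and bits_b are made-up example data.

```r
# Positions of the bits that are set in each fingerprint (hypothetical data)
bits_a <- c(1, 2, 5, 6)
bits_b <- c(1, 2, 3)

n_a      <- length(bits_a)                      # bits set in A: 4
n_b      <- length(bits_b)                      # bits set in B: 3
n_common <- length(intersect(bits_a, bits_b))   # bits set in both: 2

# Standard similarity coefficients for binary vectors
tanimoto_sim <- n_common / (n_a + n_b - n_common)   # 2 / 5
dice_sim     <- 2 * n_common / (n_a + n_b)          # 4 / 7

# Distances under the "1 - similarity" convention
tanimoto_dist <- 1 - tanimoto_sim                   # 3 / 5
dice_dist     <- 1 - dice_sim                       # 3 / 7
```

Two identical fingerprints give a similarity of 1 (distance 0) under both coefficients, and two fingerprints with no set bits in common give a similarity of 0 (distance 1).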

Usage

distance(fp1, fp2, method)

Arguments

fp1
An object of class fingerprint
fp2
An object of class fingerprint
method
The type of distance metric desired. Accepted values are tanimoto (the default), euclidean, dice and mt (Modified Tanimoto). Partial matching is supported.
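Partial matching of the method name can be pictured with base R's match.arg, which resolves an unambiguous abbreviation to the full choice. The sketch below assumes the package behaves in a comparable way. resolve_method is a hypothetical helper written for this illustration and is not part of the fingerprint package.

```r
# Hypothetical helper: resolves a (possibly abbreviated) method name
# against the four documented choices using base R's match.arg.
resolve_method <- function(method = c("tanimoto", "euclidean", "dice", "mt")) {
  match.arg(method)
}

default_method <- resolve_method()        # no argument: first choice is used
abbrev_method  <- resolve_method("euc")   # unambiguous prefix of "euclidean"
dice_method    <- resolve_method("d")     # unambiguous prefix of "dice"
```

An abbreviation that matches none of the choices makes match.arg stop with an error.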

Value

  • Numeric value representing the distance in the specified metric between the supplied fingerprint objects

References

Fligner, M.A.; Verducci, J.S.; Blower, P.E. A Modification of the Jaccard-Tanimoto Similarity Index for Diverse Selection of Chemical Compounds Using Binary Strings. Technometrics, 2002, 44(2), 110-119.

Examples

library(fingerprint)

# make two identical fingerprint objects
fp1 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))
fp2 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))

# calculate the Tanimoto coefficient (the default method)
distance(fp1, fp2) # should be 1

# invert the second fingerprint so that it shares no set bits with fp1
fp3 <- !fp2

distance(fp1, fp3) # should be 0
